Journal Details
Format
Journal
eISSN
2001-7367
First Published
01 Oct 2013
Publication timeframe
4 times per year
Languages
English

Volume 36 (2020): Issue 2 (June 2020)

12 Articles

Open Access

Letter to the Editors

Published Online: 15 Jun 2020
Page range: 229 - 235

Open Access

Confidence Intervals of Gini Coefficient Under Unequal Probability Sampling

Published Online: 15 Jun 2020
Page range: 237 - 249

Abstract

We propose an estimator for the Gini coefficient, based on a ratio of means. We show how bootstrap and empirical likelihood can be combined to construct confidence intervals. Our simulation study shows the estimator proposed is usually less biased than customary estimators. The observed coverages of the empirical likelihood confidence interval proposed are also closer to the nominal value.

Keywords

  • Bootstrap
  • empirical likelihood
  • inclusion probability
  • survey weight
  • sampling design

Open Access

Estimating Literacy Levels at a Detailed Regional Level: an Application Using Dutch Data

Published Online: 15 Jun 2020
Page range: 251 - 274

Abstract

Policy measures to combat low literacy are often targeted at municipalities or regions with low levels of literacy. However, current surveys on literacy do not contain enough observations at this level to allow for reliable estimates when using only direct estimation techniques. To provide more reliable results at a detailed regional level, alternative methods must be used.

The aim of this article is to obtain literacy estimates at the municipality level using model-based small area estimation techniques in a hierarchical Bayesian framework. To do so, we link Dutch Labour Force Survey data to the most recent literacy survey available, that of the Programme for the International Assessment of Adult Competencies (PIAAC). We estimate the average literacy score, as well as the percentage of people with a low literacy level. Variance estimators for our small area predictions explicitly account for the imputation uncertainty in the PIAAC estimates. The proposed estimation method improves the precision of the area estimates, making it possible to break down the national figures by municipality.

Keywords

  • Literacy
  • basic skills
  • municipality
  • region
  • small area estimation

Open Access

Analysing Sensitive Data from Dynamically-Generated Overlapping Contingency Tables

Published Online: 15 Jun 2020
Page range: 275 - 296

Abstract

Contingency tables provide a convenient format to publish summary data from confidential survey and administrative records that capture a wide range of social and economic information. By their nature, contingency tables enable aggregation of potentially sensitive data, limiting disclosure of identifying information. Furthermore, censoring or perturbation can be used to desensitise low cell counts when they arise. However, access to detailed cross-classified tables for research is often restricted by data custodians when too many censored or perturbed cells are required to preserve privacy. In this article, we describe a framework for selecting and combining log-linear models when accessible data is restricted to overlapping marginal contingency tables. The approach is demonstrated through application to housing transition data from the Australian Census Longitudinal Data set provided by the Australian Bureau of Statistics.

Keywords

  • Count data
  • log-linear model
  • marginal model
  • privacy restriction

Open Access

Switching Between Different Non-Hierachical Administrative Areas via Simulated Geo-Coordinates: A Case Study for Student Residents in Berlin

Published Online: 15 Jun 2020
Page range: 297 - 314

Abstract

The transformation of area aggregates between non-hierarchical area systems (administrative areas) is a standard problem in official statistics. For this problem, we present a proposal which is based on kernel density estimates. The approach applies a modification of a stochastic expectation maximization algorithm, which was proposed in the literature for the transformation of totals on rectangular areas to kernel density estimates. As a by-product of the routine, one obtains simulated geo-coordinates for each unit. With the help of these geo-coordinates, it is possible to calculate case numbers for any area system of interest. The proposed method is evaluated in a design-based simulation based on a close-to-reality, simulated data set with known exact geo-coordinates. In the empirical part, the method is applied to student resident figures from Berlin, Germany. These are known only at the level of ZIP codes, but they are needed for smaller administrative planning districts. Results for (a) student concentration areas and (b) temporal changes in the student residential areas between 2005 and 2015 are presented and discussed.

Keywords

  • Choropleth maps
  • kernel density estimation
  • statistical reporting
  • sub-regional estimation
  • urban development

Open Access

Controlling for Selection Bias in Social Media Indicators through Official Statistics: a Proposal

Published Online: 15 Jun 2020
Page range: 315 - 338

Abstract

With the increase of social media usage, a huge new source of data has become available. Despite the enthusiasm linked to this revolution, one of the main outstanding criticisms in using these data is selection bias. Indeed, the reference population is unknown. Nevertheless, many studies show evidence that these data constitute a valuable source because they are more timely and possess higher space granularity. We propose to adjust statistics based on Twitter data by anchoring them to reliable official statistics through a weighted, space-time, small area estimation model. As a by-product, the proposed method also stabilizes the social media indicators, which is a welcome property required for official statistics. The method can be adapted whenever official statistics exist at the proper level of granularity and social media usage within the population is known. As an example, we adjust a subjective well-being indicator of “working conditions” in Italy, and combine it with relevant official statistics. The weights depend on broadband coverage and the Twitter rate at province level, while the analysis is performed at regional level. The resulting statistics are then compared with survey statistics on the “quality of job” at macro-economic regional level, showing evidence of similar paths.

Keywords

  • Well-being
  • big data
  • sentiment analysis
  • small area estimation
  • weighting

Open Access

Exploring Mechanisms of Recruitment and Recruitment Cooperation in Respondent Driven Sampling

Published Online: 15 Jun 2020
Page range: 339 - 360

Abstract

Respondent driven sampling (RDS) is a sampling method designed for hard-to-sample groups with strong social ties. RDS starts with a small number of arbitrarily selected participants (“seeds”). Seeds are issued recruitment coupons, which are used to recruit from their social networks. Waves of recruitment and data collection continue until reaching a sufficient sample size. Under the assumptions of random recruitment, with-replacement sampling, and a sufficient number of waves, the probability of selection for each participant converges to be proportional to their network size. With recruitment noncooperation, however, recruitment can end abruptly, causing operational difficulties with unstable sample sizes. Noncooperation may void the recruitment Markovian assumptions, leading to selection bias. Here, we consider two RDS studies: one targeting Korean immigrants in Los Angeles and in Michigan; and another study targeting persons who inject drugs in Southeast Michigan. We explore predictors of coupon redemption, associations between recruiter and recruits, and details within recruitment dynamics. While no consistent predictors of noncooperation were found, there was evidence that coupon redemption of targeted recruits was more common among those who shared social bonds with their recruiters, suggesting that noncooperation is more likely to be a feature of recruits not cooperating, rather than recruiters failing to distribute coupons.

Keywords

  • Respondent driven sampling
  • sampling hard-to-reach population
  • nonresponse error

Open Access

Measuring the Sustainable Development Goal Indicators: An Unprecedented Statistical Challenge

Published Online: 15 Jun 2020
Page range: 361 - 378

Abstract

In March 2017, the United Nations (UN) Statistical Commission adopted a measurement framework for the UN Agenda 2030 for Sustainable Development, comprising 232 indicators designed to measure the 17 Sustainable Development Goals (SDGs) and their respective 169 targets. The scope of this measurement framework is so ambitious that it led Mogens Lykketoft, President of the seventieth session of the UN General Assembly, to describe it as an ‘unprecedented statistical challenge’.

Naturally, with a programme of this magnitude, there will be foreseen and unforeseen challenges and consequences. This article outlines some of the key differences between the Millennium Development Goals and the SDGs, before detailing some of the measurement challenges involved in compiling the SDG indicators, and examines some of the unanticipated consequences arising from the mechanisms put in place to measure progress from a broad political economy perspective.

Keywords

  • 2030 Agenda
  • unintended consequences
  • national statistical systems
  • administrative data

Open Access

Explaining Inconsistencies in the Education Distributions of Ten Cross-National Surveys – the Role of Methodological Survey Characteristics

Published Online: 15 Jun 2020
Page range: 379 - 409

Abstract

Surveys measuring the same concept using the same measure on the same population at the same point in time should produce highly similar results. If this is not the case, this is a strong sign of a lack of reliability, resulting in non-comparable data across surveys. Looking at the education variable, previous research has identified inconsistencies in the distributions of harmonised education variables, using the International Standard Classification of Education (ISCED), across surveys within the same countries and years. These inconsistencies are commonly explained by differences in the measurement, especially in the response categories of the education question, and in the harmonisation when classifying country-specific education categories into ISCED. However, other methodological characteristics of surveys, which we regard as ‘containers’ for several characteristics, may also contribute to this finding. We compare the education distributions of nine cross-national surveys with the European Union Labour Force Survey (EU-LFS), which is used as a benchmark. This study analyses 15 survey characteristics to better explain the inconsistencies. The results confirm a predominant effect of the measurement instrument and harmonisation. Different sampling designs also explain inconsistencies, but to a lesser degree. Finally, we discuss the results and limitations of the study and provide ideas for improving data comparability.

Keywords

  • Comparative research
  • cross-national surveys
  • survey characteristics
  • education

Open Access

Investigating the Effects of the Household Budget Survey Redesign on Consumption and Inequality Estimates: the Italian Experience

Published Online: 15 Jun 2020
Page range: 411 - 434

Abstract

In 2014, many innovations were introduced in the Italian Household Budget Survey (HBS) in response to changes in European recommendations and purchasing behaviours and to an increased demand for information in the context of social and economic research. New instruments and techniques have been introduced, together with more accurate methodologies, with the aim of improving the survey, by both reducing the bias and variance of survey estimates and supplying estimates for additional subpopulations and variables. Because the former and new HBS were conducted in parallel in 2013, it has been possible to evaluate the effects of the abovementioned changes on consumption expenditure and inequality estimates and to compare the sample representativeness of selected subpopulations in both surveys.

Keywords

  • Survey design
  • data quality
  • zero expenditures
  • post-stratification

Open Access

On Accuracy Estimation Using Parametric Bootstrap in Small Area Prediction Problems

Published Online: 15 Jun 2020
Page range: 435 - 458

Abstract

We consider longitudinal data and the problem of prediction of subpopulation (domain) characteristics that can be written as a linear combination of the variable of interest, including cases of small or zero sample sizes in the domain and time period of interest. We consider the empirical version of the predictor proposed by Royall (1976) showing that it is a generalization of the empirical version of the predictor presented by Henderson (1950). We propose a parametric bootstrap MSE estimator of the predictor. We prove its asymptotic unbiasedness and derive the order of its bias. Considerations are supported by Monte Carlo simulation analyses to compare its accuracy (not only the bias) with other MSE estimators, including jackknife and weighted jackknife MSE estimators that we adapt for the considered predictor.

Keywords

  • Empirical best linear unbiased predictor
  • model approach in survey sampling
  • parametric bootstrap
  • properties of MSE estimators
  • small area estimation

Open Access

Book Review

Published Online: 15 Jun 2020
Page range: 459 - 461

