Journals and Issues

Volume 38 (2022): Issue 3 (September 2022)

Volume 38 (2022): Issue 2 (June 2022)

Volume 38 (2022): Issue 1 (March 2022)
Special Issue on Price Indices in Official Statistics

Volume 37 (2021): Issue 4 (December 2021)

Volume 37 (2021): Issue 3 (September 2021)
Special Issue on Population Statistics for the 21st Century

Volume 37 (2021): Issue 2 (June 2021)
Special Issue on New Techniques and Technologies for Statistics

Volume 37 (2021): Issue 1 (March 2021)

Volume 36 (2020): Issue 4 (December 2020)

Volume 36 (2020): Issue 3 (September 2020)
Special Issue on Nonresponse

Volume 36 (2020): Issue 2 (June 2020)

Volume 36 (2020): Issue 1 (March 2020)

Volume 35 (2019): Issue 4 (December 2019)
Special Issue on Measuring LGBT Populations

Volume 35 (2019): Issue 3 (September 2019)

Volume 35 (2019): Issue 2 (June 2019)

Volume 35 (2019): Issue 1 (March 2019)

Volume 34 (2018): Issue 4 (December 2018)

Volume 34 (2018): Issue 3 (September 2018)
Special Section on Responsive and Adaptive Survey Design

Volume 34 (2018): Issue 2 (June 2018)
Special Issue on Establishment Surveys (ICES-V)

Volume 34 (2018): Issue 1 (March 2018)

Volume 33 (2017): Issue 4 (December 2017)

Volume 33 (2017): Issue 3 (September 2017)
Special Issue on Responsive and Adaptive Survey Design

Volume 33 (2017): Issue 2 (June 2017)
Special Issue on Total Survey Error (TSE)

Volume 33 (2017): Issue 1 (March 2017)

Volume 32 (2016): Issue 4 (December 2016)
Special Section on the Role of Official Statistics in Statistical Capacity Building

Volume 32 (2016): Issue 3 (September 2016)

Volume 32 (2016): Issue 2 (June 2016)

Volume 32 (2016): Issue 1 (March 2016)

Volume 31 (2015): Issue 4 (December 2015)

Volume 31 (2015): Issue 3 (September 2015)
Special Issue on Coverage Problems in Administrative Sources

Volume 31 (2015): Issue 2 (June 2015)
Special Issue on New Techniques and Technologies for Statistics

Volume 31 (2015): Issue 1 (March 2015)

Volume 30 (2014): Issue 4 (December 2014)
Special Issue on Establishment Surveys

Volume 30 (2014): Issue 3 (September 2014)

Volume 30 (2014): Issue 2 (June 2014)
Special Issue on Surveying the Hard-to-Reach

Volume 30 (2014): Issue 1 (March 2014)

Volume 29 (2013): Issue 4 (December 2013)

Volume 29 (2013): Issue 3 (September 2013)

Volume 29 (2013): Issue 2 (June 2013)

Volume 29 (2013): Issue 1 (March 2013)

Journal Details
Format
Journal
eISSN
2001-7367
First Published
01 Oct 2013
Publication Frequency
4 issues per year
Languages
English

Volume 38 (2022): Issue 2 (June 2022)

13 Articles
Open Access

In Memory of Dr. Lars Lyberg: Remembering a Giant in Survey Research, 1944–2021

Published online: 14 Jun 2022
Page range: 353 - 366

Open Access

Spatial Sampling Design to Improve the Efficiency of the Estimation of the Critical Parameters of the SARS-CoV-2 Epidemic

Published online: 14 Jun 2022
Page range: 367 - 398

Abstract

Given the urgent informational needs connected with the diffusion of infection during the COVID-19 pandemic, in this article we propose a sampling design for building a continuous-time surveillance system. Compared with other observational strategies, the proposed method has three important elements of strength and originality: (1) it aims to provide a snapshot of the phenomenon at a single moment in time, and it is designed as a continuous survey repeated in several waves over time, taking into account different target variables during different stages of the development of the epidemic; (2) the statistical optimality properties of the proposed estimators are formally derived and tested with a Monte Carlo experiment; and (3) it is rapidly operational, a property required by the emergency connected with the diffusion of the virus. The sampling design was developed with the diffusion of SARS-CoV-2 in Italy during the spring of 2020 in mind. However, it is very general, and we are confident that it can easily be extended to other geographical areas and to possible future epidemic outbreaks. Formal proofs and a Monte Carlo exercise show that the estimators are unbiased and more efficient than the simple random sampling scheme.

Keywords

  • Sampling design
  • SARS-CoV-2 diffusion
  • Health surveillance system
  • Unbiasedness
  • Efficiency
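The efficiency claim in this abstract (a design that exploits spatial structure beats simple random sampling) can be illustrated with a small Monte Carlo sketch. The four strata, prevalence values, and sample sizes below are hypothetical stand-ins for the article's far more elaborate design:

```python
import random
import statistics

random.seed(42)

# Toy population: 4 geographic strata with different infection prevalence,
# mimicking the spatial clustering of an epidemic (illustrative values only).
strata = [
    [1 if random.random() < p else 0 for _ in range(2500)]
    for p in (0.01, 0.05, 0.15, 0.35)
]
population = [y for s in strata for y in s]
true_prev = sum(population) / len(population)

def srs_estimate(n):
    """Simple random sample estimate of prevalence."""
    return sum(random.sample(population, n)) / n

def stratified_estimate(n):
    """Proportionally allocated stratified estimate (equal-sized strata)."""
    per = n // len(strata)
    return sum(sum(random.sample(s, per)) / per for s in strata) / len(strata)

reps = 2000
srs = [srs_estimate(400) for _ in range(reps)]
strat = [stratified_estimate(400) for _ in range(reps)]

print(f"true prevalence     : {true_prev:.4f}")
print(f"SRS        mean/var : {statistics.mean(srs):.4f} / {statistics.variance(srs):.2e}")
print(f"stratified mean/var : {statistics.mean(strat):.4f} / {statistics.variance(strat):.2e}")
```

With spatially clustered prevalence, the stratified estimator removes the between-stratum component of the variance while remaining unbiased, which is the same mechanism a spatially informed design exploits.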
Open Access

Assessing Residual Seasonality in the U.S. National Income and Product Accounts Aggregates

Published online: 14 Jun 2022
Page range: 399 - 428

Abstract

There is an ongoing debate on whether residual seasonality is present in the estimates of real Gross Domestic Product (GDP) in the U.S. national accounts and whether it explains the slower first-quarter GDP growth rate in recent years. This article aims to bring clarity to the topic by (1) summarizing the techniques and methodologies used in these studies; (2) arguing for a sound methodological framework for evaluating claims of residual seasonality; and (3) proposing three diagnostic tests for detecting residual seasonality, applying them to different vintages and sample spans of data on real GDP and its major components from the U.S. national accounts, and comparing the results with those of previous studies.

Keywords

  • Seasonality diagnostics
  • residual seasonality
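One of the simplest diagnostics in this spirit is a one-way ANOVA of growth rates by quarter: if a supposedly seasonally adjusted series still shows quarter-dependent mean growth, residual seasonality is suspected. The sketch below is a generic illustration of that idea on simulated data, not one of the article's three tests:

```python
import random
import statistics

def seasonality_F(growth, period=4):
    """One-way ANOVA F statistic: do mean growth rates differ by quarter?
    A large F suggests residual seasonality in an 'adjusted' series."""
    groups = [growth[q::period] for q in range(period)]
    grand = statistics.mean(growth)
    k, n = period, len(growth)
    ssb = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))

random.seed(0)
# 80 quarters of toy growth rates: one series with a first-quarter dip
# left in after adjustment, one fully adjusted (illustrative values).
seasonal = [random.gauss(0.5, 0.3) - (0.4 if t % 4 == 0 else 0) for t in range(80)]
adjusted = [random.gauss(0.5, 0.3) for t in range(80)]

print(f"F (residual seasonality left in): {seasonality_F(seasonal):.2f}")
print(f"F (properly adjusted)           : {seasonality_F(adjusted):.2f}")
```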
Open Access

Improved Assessment of the Accuracy of Record Linkage via an Extended MaCSim Approach

Published online: 14 Jun 2022
Page range: 429 - 451

Abstract

Record linkage is the process of bringing together records that refer to the same entity from overlapping data sources while removing duplicates. Huge amounts of data are now being collected by public and private organizations as well as by researchers and individuals. Linking and analysing relevant information from this massive data reservoir can provide new insights into society, and it has become increasingly important to have effective and efficient methods for linking data from different sources. It is therefore necessary to assess the ability of a linking method to achieve high accuracy, and to compare methods with respect to accuracy. In this article, we improve on a Markov Chain based Monte Carlo simulation approach (MaCSim) for assessing a linking method. The improvement proposed here involves calculating a similarity weight for every linking-variable value of each record pair, which allows partial agreement of the linking-variable values. To assess the accuracy of the linking method, correctly linked proportions are investigated for each record. The extended MaCSim approach is illustrated using a synthetic data set provided by the Australian Bureau of Statistics, based on realistic data settings. Test results show high accuracy of the assessment of the linkages.

Keywords

  • Linkage accuracy
  • Markov Chain Monte Carlo
  • simulation
  • similarity weight
  • agreement threshold
  • agreement tolerance
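The extension described here, a similarity weight that permits partial agreement of linking-variable values, can be sketched with a normalized edit distance. The weight function and the 0.8 tolerance below are illustrative choices, not the article's exact specification:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def similarity_weight(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 = exact agreement, 0 = no agreement at all."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))

def agreement(a: str, b: str, tolerance: float = 0.8) -> bool:
    """Partial agreement: the value pair counts as a match above the tolerance."""
    return similarity_weight(a, b) >= tolerance

print(similarity_weight("SMITH", "SMYTH"))  # close but not exact
print(agreement("SMITH", "SMYTH"))
print(agreement("SMITH", "JONES"))
```

A weight like this lets a misspelled surname still contribute evidence for a link instead of being treated as a hard disagreement.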
Open Access

If They Don’t Understand the Question, They Don’t Answer: Language Mismatch in Face-to-Face Interviews

Published online: 14 Jun 2022
Page range: 453 - 484

Abstract

The provision of translated field instruments is crucial for reducing response burden and thereby increasing data quality in surveys with a multilingual target population, such as surveys of recent immigrants. Failure to address this can result in a mismatch between the survey language and the respondent’s mother tongue. Using a survey of refugees in Germany, this article explores the association between the absence of the respondents’ mother tongue and item nonresponse, a crucial aspect of data quality. Further, it investigates whether supplementary audio recordings in the same language as the written questions can reduce item nonresponse when the mother tongue is not available. To answer the research questions, all missing answers per individual are counted and analyzed by means of Poisson regression; in a second step, the likelihood of item nonresponse is also estimated for single items. Results show that both a language mismatch and the use of audio recordings increase item nonresponse.

Keywords

  • Respondent burden
  • item nonresponse
  • cross-cultural survey methods
  • Poisson regression
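The counting approach in this abstract, treating each respondent's number of missing answers as a Poisson count, can be sketched as follows. With a single binary regressor (language mismatch yes/no), the Poisson regression MLE is available in closed form from the two group means; all rates and sample sizes below are invented for illustration:

```python
import math
import random

random.seed(1)

def draw_poisson(lam):
    """Knuth's multiplication algorithm for Poisson draws (stdlib only)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# Toy data: skipped items per respondent; respondents interviewed in a
# mismatched language skip more items on average (illustrative rates).
matched = [draw_poisson(1.5) for _ in range(500)]   # survey language = mother tongue
mismatch = [draw_poisson(3.0) for _ in range(500)]  # language mismatch

# Poisson regression log E[y] = b0 + b1 * mismatch; with one binary
# regressor the MLE follows in closed form from the group means.
b0 = math.log(sum(matched) / len(matched))
b1 = math.log(sum(mismatch) / len(mismatch)) - b0

print(f"intercept b0 = {b0:.3f}  (true value log 1.5 = {math.log(1.5):.3f})")
print(f"mismatch  b1 = {b1:.3f}  (true value log 2.0 = {math.log(2.0):.3f})")
print(f"rate ratio exp(b1) = {math.exp(b1):.2f}")
```

The exponentiated coefficient is the rate ratio: how many times more items a mismatched respondent skips, which is the quantity of substantive interest in such an analysis.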
Open Access

Improving the Output Quality of Official Statistics Based on Machine Learning Algorithms

Published online: 14 Jun 2022
Page range: 485 - 508

Abstract

National statistical institutes currently investigate how to improve the output quality of official statistics based on machine learning algorithms. A key issue is concept drift, that is, when the joint distribution of the independent variables and a dependent (categorical) variable changes over time. Under concept drift, a statistical model requires regular updating to prevent it from becoming biased. However, updating a model requires additional data, which are not always available. An alternative is to reduce the bias by means of bias correction methods. In this article, we focus on estimating the proportion (base rate) of a category of interest, and we compare two popular bias correction methods: the misclassification estimator and the calibration estimator. For prior probability shift (a specific type of concept drift), we investigate the two methods analytically as well as numerically. Our analytical results are expressions for the bias and variance of both methods. As a numerical result, we present a decision boundary for the relative performance of the two methods. Our results provide a better understanding of the effect of prior probability shift on output quality. Consequently, we can recommend a novel approach to using machine learning algorithms in the context of official statistics.

Keywords

  • Output quality
  • concept drift
  • prior probability shift
  • misclassification bias
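The two bias correction methods compared in this article can be sketched for the binary case. The sensitivity, specificity, and base rate below are assumed values, and the calibration estimator is shown in its simplest form, averaging P(true = 1 | prediction) over the predictions:

```python
import random

random.seed(7)

# Toy classifier with known sensitivity and specificity (assumed values),
# applied to a population whose true base rate we want to estimate.
sens, spec, alpha = 0.85, 0.90, 0.30  # P(pred 1 | true 1), P(pred 0 | true 0), base rate
n = 20000
truth = [1 if random.random() < alpha else 0 for _ in range(n)]
pred = [
    (1 if random.random() < sens else 0) if t == 1 else (0 if random.random() < spec else 1)
    for t in truth
]

# Naive estimate: the predicted positive rate (biased under misclassification).
q = sum(pred) / n

# Misclassification estimator: invert q = alpha*sens + (1 - alpha)*(1 - spec).
alpha_mis = (q - (1 - spec)) / (sens + spec - 1)

# Calibration estimator: average P(true = 1 | prediction), estimated from an
# audit sample where both truth and prediction are known (here: the full data).
p_pos = sum(t for t, p in zip(truth, pred) if p == 1) / sum(pred)
p_neg = sum(t for t, p in zip(truth, pred) if p == 0) / (n - sum(pred))
alpha_cal = q * p_pos + (1 - q) * p_neg

print(f"naive predicted rate      : {q:.3f}")
print(f"misclassification estimate: {alpha_mis:.3f}")
print(f"calibration estimate      : {alpha_cal:.3f}  (true = {alpha})")
```

The naive rate is pulled toward the false-positive rate; both corrections recover the base rate, and the article's contribution is characterizing when each one has the smaller error.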
Open Access

Data Fusion for Joining Income and Consumption Information Using Different Donor-Recipient Distance Metrics

Published online: 14 Jun 2022
Page range: 509 - 532

Abstract

Data fusion describes the method of combining data from (at least) two initially independent data sources to allow for joint analysis of variables that are not jointly observed. The fundamental idea is to base inference on identifying assumptions and on common variables that provide information observed in all the data sources. A popular class of methods dealing with this particular missing-data problem in practice is based on covariate-based nearest neighbour matching, whereas more flexible semi- or even fully parametric approaches seem underrepresented in applied data fusion. In this article, we compare two approaches to nearest neighbour hot deck matching: Random Hot Deck, a variant of the covariate-based matching methods proposed by Eurostat that can be considered a ‘classical’ statistical matching method, and an alternative approach based on Predictive Mean Matching. We discuss results from a simulation study in which we deviate from previous analyses of marginal distributions and instead consider joint distributions of the fusion variables. Our findings suggest that Predictive Mean Matching tends to outperform Random Hot Deck.

Keywords

  • Statistical matching
  • missing data
  • predictive mean matching
  • nearest neighbour Imputation
  • missing-by-design pattern
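The two matching approaches can be contrasted in a stripped-down sketch. Everything below is an illustrative assumption rather than the article's simulation design: a random hot deck within quantile classes stands in for the Eurostat variant, the linear relation stands in for income/consumption, and with a single covariate Predictive Mean Matching reduces to nearest-x matching:

```python
import random

random.seed(3)

# Toy fusion setting: donors observe (x, y), recipients observe only x.
donors = []
for _ in range(800):
    x = random.gauss(0, 1)
    donors.append((x, 2 * x + random.gauss(0, 0.3)))  # hypothetical y = 2x + noise
recipients = [random.gauss(0, 1) for _ in range(800)]

def ols_slope(pairs):
    """Least-squares slope of y on x (stdlib only)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    return sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x, _ in pairs)

def pmm(recips, dons):
    """Predictive mean matching: fit y ~ x on the donors, then hand each
    recipient the observed y of the donor with the closest predicted mean."""
    b = ols_slope(dons)
    return [min(dons, key=lambda d: abs(b * d[0] - b * x))[1] for x in recips]

def random_hot_deck(recips, dons, classes=5):
    """Random hot deck: donors grouped into quantile classes of x; a donor is
    drawn at random from the recipient's class (simplified variant)."""
    xs = sorted(x for x, _ in dons)
    cuts = [xs[len(xs) * k // classes] for k in range(1, classes)]
    cls = lambda x: sum(x > c for c in cuts)
    pools = {}
    for d in dons:
        pools.setdefault(cls(d[0]), []).append(d)
    return [random.choice(pools[cls(x)])[1] for x in recips]

s_pmm = ols_slope(list(zip(recipients, pmm(recipients, donors))))
s_rhd = ols_slope(list(zip(recipients, random_hot_deck(recipients, donors))))
print(f"slope of fused (x, y), PMM            : {s_pmm:.2f}  (true 2.0)")
print(f"slope of fused (x, y), random hot deck: {s_rhd:.2f}  (true 2.0)")
```

Matching within coarse classes attenuates the joint relation between the common variable and the fused variable, while matching on the predicted mean preserves it, which mirrors the direction of the article's finding.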
Open Access

Total Process Error: An Approach for Assessing and Monitoring the Quality of Multisource Processes

Published online: 14 Jun 2022
Page range: 533 - 556

Abstract

Most National Statistical Institutes are progressively moving from traditional production models to new strategies based on the combined use of different sources of information, both primary and secondary. In this article, we propose a framework for assessing the quality of multisource processes, such as statistical registers.

The final aim is to develop a tool that supports decisions about the process design and its monitoring, and that provides quality measures for the whole production. The starting point is the adaptation of the life-cycle paradigm, which results in a three-phase framework described in the recent literature. We propose an evolution of this model, focusing on the first two phases of the life cycle, to better represent the source integration/combination phase, which can vary according to the features of different types of processes.

The proposed enhancement would improve the existing quality framework to support the evaluation of different multisource processes. An application of the proposed framework to two Istat (the Italian national statistical institute) registers in the economic area, taken as case studies, is presented. These experiences show the potential of such a tool in supporting National Statistical Institutes in assessing multisource statistical production processes.

Keywords

  • Quality framework
  • multi-source processes
  • total survey error
  • statistical register
Open Access

Some Thoughts on Official Statistics and Its Future (with discussion)

Published online: 14 Jun 2022
Page range: 557 - 598

Abstract

In this article, we share some reflections on the state of statistical science and its evolution in the production systems of official statistics. We first attempt a synthesis of the evolution of statistical thinking. We then examine the evolution of practices in official statistics, which had to face a diversification of sources very early on: first with the use of censuses, then sample surveys, and finally administrative files. At each stage, a profound revision of methods was necessary. We show that since the middle of the 20th century, one of the major challenges of statistics has been to produce estimates from a variety of sources. To this end, a large number of methods have been proposed, resting on very different foundations. The term “big data” encompasses a set of sources and new statistical methods. We first examine the potential for valorizing big data in official statistics. Some applications, such as image analysis for agricultural prediction, are very old and will be developed further. However, we report our skepticism towards web-scraping methods. We then examine the use of new deep learning methods. With access to more and more sources, the great challenge will remain the valorization and harmonization of these sources.

Keywords

  • Deduction
  • foundations
  • induction
  • Lasso
  • p-value
  • registers
  • sampling
  • statistical learning
Open Access

Iterative Kernel Density Estimation Applied to Grouped Data: Estimating Poverty and Inequality Indicators from the German Microcensus

Published online: 14 Jun 2022
Page range: 599 - 635

Abstract

The estimation of poverty and inequality indicators based on survey data is straightforward as long as the variable of interest (e.g., income or consumption) is measured on a metric scale. However, estimation is not directly possible using standard formulas when the income variable is grouped, whether due to confidentiality constraints or in order to decrease item nonresponse. We propose an iterative kernel density algorithm that generates metric pseudo samples from the grouped variable for the estimation of indicators. The corresponding standard errors are estimated by a non-parametric bootstrap that accounts for the additional uncertainty due to the grouping. The algorithm enables the use of survey weights and household equivalence scales. The proposed method is applied to the German Microcensus for estimating the regional distribution of poverty and inequality in Germany.

Keywords

  • Direct estimation
  • Interval-censored data
  • non-parametric estimation
  • OECD scale
  • prediction
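The core loop of such an algorithm, drawing a metric pseudo sample within the fixed intervals and re-estimating the density until the shape stabilizes, might look as follows. The bins, counts, grid resolution, and bandwidth rule are all hypothetical simplifications of the published algorithm, with no survey weights or equivalence scales:

```python
import math
import random
import statistics

random.seed(11)

# Grouped toy income data: interval bounds and counts (hypothetical values).
bins = [(0, 1000), (1000, 2000), (2000, 3000), (3000, 5000)]
counts = [150, 400, 300, 150]

def kde(points, grid, bw):
    """Gaussian kernel density estimate evaluated on a grid."""
    norm = 1 / (len(points) * bw * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - p) / bw) ** 2) for p in points) for g in grid]

def draw_within(lo, hi, dens, grid):
    """Draw one grid point inside [lo, hi) with probability proportional to the KDE."""
    idx = [i for i, g in enumerate(grid) if lo <= g < hi]
    r = random.random() * sum(dens[i] for i in idx)
    for i in idx:
        r -= dens[i]
        if r <= 0:
            return grid[i]
    return grid[idx[-1]]

# Step 0: initialise a metric pseudo sample uniformly within each interval.
sample = [random.uniform(lo, hi) for (lo, hi), c in zip(bins, counts) for _ in range(c)]
grid = [i * 25.0 for i in range(201)]  # evaluation grid on [0, 5000]

# Iterate: estimate the density, then redraw each point within its own
# interval from the current estimate, letting the shape sharpen over rounds.
for _ in range(5):
    bw = 1.06 * statistics.stdev(sample) * len(sample) ** -0.2  # Silverman's rule
    dens = kde(sample, grid, bw)
    sample = [draw_within(lo, hi, dens, grid)
              for (lo, hi), c in zip(bins, counts) for _ in range(c)]

# Example indicator from the pseudo sample: at-risk-of-poverty rate
# (share of the sample below 60 % of the median).
med = statistics.median(sample)
arpr = sum(1 for x in sample if x < 0.6 * med) / len(sample)
print(f"median income (pseudo sample): {med:.0f}")
print(f"at-risk-of-poverty rate      : {arpr:.3f}")
```

The point of the pseudo sample is exactly this last step: once a metric sample exists, indicators with no closed form for grouped data (medians, quantile ratios, Gini) can be computed with their standard formulas.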
Open Access

Data Collection Expert Prior Elicitation in Survey Design: Two Case Studies

Published online: 14 Jun 2022
Page range: 637 - 662

Abstract

Data collection staff involved in the sampling design, monitoring, and analysis of surveys often have a good sense of the response rate that can be expected in a survey, even when the survey is new or conducted at a relatively low frequency. They form expectations about response rates, and subsequently costs, on an almost continuous basis. Rarely, however, are these expectations formally structured. Furthermore, the expectations are usually point estimates without any assessment of precision or uncertainty.

In recent years, interest in adaptive survey designs has increased. These designs lean heavily on accurate estimates of response rates and costs. In order to account for inaccurate estimates, a Bayesian analysis of survey design parameters is very sensible.

Combining the strong intrinsic knowledge of data collection staff with a Bayesian analysis is a natural next step. In this article, prior elicitation is developed for design parameters with the help of data collection staff. The elicitation is applied to two case studies in which surveys underwent a major redesign and direct historic survey data were unavailable.

Keywords

  • Nonresponse bias
  • Bayesian
  • response propensity
  • expert elicitation
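A minimal version of the idea, turning an elicited response-rate expectation into a Beta prior and updating it with fieldwork data, can be sketched as follows. The elicited numbers, the moment-matching rule, and the wave outcome are all invented for illustration:

```python
import math

# Hypothetical elicitation: staff expect a 55 % response rate and would be
# surprised by a deviation of more than about 5 points. Encode this as a
# Beta prior by matching the elicited mean and standard deviation.
mean, sd = 0.55, 0.05
nu = mean * (1 - mean) / sd**2 - 1   # "prior sample size" implied by the spread
a, b = mean * nu, (1 - mean) * nu

# After the first fieldwork wave: 160 respondents out of 300 sampled cases.
# A Beta prior with binomial data gives a Beta posterior in closed form.
resp, n = 160, 300
a_post, b_post = a + resp, b + (n - resp)

post_mean = a_post / (a_post + b_post)
post_sd = math.sqrt(post_mean * (1 - post_mean) / (a_post + b_post + 1))
print(f"prior:     Beta({a:.1f}, {b:.1f}), mean {mean:.3f}")
print(f"posterior: Beta({a_post:.1f}, {b_post:.1f}), mean {post_mean:.3f} +/- {post_sd:.3f}")
```

The posterior mean sits between the expert expectation and the observed rate, weighted by the implied prior sample size, which is exactly what makes elicited priors useful when direct historic data are missing.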
Open Access

Rejoinder: Measuring Inflation under Pandemic Conditions

Published online: 14 Jun 2022
Page range: 663 - 668

Open Access

Book Review

Published online: 14 Jun 2022
Page range: 669 - 671
