Journal and Edition

Volume 39 (2023): Edition 3 (September 2023)

Volume 39 (2023): Edition 2 (June 2023)

Volume 39 (2023): Edition 1 (March 2023)

Volume 38 (2022): Edition 4 (December 2022)
Special Edition on Respondent Burden

Volume 38 (2022): Edition 3 (September 2022)

Volume 38 (2022): Edition 2 (June 2022)

Volume 38 (2022): Edition 1 (March 2022)
Special Edition on Price Indices in Official Statistics

Volume 37 (2021): Edition 4 (December 2021)

Volume 37 (2021): Edition 3 (September 2021)
Special Edition on Population Statistics for the 21st Century

Volume 37 (2021): Edition 2 (June 2021)
Special Edition on New Techniques and Technologies for Statistics

Volume 37 (2021): Edition 1 (March 2021)

Volume 36 (2020): Edition 4 (December 2020)

Volume 36 (2020): Edition 3 (September 2020)
Special Edition on Nonresponse

Volume 36 (2020): Edition 2 (June 2020)

Volume 36 (2020): Edition 1 (March 2020)

Volume 35 (2019): Edition 4 (December 2019)
Special Edition on Measuring LGBT Populations

Volume 35 (2019): Edition 3 (September 2019)

Volume 35 (2019): Edition 2 (June 2019)

Volume 35 (2019): Edition 1 (March 2019)

Volume 34 (2018): Edition 4 (December 2018)

Volume 34 (2018): Edition 3 (September 2018)
Special Section on Responsive and Adaptive Survey Design

Volume 34 (2018): Edition 2 (June 2018)
Special Edition on Establishment Surveys (ICES-V)

Volume 34 (2018): Edition 1 (March 2018)

Volume 33 (2017): Edition 4 (December 2017)

Volume 33 (2017): Edition 3 (September 2017)
Special Edition on Responsive and Adaptive Survey Design

Volume 33 (2017): Edition 2 (June 2017)
Special Edition on Total Survey Error (TSE)

Volume 33 (2017): Edition 1 (March 2017)

Volume 32 (2016): Edition 4 (December 2016)
Special Section on the Role of Official Statistics in Statistical Capacity Building

Volume 32 (2016): Edition 3 (September 2016)

Volume 32 (2016): Edition 2 (June 2016)

Volume 32 (2016): Edition 1 (March 2016)

Volume 31 (2015): Edition 4 (December 2015)

Volume 31 (2015): Edition 3 (September 2015)
Special Edition on Coverage Problems in Administrative Sources

Volume 31 (2015): Edition 2 (June 2015)
Special Edition on New Techniques and Technologies for Statistics

Volume 31 (2015): Edition 1 (March 2015)

Volume 30 (2014): Edition 4 (December 2014)
Special Edition on Establishment Surveys

Volume 30 (2014): Edition 3 (September 2014)

Volume 30 (2014): Edition 2 (June 2014)
Special Edition on Surveying the Hard-to-Reach

Volume 30 (2014): Edition 1 (March 2014)

Volume 29 (2013): Edition 4 (December 2013)

Volume 29 (2013): Edition 3 (September 2013)

Volume 29 (2013): Edition 2 (June 2013)

Volume 29 (2013): Edition 1 (March 2013)

Journal details
Format: Journal
eISSN: 2001-7367
First published: 01 Oct 2013
Publication frequency: 4 times per year
Languages: English

Volume 32 (2016): Edition 2 (June 2016)

Open Access

On a Modular Approach to the Design of Integrated Social Surveys

Published online: 28 May 2016
Pages: 259 - 286

Abstract

This article considers a modular approach to the design of integrated social surveys. The approach consists of grouping variables into ‘modules’, each of which is then allocated to one or more ‘instruments’. Each instrument is then administered to a random sample of population units, and each sample unit responds to all modules of the instrument. This approach offers a way of designing a system of integrated social surveys that balances the need to limit the cost and the need to obtain sufficient information. The allocation of the modules to instruments draws on the methodology of split questionnaire designs. The composition of the instruments, that is, how the modules are allocated to instruments, and the corresponding sample sizes are obtained as a solution to an optimisation problem. This optimisation involves minimisation of respondent burden and data collection cost, while respecting certain design constraints usually encountered in practice. These constraints may include, for example, the level of precision required and dependencies between the variables. We propose using a random search algorithm to find approximate optimal solutions to this problem. The algorithm is proved to fulfil conditions that ensure convergence to the global optimum and can also produce an efficient design for a split questionnaire.

Keywords

  • Efficient design
  • respondent burden
  • sample allocation
  • simulated annealing
  • split questionnaire

Open Access

Discussion

Published online: 28 May 2016
Pages: 287 - 289

Open Access

Discussion

Published online: 28 May 2016
Pages: 291 - 294

Open Access

Discussion

Published online: 28 May 2016
Pages: 295 - 300

Open Access

Rejoinder

Published online: 28 May 2016
Pages: 301 - 305

Open Access

The Impact of Question Format, Context, and Content on Survey Answers in Early and Late Adolescence

Published online: 28 May 2016
Pages: 307 - 328

Abstract

Self-reports in surveys are often influenced by the presented question format and question context. Much less is known about how these effects influence the answers of younger survey respondents. The present study investigated how variations in response format, answer scale frequency, and question order influence self-reports of two age groups: younger (11–13 years old) and older (16–18 years old) adolescents. In addition, the impact of the respondents’ level of familiarity with the question content was taken into account. Results indicated that younger adolescents are more strongly influenced by the presented question format and context than older adolescents. This, however, was dependent on the particular question content, implying that response effects are more pronounced when questions deal with issues that lie outside of the respondents’ field of experience. Implications of these findings in survey research with younger respondents are discussed.

Keywords

  • Attitude judgments
  • question order effects
  • social influence
  • survey methodology
  • younger survey respondents

Open Access

End User Licence to Open Government Data? A Simulated Penetration Attack on Two Social Survey Datasets

Published online: 28 May 2016
Pages: 329 - 348

Abstract

In the UK, the transparency agenda is forcing data stewardship organisations to review their dissemination policies and to consider whether to release data that is currently only available to a restricted community of researchers under licence as open data. Here we describe the results of a study providing evidence about the risks of such an approach via a simulated attack on two social survey datasets. This is also the first systematic attempt to simulate a jigsaw identification attack (one using a mashup of multiple data sources) on an anonymised dataset. The information that we draw on is collected from multiple online data sources and purchasable commercial data. The results indicate that such an attack against anonymised end user licence (EUL) datasets, if converted into open datasets, is possible and therefore we would recommend that penetration tests should be factored into any decision to make datasets (that are about people) open.

Open Access

Interviewer Effects on a Network-Size Filter Question

Published online: 28 May 2016
Pages: 349 - 373

Abstract

There is evidence that survey interviewers may be tempted to manipulate answers to filter questions in a way that minimizes the number of follow-up questions. This becomes relevant when ego-centered network data are collected. The reported network size has a huge impact on interview duration if multiple questions on each alter are triggered. We analyze interviewer effects on a network-size question in the mixed-mode survey “Panel Study ‘Labour Market and Social Security’” (PASS), where interviewers could skip up to 15 follow-up questions by generating small networks. Applying multilevel models, we find almost no interviewer effects in CATI mode, where interviewers are paid by the hour and frequently supervised. In CAPI, however, where interviewers are paid by case and no close supervision is possible, we find strong interviewer effects on network size. As the area-specific network size is known from telephone mode, where allocation to interviewers is random, interviewer and area effects can be separated. Furthermore, a difference-in-difference analysis reveals the negative effect of introducing the follow-up questions in Wave 3 on CAPI network size. Attempting to explain interviewer effects we neither find significant main effects of experience within a wave, nor significantly different slopes between interviewers.

Keywords

  • Partial falsification
  • network generator
  • filter questions
  • interviewer cheating

Open Access

The FEWS Index: Fixed Effects with a Window Splice

Published online: 28 May 2016
Pages: 375 - 404

Abstract

This article describes the estimation of quality-adjusted price indexes from ‘big data’ such as scanner and online data when there is no available information on product characteristics for explicit quality adjustment using hedonic regression. The longitudinal information can be exploited to implicitly quality-adjust the price indexes. The fixed-effects (or ‘time-product dummy’) index is shown to be equivalent to a fully interacted time-dummy hedonic index based on all price-determining characteristics of the products, despite those characteristics not being observed. In production, this can be combined with a modified approach to splicing that incorporates the price movement across the full estimation window to reflect new products with one period’s lag without requiring revision. Empirical results for this fixed-effects window-splice (FEWS) index are presented for different data sources: three years of New Zealand consumer electronics scanner data from market-research company GfK; six years of United States supermarket scanner data from market-research company IRI; and 15 months of New Zealand consumer electronics daily online data from MIT’s Billion Prices Project.

Keywords

  • Big data
  • scanner data
  • online data
  • hedonic regression
  • quality adjustment

Open Access

“Do the Germans Really Work Six Weeks More than the French?” – Measuring Working Time with the Labour Force Survey in France and Germany

Published online: 28 May 2016
Pages: 405 - 431

Abstract

Measuring working time is not only an important objective of the EU Labour Force Survey (LFS), but also a highly demanding task in terms of methodology. Against the background of a recent debate on the comparability of working time estimates in France and Germany, this article presents a comparative assessment of the measurement of working time in the Labour Force Survey obtained in both countries. It focuses on the measurement of the hours actually worked, the key working-time concept for short-term economic analysis and the National Accounts. The contribution systematically analyses the differences in the measurement approaches used in France and Germany in order to identify the methodological effects that hinder comparability. It comes to the conclusion that the LFS overstates the difference in hours actually worked in France and Germany and identifies question comprehension, rounding, editing effects, as well as certain aspects of the sampling design, as crucial factors of a reliable measurement in particular of absences from work during the reference week. We recommend continuing the work started in the European Statistical System towards the development of a model questionnaire in order to improve cross-national harmonisation of key variables such as hours actually worked.

Keywords

  • Nonsampling errors
  • measurement error
  • questionnaire design
  • working hours
  • international comparability

Open Access

Random Walks on Directed Networks: Inference and Respondent-Driven Sampling

Published online: 28 May 2016
Pages: 433 - 459

Abstract

Respondent-driven sampling (RDS) is often used to estimate population properties (e.g., sexual risk behavior) in hard-to-reach populations. In RDS, already sampled individuals recruit population members to the sample from their social contacts in an efficient snowball-like sampling procedure. By assuming a Markov model for the recruitment of individuals, asymptotically unbiased estimates of population characteristics can be obtained. Current RDS estimation methodology assumes that the social network is undirected, that is, all edges are reciprocal. However, empirical social networks in general also include a substantial number of nonreciprocal edges. In this article, we develop an estimation method for RDS in populations connected by social networks that include reciprocal and nonreciprocal edges. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing edges of sampled individuals. The proposed estimators are evaluated on artificial and empirical networks and are shown to generally perform better than existing estimators. This is the case in particular when the fraction of directed edges in the network is large.

Keywords

  • Hidden population
  • social network
  • nonreciprocal relationship
  • Markov model

Open Access

Modernizing a Major Federal Government Survey: A Review of the Redesign of the Current Population Survey Health Insurance Questions

Published online: 28 May 2016
Pages: 461 - 486

Abstract

Measurement error can be very difficult to assess and reduce. While great strides have been made in the field of survey methods research in recent years, many ongoing federal surveys were initiated decades ago, before testing methods were fully developed. However, the longer a survey is in use, the more established the time series becomes, and any change to a questionnaire risks a break in that time series. This article documents how a major federal survey – the health insurance module of the Current Population Survey (CPS) – was redesigned over the course of 15 years through a systematic series of small, iterative tests, both qualitative and quantitative. This overview summarizes those tests and results, and illustrates how particular questionnaire design features were identified as problematic, and how improvements were developed and evaluated. While the particular topic is health insurance, the general approach (a coordinated series of small tests), along with the specific tests and methods employed, are not uniquely applicable to health insurance. Furthermore, the particular questionnaire design features of the CPS health module that were found to be most problematic are used in many other major surveys on a range of topic areas.

Keywords

  • Health reform
  • questionnaire redesign
  • health insurance
  • CPS

Open Access

Misspecification Effects in the Analysis of Panel Data

Published online: 28 May 2016
Pages: 487 - 505

Abstract

Misspecification effects (meffs) measure the effect on the sampling variance of an estimator of incorrect specification of both the sampling scheme and the model considered. We assess the effect of various features of complex sampling schemes on the inferences drawn from models for panel data using meffs. Many longitudinal social survey designs employ multistage sampling, leading to some clustering, which tends to lead to meffs greater than unity. An empirical study using data from the British Household Panel Survey is conducted, and a simulation study is performed. Our results suggest that clustering impacts are stronger for longitudinal studies than for cross-sectional studies, and that meffs for the regression coefficients increase with the number of waves analysed. Hence, estimated standard errors in the analysis of panel data can be misleading if any clustering is ignored.

Keywords

  • Longitudinal survey
  • sampling variance
  • multistage sampling
  • stratification
  • weighting

Open Access

Weight Smoothing for Generalized Linear Models Using a Laplace Prior

Published online: 28 May 2016
Pages: 507 - 539

Abstract

When analyzing data sampled with unequal inclusion probabilities, correlations between the probability of selection and the sampled data can induce bias if the inclusion probabilities are ignored in the analysis. Weights equal to the inverse of the probability of inclusion are commonly used to correct possible bias. When weights are uncorrelated with the descriptive or model estimators of interest, highly disproportional sample designs resulting in large weights can introduce unnecessary variability, leading to an overall larger mean square error compared to unweighted methods.

We describe an approach we term ‘weight smoothing’ that models the interactions between the weights and the estimators as random effects, reducing the root mean square error (RMSE) by shrinking interactions toward zero when such shrinkage is allowed by the data. This article adapts a flexible Laplace prior distribution for the hierarchical Bayesian model to gain a more robust bias-variance tradeoff than previous approaches using normal priors. Simulation and application suggest that under a linear model setting, weight-smoothing models with Laplace priors yield robust results when weighting is necessary, and provide considerable reduction in RMSE otherwise. In logistic regression models, estimates using weight-smoothing models with Laplace priors are robust, but with less gain in efficiency than in linear regression settings.

Keywords

  • Weight trimming
  • winsorization
  • Bayesian finite population inference
  • Hierarchical models

Open Access

Book Review

Published online: 28 May 2016
Pages: 541 - 544

Open Access

Book Review

Published online: 28 May 2016
Pages: 545 - 547
