Volume 37 (2021): Issue 3 (September 2021)
Special Issue on Population Statistics for the 21st Century
Journal Details
Format: Journal
First Published: 01 Oct 2013
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

A Simulation Study of Diagnostics for Selection Bias

Published Online: 13 Sep 2021
Page range: 751 - 769
Received: 01 Jul 2019
Accepted: 01 Nov 2020
Abstract

A non-probability sampling mechanism arising from nonresponse or non-selection is likely to bias estimates of parameters with respect to a target population of interest. This bias poses a unique challenge when selection is ‘non-ignorable’, that is, dependent on the unobserved outcome of interest, since it is then undetectable and thus cannot be ameliorated. We extend a simulation study by Nishimura et al. (2016), adding two recently published statistics: the ‘standardized measure of unadjusted bias’ (SMUB) and the ‘standardized measure of adjusted bias’ (SMAB). These statistics explicitly quantify the extent of bias (in the case of SMUB) or non-ignorable bias (in the case of SMAB) under the assumption that a specified amount of non-ignorable selection exists. Our findings suggest that this new sensitivity diagnostic is more correlated with, and more predictive of, the true, unknown extent of selection bias than other diagnostics, even when the underlying assumed level of non-ignorability is incorrect.

References

Albert, A., and J. Anderson. 1984. “On the existence of maximum likelihood estimates in logistic regression models.” Biometrika 71: 1–10. DOI: https://doi.org/10.2307/2336390.

Andridge, R.R., and R.J. Little. 2011. “Proxy pattern-mixture analysis for survey nonresponse.” Journal of Official Statistics 27: 153–180. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/proxy-pattern-mixture-analysis-for-survey-nonresponse.pdf (accessed May 2021).

Andridge, R.R., and R.J. Little. 2020. “Proxy pattern-mixture analysis for a binary variable subject to nonresponse.” Journal of Official Statistics. DOI: https://doi.org/10.2478/jos-2020-0035.

Bootsma-van der Wiel, A.V., E. Van Exel, A. De Craen, J. Gussekloo, A. Lagaay, D. Knook, and R. Westendorp. 2002. “A high response is not essential to prevent selection bias: results from the Leiden 85-plus study.” Journal of Clinical Epidemiology 55: 1119–1125. DOI: https://doi.org/10.1016/s0895-4356(02)00505-x.

Brick, J.M., and D. Williams. 2013. “Explaining rising nonresponse rates in cross-sectional surveys.” The Annals of the American Academy of Political and Social Science 645: 36–59. DOI: https://doi.org/10.1177%2F0002716212456834.

Heckman, J.J. 1979. “Sample selection bias as a specification error.” Econometrica 47: 153–161. DOI: https://doi.org/10.2307/1912352.

Little, R.J. 1994. “A class of pattern-mixture models for normal incomplete data.” Biometrika 81: 471–483. DOI: https://doi.org/10.2307/2337120.

Little, R.J., and D.B. Rubin. 2002. Statistical Analysis with Missing Data. John Wiley & Sons, Hoboken, NJ, 2nd edition.

Little, R.J., B.T. West, P. Boonstra, and J. Hu. 2020. “Measures of the degree of departure from ignorable sample selection.” Journal of Survey Statistics and Methodology 8: 932–964. DOI: https://doi.org/10.1093/jssam/smz023.

Mukherjee, B., and N. Chatterjee. 2008. “Exploiting gene-environment independence for analysis of case-control studies: An empirical Bayes-type shrinkage estimator to tradeoff between bias and efficiency.” Biometrics 64: 685–694. DOI: https://doi.org/10.1111/j.1541-0420.2007.00953.x.

Nagelkerke, N.J. 1991. “A note on a general definition of the coefficient of determination.” Biometrika 78: 691–692. DOI: https://doi.org/10.1093/biomet/78.3.691.

Nishimura, R., J. Wagner, and M. Elliott. 2016. “Alternative indicators for the risk of non-response bias: a simulation study.” International Statistical Review 84: 43–62. DOI: https://doi.org/10.1111/insr.12100.

Presser, S., and S. McCulloch. 2011. “The growth of survey research in the United States: Government-sponsored surveys, 1984–2004.” Social Science Research 40: 1019–1024. DOI: https://doi.org/10.1016/j.ssresearch.2011.04.004.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Rubin, D.B. 1976. “Inference and missing data.” Biometrika 63: 581–592. DOI: https://doi.org/10.2307/2335739.

Rubin, D.B. 2004. Multiple imputation for nonresponse in surveys, volume 81. John Wiley & Sons.

Särndal, C.-E., and S. Lundström. 2010. “Design for estimation: Identifying auxiliary vectors to reduce nonresponse bias.” Survey Methodology 36: 131–144.

Schouten, B., F. Cobben, J. Bethlehem, et al. 2009. “Indicators for the representativeness of survey response.” Survey Methodology 35: 101–113.

Van Buuren, S., and K. Groothuis-Oudshoorn. 2011. “mice: Multivariate imputation by chained equations in R.” Journal of Statistical Software 45: 1–67.

Wickham, H. 2017. tidyverse: Easily install and load the ‘tidyverse’. R package version 1.2.1.

Williams, D., and J.M. Brick. 2018. “Trends in US face-to-face household survey nonresponse and level of effort.” Journal of Survey Statistics and Methodology 6: 186–211. DOI: https://doi.org/10.1093/jssam/smx019.
