Volume 35 (2019): Issue 3 (September 2019)
Journal Details
First Published
01 Oct 2013
Publication timeframe
4 times per year
Access type: Open Access

Supplementing Small Probability Samples with Nonprobability Samples: A Bayesian Approach

Published Online: 09 Sep 2019
Page range: 653 - 681
Received: 01 Nov 2018
Accepted: 01 Apr 2019

Carefully designed probability-based sample surveys can be prohibitively expensive to conduct. As such, many survey organizations have shifted away from using expensive probability samples in favor of less expensive, but possibly less accurate, nonprobability web samples. However, the lower costs and abundant availability of nonprobability samples make them a potentially useful supplement to traditional probability-based samples. We examine this notion by proposing a method of supplementing small probability samples with nonprobability samples using Bayesian inference. We consider two semi-conjugate informative prior distributions for linear regression coefficients based on nonprobability samples: the first accounts for the distance between maximum likelihood coefficients derived from parallel probability and nonprobability samples, and the second depends on the variability and size of the nonprobability sample. The method is evaluated in comparison with a reference prior through simulations and a real-data application involving multiple probability and nonprobability surveys fielded simultaneously using the same questionnaire. We show that the method reduces the variance and mean-squared error (MSE) of coefficient estimates and model-based predictions relative to probability-only samples. Using actual and assumed cost data, we also show that the method can yield substantial cost savings (up to 55%) for a fixed MSE.
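The general idea of using a nonprobability sample as an informative prior for regression coefficients can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain normal prior centered on the nonprobability-sample estimates, with prior covariance equal to that sample's estimated coefficient covariance, and it holds the residual variance fixed at a plug-in value rather than giving it its own prior as a semi-conjugate setup would. The simulated data, the sample sizes, and the function names `ols`, `posterior_normal`, and `simulate` are all hypothetical.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, their estimated covariance, and the residual variance."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)
    return beta, s2 * XtX_inv, s2

def posterior_normal(X, y, prior_mean, prior_cov, sigma2):
    """Posterior of beta under beta ~ N(prior_mean, prior_cov), with sigma2 held fixed
    (a simplifying assumption made for this sketch)."""
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / sigma2)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / sigma2)
    return post_mean, post_cov

def simulate(n, beta, rng):
    """Simulate an intercept-plus-one-covariate linear model with unit noise variance."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta + rng.normal(size=n)
    return X, y

rng = np.random.default_rng(0)
# Hypothetical data: a small probability sample and a large, slightly biased
# nonprobability sample (its true coefficients are shifted on purpose).
X_p, y_p = simulate(50, np.array([1.0, 2.0]), rng)
X_np, y_np = simulate(2000, np.array([1.2, 2.1]), rng)

b_p, cov_p, s2_p = ols(X_p, y_p)    # probability-only estimate and covariance
b_np, cov_np, _ = ols(X_np, y_np)   # nonprobability estimate used to build the prior

post_mean, post_cov = posterior_normal(X_p, y_p, b_np, cov_np, s2_p)

print("probability-only variances:", np.diag(cov_p))
print("posterior variances:       ", np.diag(post_cov))
```

In this sketch the posterior variances are always smaller than the probability-only variances, because the prior adds precision. The posterior mean, however, is pulled toward the nonprobability estimates, so any bias in that sample carries over; a prior that widens as the two samples' estimates diverge, as in the first prior the abstract describes, is one way to limit that effect.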


