Volume 37 (2021): Issue 3 (September 2021)
    Special Issue on Population Statistics for the 21st Century
Journal Details
Format: Journal
First Published: 01 Oct 2013
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

A General Framework for Multiple-Recapture Estimation that Incorporates Linkage Error Correction

Published Online: 13 Sep 2021
Page range: 699 - 718
Received: 01 Jun 2019
Accepted: 01 Nov 2020
Abstract

The size of a partly observed population is often estimated with the capture-recapture model. An important assumption of this model is that sources can be perfectly linked. This assumption matters when records are identified not by a perfect identifier (such as an id code) but by indirect identifiers (such as name and address). In that case, the perfect linkage assumption is often violated, which in general leads to biased population size estimates. Initial proposals to address this problem use record linkage probabilities to correct the capture-recapture model. In this article we provide a general framework, based on the standard log-linear modelling approach, that generalises this work towards the inclusion of additional sources and covariates. We show that the method performs well in a simulation study.
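To make the bias mechanism concrete, the following is a minimal sketch of the classical two-source (Lincoln-Petersen) estimator and of how missed links inflate it. It is a simplified illustration, not the log-linear framework of the article: the function names, the parameter `link_sensitivity`, and all numbers are hypothetical, and the correction assumes that true matches are linked with a known probability and that no false links occur.

```python
def lincoln_petersen(n1: float, n2: float, m: float) -> float:
    """Two-source population size estimate: n1 * n2 / m.

    n1, n2: number of records in each source; m: number of linked records.
    """
    if m <= 0:
        raise ValueError("at least one linked record is required")
    return n1 * n2 / m


def corrected_matches(m_observed: float, link_sensitivity: float) -> float:
    """Hypothetical correction: scale the observed links by the assumed
    probability that a true match is actually linked (no false links)."""
    if not 0 < link_sensitivity <= 1:
        raise ValueError("link_sensitivity must lie in (0, 1]")
    return m_observed / link_sensitivity


# Illustrative numbers only (not taken from the article).
n1, n2 = 500, 400
true_matches = 200       # links under perfect linkage
observed_matches = 180   # links found when 10% of true matches are missed
link_sensitivity = 0.9   # assumed known probability of linking a true match

perfect = lincoln_petersen(n1, n2, true_matches)
naive = lincoln_petersen(n1, n2, observed_matches)
corrected = lincoln_petersen(
    n1, n2, corrected_matches(observed_matches, link_sensitivity)
)

print(f"perfect linkage:  {perfect:.1f}")
print(f"missed links:     {naive:.1f}")
print(f"after correction: {corrected:.1f}")
```

With these assumed inputs, the estimate based on the observed links exceeds the perfect-linkage estimate, because fewer links make the two sources appear to overlap less than they do. Rescaling the link count by the assumed linkage probability recovers the perfect-linkage value in this idealised setting.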

