Volume 38 (2022): Issue 2 (June 2022)
Journal details
First published: 01 Oct 2013
Publication frequency: 4 times per year
Access type: Open access

Iterative Kernel Density Estimation Applied to Grouped Data: Estimating Poverty and Inequality Indicators from the German Microcensus

Published online: 14 Jun 2022
Volume & Issue: Volume 38 (2022) - Issue 2 (June 2022)
Pages: 599 - 635
Received: 01 Jun 2020
Accepted: 01 Feb 2022

The estimation of poverty and inequality indicators based on survey data is straightforward as long as the variable of interest (e.g., income or consumption) is measured on a metric scale. However, standard formulas cannot be applied directly when the income variable is grouped, whether due to confidentiality constraints or to reduce item nonresponse. We propose an iterative kernel density algorithm that generates metric pseudo samples from the grouped variable for the estimation of indicators. The corresponding standard errors are estimated by a non-parametric bootstrap that accounts for the additional uncertainty due to the grouping. The algorithm enables the use of survey weights and household equivalence scales. The proposed method is applied to the German Microcensus to estimate the regional distribution of poverty and inequality in Germany.
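
To make the idea of generating metric pseudo samples from grouped data more concrete, the following Python sketch alternates between a kernel density estimate and a redraw of pseudo values within each income interval. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian kernel, the normal-reference bandwidth, the fixed number of iterations, the finite upper bound of the top interval, and the example intervals and counts are all hypothetical choices made here. Survey weights, equivalence scales, and the bootstrap for standard errors are left out.

```python
import math
import random
from statistics import median, stdev

def kde_on_grid(sample, grid, bandwidth):
    """Gaussian kernel density estimate of `sample` at each grid point."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    return [sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample) / norm
            for g in grid]

def iterative_kde_pseudo_sample(bounds, counts, n_iter=15, grid_size=200, seed=0):
    """Generate pseudo values for grouped data (hypothetical helper).

    bounds: interval edges, all finite (an assumption of this sketch).
    counts: number of observations in each interval.
    """
    rng = random.Random(seed)
    step = (bounds[-1] - bounds[0]) / grid_size
    grid = [bounds[0] + (i + 0.5) * step for i in range(grid_size)]  # cell midpoints
    intervals = list(zip(bounds, bounds[1:], counts))
    # Start by spreading each group's observations uniformly over its interval.
    sample = [rng.uniform(a, b) for a, b, c in intervals for _ in range(c)]
    for _ in range(n_iter):
        # Normal-reference bandwidth (an assumption; other selectors are possible).
        bandwidth = 1.06 * stdev(sample) * len(sample) ** (-0.2)
        density = kde_on_grid(sample, grid, bandwidth)
        sample = []
        for a, b, c in intervals:
            if c == 0:
                continue
            cells = [(g, d) for g, d in zip(grid, density) if a <= g < b]
            # Redraw this group's values in proportion to the density inside [a, b).
            picks = rng.choices([g for g, _ in cells],
                                weights=[d for _, d in cells], k=c)
            # Jitter within the grid cell, then clamp to the interval.
            sample.extend(min(max(g + rng.uniform(-step / 2, step / 2), a), b)
                          for g in picks)
    return sample

def at_risk_of_poverty_rate(values, fraction=0.6):
    """Share of values below `fraction` times the median."""
    threshold = fraction * median(values)
    return sum(v < threshold for v in values) / len(values)

def gini(values):
    """Gini coefficient of non-negative values, computed from the sorted sample."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n

# Hypothetical grouped income data: six intervals with made-up counts.
bounds = [0, 500, 1000, 1500, 2000, 3000, 5000]
counts = [20, 60, 80, 60, 50, 30]
pseudo = iterative_kde_pseudo_sample(bounds, counts)
print(round(at_risk_of_poverty_rate(pseudo), 3), round(gini(pseudo), 3))
```

Because every pseudo value is redrawn inside its own interval, the observed group counts are preserved in each iteration; only the placement of values within the intervals changes as the density estimate is refined.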

