Survey vs Scraped Data: Comparing Time Series Properties of Web and Survey Vacancy Data

This paper studies the relationship between a vacancy population obtained from web crawling and vacancies in the economy inferred by a National Statistics Office (NSO) using a traditional method. We compare the time series properties of samples obtained between 2007 and 2014 by Statistics Netherlands and by a web scraping company. We find that the web and NSO vacancy data present similar time series properties, suggesting that both time series are generated by the same underlying phenomenon: the real number of new vacancies in the economy. We conclude that, in our case study, web-sourced data are able to capture aggregate economic activity in the labor market.

eISSN: 2193-8997
Language: English

Publication timeframe: Volume Open
Journal Subjects: Business and Economics, Political Economics, Microeconomics, Macroeconomics, Economic Policy, Mathematics and Statistics for Economists, Econometrics

Published Online: Sep 13, 2019

DOI: https://doi.org/10.2478/izajole-2019-0004

Keywords
web crawling, statistical inference, time series, vacancies, labor demand, data collection

© 2019 Pablo de Pedraza, Stefano Visintin, Kea Tijdens and Gábor Kismihók, published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 Public License.
