Transforming Naturally Occurring Text Data Into Economic Statistics: The Case of Online Job Vacancy Postings
Using a dataset of 15 million UK job adverts from a recruitment website, we construct new economic statistics measuring labour market demand. These data are ‘naturally occurring’, having originally been posted online by firms. They offer information on two dimensions of vacancies (region and occupation) that firm-based surveys do not usually, and cannot easily, collect. Because these data do not come with official classification labels, we develop an algorithm that maps the free-form text of job descriptions into standard occupational classification codes. The resulting vacancy statistics give a plausible, granular picture of UK labour demand and permit the analysis of Beveridge curves and mismatch unemployment at the occupational level.
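To illustrate the kind of mapping from free-form job text to occupation codes described above, the following is a minimal sketch only, not the algorithm developed in the chapter. It assumes a bag-of-words representation and cosine similarity against a small reference table; the codes (`CODE_A`, `CODE_B`, `CODE_C`), the reference descriptions, and the function names are all hypothetical placeholders rather than official classification entries.

```python
import math
import re
from collections import Counter

# Hypothetical reference table: placeholder codes mapped to illustrative
# keyword descriptions. These are NOT official classification entries.
REFERENCE_TITLES = {
    "CODE_A": "software developer programmer",
    "CODE_B": "registered nurse hospital ward",
    "CODE_C": "primary school teacher",
}


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(advert_text, reference=REFERENCE_TITLES):
    """Return (code, score) for the most similar reference entry.

    Returns (None, 0.0) when the advert shares no tokens with any entry.
    """
    advert_vec = Counter(tokenize(advert_text))
    scores = {
        code: cosine(advert_vec, Counter(tokenize(description)))
        for code, description in reference.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, 0.0
    return best, scores[best]


if __name__ == "__main__":
    print(classify("Experienced software developer wanted for fintech startup"))
    print(classify("Forklift operator, night shifts"))
```

A production classifier would need a far richer reference vocabulary, term weighting, and handling of misspellings and ambiguous titles; the sketch only shows the shape of the text-to-code step.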
The authors would like to thank Katherine Abraham, James Barker, David Bholat, David Bradnum, Emmet Cassidy, Matthew Corder, Rodrigo Guimaraes, Frances Hill, Tomas Key, Graham Logan, Michaela Morris, Michael Osbourne, Kate Reinold, Paul Robinson, Ayşegül Şahin, Ben Sole, Vincent Sterk, and the participants of the NBER Conference on Big Data in the 21st Century. We would especially like to thank William Abel for his help throughout the project. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research or the Bank of England. This work was funded by the Bank of England.
Transforming Naturally Occurring Text Data into Economic Statistics: The Case of Online Job Vacancy Postings, Arthur Turrell, Bradley Speigner, Jyldyz Djumalieva, David Copple, and James Thurgood. In Big Data for Twenty-First-Century Economic Statistics, Abraham, Jarmin, Moyer, and Shapiro. 2022.