Transforming Naturally Occurring Text Data into Economic Statistics: The Case of Online Job Vacancy Postings
Using a dataset of 15 million UK job adverts from a recruitment website, we construct new economic statistics measuring labour market demand. These data are "naturally occurring," having originally been posted online by firms. They offer information on two dimensions of vacancies (region and occupation) that firm-based surveys do not usually, and cannot easily, collect. Because these data do not come with official classification labels, we develop an algorithm that maps the free-form text of job descriptions into standard occupational classification codes. The resulting vacancy statistics give a plausible, granular picture of UK labour demand and permit the analysis of Beveridge curves and mismatch unemployment at the occupational level.
The views in this work are those of the authors and do not represent the views of the Bank of England or its policy committees. This work was carried out while all of the authors were employed by the Bank of England. We are grateful to Katharine Abraham, James Barker, David Bholat, Emmet Cassidy, Matthew Corder, Daniel Durling, Rodrigo Guimaraes, Frances Hill, Tomas Key, Graham Logan, Michaela Morris, Michael Osbourne, Kate Reinold, Paul Robinson, Ayşegül Şahin, Ben Sole, Vincent Sterk, anonymous reviewers, and conference and seminar participants at the European Economic Association meeting, the American Economic Association meeting, the Royal Statistical Society meeting, the Federal Reserve Board of Governors, the ONS, and the University of Oxford for their comments. We would especially like to thank William Abel and David Bradnum. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.