Transforming Naturally Occurring Text Data into Economic Statistics: The Case of Online Job Vacancy Postings
Using a dataset of 15 million UK job adverts from a recruitment website, we construct new economic statistics measuring labour market demand. These data are "naturally occurring," having originally been posted online by firms. They offer information on two dimensions of vacancies, region and occupation, that firm-based surveys do not usually, and cannot easily, collect. Because these data do not come with official classification labels, we develop an algorithm that maps the free-form text of job descriptions into standard occupational classification codes. The resulting vacancy statistics give a plausible, granular picture of UK labour demand and permit the analysis of Beveridge curves and mismatch unemployment at the occupational level.
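To make the text-to-code mapping concrete, the following is a minimal sketch of one way free-form job text could be assigned an occupational code. It is not the algorithm developed in this paper: it assumes a simple bag-of-words cosine similarity against a tiny keyword lookup. The names `SOC_KEYWORDS`, `tokenize`, `cosine`, and `classify`, the keyword lists, and the three codes shown are illustrative assumptions and should be checked against the official classification index before any real use.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical miniature lookup: occupational code -> descriptive keywords.
# Codes and keywords are illustrative assumptions, not an official index.
SOC_KEYWORDS = {
    "2136": "programmer software developer engineer python java code",
    "2231": "nurse nursing ward patient clinical care hospital",
    "5434": "chef cook kitchen restaurant menu food",
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(job_text):
    """Return (code, score) for the best-matching code, or (None, 0.0)
    when the text shares no token with any keyword list."""
    job_vec = Counter(tokenize(job_text))
    scores = {
        code: cosine(job_vec, Counter(tokenize(keywords)))
        for code, keywords in SOC_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, 0.0
    return best, scores[best]


if __name__ == "__main__":
    for advert in [
        "Experienced Python software developer wanted",
        "Staff nurse for busy hospital ward",
        "Head chef for restaurant kitchen",
    ]:
        print(advert, "->", classify(advert))
```

A production approach would need a far richer vocabulary per code, handling of job titles versus descriptions, and validation against hand-labelled adverts. The sketch only illustrates the general shape of the problem: unlabelled text in, classification code out.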