Abstract

In this paper, we compare different methods for extracting skill demand from the text of job descriptions. We propose the fraction of wage variation explained by the extracted skills as a novel performance metric for comparing methods. Using this metric, we compare the performance of the word-counting method with three different dictionaries against that of three unsupervised topic-modeling techniques: LDA, PLSA, and BERTopic. We apply these methods to a U.K. job board dataset of 1,158,926 job advertisements from 35 industries, collected in 2018. We find that each of the dictionary-based methods explains about 20% of the wage variation across jobs. The topic-modeling techniques perform better: PLSA explains 36.5% of the wage variation and BERTopic explains 32.6%. The best-performing method is LDA, which explains 48.3% of the wage variation. Its disadvantage, however, is that the extracted skills are difficult to interpret.
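As an illustration of the proposed metric, the sketch below computes the fraction of wage variation explained by dictionary-based skill indicators. It assumes the metric corresponds to the R² of an ordinary least squares regression of log wages on the extracted skills; the abstract does not state the exact specification. The skill dictionary, the advertisements, the wages, and the function names `skill_matrix` and `r_squared` are all hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical skill dictionary (an assumption for illustration only).
SKILLS = ["python", "sql", "communication"]


def skill_matrix(ads, skills=SKILLS):
    """Word-counting step: 1.0 if the skill keyword occurs in the ad text.

    Uses crude case-insensitive substring matching; a real pipeline would
    tokenize and normalize the text first.
    """
    return np.array(
        [[1.0 if skill in ad.lower() else 0.0 for skill in skills] for ad in ads]
    )


def r_squared(X, y):
    """Fraction of the variance of y explained by an OLS fit on X plus intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares coefficients
    residuals = y - X1 @ beta
    ss_res = float(residuals @ residuals)           # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())     # total variation
    return 1.0 - ss_res / ss_tot


# Toy data, invented for this example.
ads = [
    "Data analyst with Python and SQL",
    "Sales assistant, strong communication",
    "Backend developer: Python",
    "Database administrator, SQL required",
    "Customer support, communication and SQL",
    "Warehouse operative",
]
log_wage = np.log(np.array([42000.0, 24000.0, 48000.0, 40000.0, 27000.0, 21000.0]))

X = skill_matrix(ads)
r2 = r_squared(X, log_wage)
print(f"Share of log-wage variation explained: {r2:.3f}")
```

Under the same assumption, a topic-modeling method could be scored by replacing `X` with each advertisement's topic weights, which would allow dictionary-based and topic-based methods to be compared on a single scale.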
