The object of the study is context-aware phrase representations. The growing need to automate candidate recruitment and job recommendation has led to the use of text embeddings, which map the semantic content of text into a continuous, high-dimensional vector space. Learning rich, meaningful, context-aware representations of phrases in the human resources domain improves similarity search and matching, which contributes to a more streamlined and effective recruitment process. However, existing approaches do not take context into account when modeling phrases, which calls for improved information analysis technology in this area. This paper proposes marking the beginning and end of each phrase in the text with special tokens. This reduces computational requirements because the representations of all phrases present in a text are computed simultaneously. The effectiveness of the improvement was tested on a new dataset built to compare and evaluate models on the task of modeling phrases in the field of human resources management. The proposed approach to context-aware phrase representation in this field improves computational efficiency by up to 50 % and accuracy by up to 10 %. An architecture for a machine learning model that creates context-aware phrase representations was developed; it is characterized by blocks that take phrase boundaries into account. Experiments and comparisons with existing approaches confirmed the effectiveness of the proposed solution. In practice, the proposed information analysis technology can be used to automate the identification and normalization of candidates' skills in online recruiting.
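The boundary-marking idea can be illustrated with a minimal sketch. It is not the paper's implementation: the marker strings `[P]` and `[/P]`, the helper names `mark_phrases` and `gather_phrase_vectors`, and the choice to read each phrase representation from the position of its start marker are all assumptions made for illustration, and the encoder is replaced by toy per-token vectors.

```python
from typing import List, Tuple

# Hypothetical marker strings; the abstract does not name the actual special tokens.
START, END = "[P]", "[/P]"


def mark_phrases(tokens: List[str], spans: List[Tuple[int, int]]):
    """Wrap each (start, end) span (end exclusive) in boundary markers.

    Returns the marked token list and, for each span, the index of its
    START marker in that list. Spans must be sorted and non-overlapping.
    """
    marked: List[str] = []
    marker_positions: List[int] = []
    cursor = 0
    for start, end in spans:
        if start < cursor or end <= start or end > len(tokens):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        marked.extend(tokens[cursor:start])   # text before the phrase
        marker_positions.append(len(marked))  # where START will sit
        marked.append(START)
        marked.extend(tokens[start:end])      # the phrase itself
        marked.append(END)
        cursor = end
    marked.extend(tokens[cursor:])            # remaining text
    return marked, marker_positions


def gather_phrase_vectors(hidden, marker_positions):
    """Pick one vector per phrase from a single pass over the marked text.

    Assumption: the vector at the START marker stands in for the phrase.
    """
    return [hidden[i] for i in marker_positions]


tokens = "experienced in project management and data analysis".split()
spans = [(2, 4), (5, 7)]  # "project management", "data analysis"
marked, positions = mark_phrases(tokens, spans)

# Stand-in for encoder output: one toy vector per token of the marked text.
hidden = [[float(i), float(len(tok))] for i, tok in enumerate(marked)]
phrase_vecs = gather_phrase_vectors(hidden, positions)
```

Because every phrase is marked in the same sequence, one pass over the marked text yields a vector for each phrase, instead of one pass per phrase. In a real model, `hidden` would come from a contextual encoder whose vocabulary includes the two marker tokens.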