Machine Learning on Wikipedia Text for the Automatic Identification of Vocational Domains of Significance for Displaced Communities

Maria Nefeli Nikiforos, Adamantia Pateli, Konstantina Deliveri, Katia Lida Kermanidis

doi:10.1109/smap56125.2022.9941803

Abstract

Despite their educational level and professional qualifications, a significant percentage of highly-skilled migrants and refugees throughout the world find employment in low-skill vocations. Typical vocational domains include agriculture, cooking, crafting, construction, and hospitality. As a first step towards developing an educational tool that helps such underprivileged communities become acquainted with the sublanguage of their vocational domain in their host country, this paper attempts automatic domain identification among these five domains using domain-specific textual data. Wikis and social networks provide a valuable data source for data mining, Natural Language Processing, and machine learning tasks. Wikipedia articles on these domains were collected and processed to create a novel text data set. Extracted linguistic features were used in experiments with Random Forest combined with AdaBoost, and with Gradient Boosted Trees. The machine learning models achieved high performance in vocational domain identification (up to 99.93% accuracy).
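As a rough illustration of the two classifier families named in the abstract, the following is a minimal sketch using scikit-learn. It is not the authors' pipeline: TF-IDF vectors stand in for the paper's linguistic features (which the abstract does not enumerate), the toy sentences and the helper names `build_models` and `fit_and_predict` are hypothetical, and all hyperparameters are arbitrary.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy sentences, two per domain. These are NOT the paper's data.
TOY_TEXTS = [
    "The farmer plants wheat and irrigates the field.",
    "Crops are harvested with a tractor in autumn.",
    "Simmer the sauce and season the soup with salt.",
    "The chef bakes bread in a hot oven.",
    "She weaves baskets and carves wood by hand.",
    "Pottery is shaped on a wheel and glazed.",
    "Workers pour concrete and lay bricks for the wall.",
    "The crane lifts steel beams onto the scaffold.",
    "The hotel receptionist checks guests into rooms.",
    "Housekeeping cleans rooms and serves breakfast to guests.",
]
TOY_LABELS = [
    "agriculture", "agriculture",
    "cooking", "cooking",
    "crafting", "crafting",
    "construction", "construction",
    "hospitality", "hospitality",
]


def build_models(seed=0):
    """Return the two model families, each behind a TF-IDF step (an assumption)."""
    # AdaBoost boosting a Random Forest base learner. The base learner is
    # passed positionally because its keyword name differs across
    # scikit-learn versions.
    rf_adaboost = make_pipeline(
        TfidfVectorizer(),
        AdaBoostClassifier(
            RandomForestClassifier(n_estimators=20, random_state=seed),
            n_estimators=5,
            random_state=seed,
        ),
    )
    # Gradient boosted decision trees.
    gbt = make_pipeline(
        TfidfVectorizer(),
        GradientBoostingClassifier(n_estimators=50, random_state=seed),
    )
    return {"rf_adaboost": rf_adaboost, "gbt": gbt}


def fit_and_predict(texts, labels, queries, seed=0):
    """Fit both models on (texts, labels) and predict a domain for each query."""
    predictions = {}
    for name, model in build_models(seed).items():
        model.fit(texts, labels)
        predictions[name] = list(model.predict(queries))
    return predictions
```

Calling `fit_and_predict(TOY_TEXTS, TOY_LABELS, ["We plaster the wall and pour concrete."])` returns one predicted domain label per model. With ten training sentences the predictions carry no evidential weight; the sketch only shows how the two ensembles can be wired to text features.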


