Data Mining Techniques for Analysing Employment Data

Anatoli Nachev

doi:10.35940/ijeat.b3311.129219

Open Access

Abstract

This paper proposes a methodology that uses a large-scale employment dataset to explore which factors affect employment and how. The methodology combines predictive modelling, variable significance analysis, and variable effect characteristic (VEC) analysis. Modelling is based on logistic regression, linear discriminant analysis, neural networks, classification trees, and support vector machines. Following the CRISP-DM standard process model, we train binary classifiers, optimise their hyper-parameters, and measure their performance by prediction accuracy, ROC analysis, and AUC. Using sensitivity analysis, we rank variables by significance in order to identify and measure factors of employment. Using VEC analysis, we further explore how the values of those factors affect employment. Findings show that the best-performing models are neural networks and support vector machines, with a preference for the latter on account of VEC quality. Experiments also suggest that education and age are the primary contributors to correct classification, with specific value distributions discussed in the paper. All results were validated using a rigorous testing procedure that involves training, validation, and test data partitions and combines multiple runs with three-fold cross-validation. This study addresses gaps in previous research publications, which lack quantification of the conclusions made.
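
The sensitivity and VEC steps mentioned in the abstract can be illustrated with a minimal sketch. It assumes a one-dimensional sensitivity analysis: one input is varied over its observed range while the others are held at their mean, the resulting response curve serves as the VEC curve, and the spread of that curve serves as the significance score. The names `toy_model`, `vec_curve`, and `rank_by_sensitivity`, the model weights, and the sample rows are all hypothetical and are not taken from the paper, whose actual tooling and measures may differ.

```python
import math
from statistics import mean


def vec_curve(predict, rows, feature, levels=7):
    """Vary one feature over its observed range, holding the others at their mean.

    Returns (grid, responses): the probed feature values and the model output
    at each of them. Plotting responses against grid gives a VEC-style curve.
    """
    n_features = len(rows[0])
    baseline = [mean(r[j] for r in rows) for j in range(n_features)]
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    grid = [lo + (hi - lo) * k / (levels - 1) for k in range(levels)]
    responses = []
    for value in grid:
        probe = list(baseline)
        probe[feature] = value
        responses.append(predict(probe))
    return grid, responses


def rank_by_sensitivity(predict, rows, levels=7):
    """Score each feature by the range of its responses, normalised to sum to 1."""
    spans = []
    for j in range(len(rows[0])):
        _, responses = vec_curve(predict, rows, j, levels)
        spans.append(max(responses) - min(responses))
    total = sum(spans) or 1.0  # avoid division by zero for a constant model
    return [s / total for s in spans]


def toy_model(x):
    """Hypothetical fitted classifier: P(employed | education_years, age, other)."""
    z = 0.8 * x[0] - 0.05 * x[1] + 0.0 * x[2] - 6.0
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical rows: [education_years, age, other]
rows = [[8, 20, 1], [12, 35, 0], [16, 50, 1], [18, 65, 0]]
importance = rank_by_sensitivity(toy_model, rows)
```

In this toy setup the third input has zero weight, so its score is zero, and the first input dominates the ranking. The range is only one possible spread measure; variance or average gradient of the responses could be substituted in `rank_by_sensitivity`.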

Highlights

In recent years, analysing large or big-data sources has become the focus of many studies related to data mining and knowledge discovery.
With reference to the CRISP-DM modelling stage, this study considers five binary classification algorithms: logistic regression, linear discriminant analysis, neural networks, classification trees, and support vector machines.
We address some gaps in previous research, which lacks quantification of the conclusions made.
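
The abstract names prediction accuracy and AUC as performance measures for these binary classifiers. As a minimal sketch of how the two measures differ, the snippet below computes AUC as the fraction of positive-negative pairs that the scores order correctly (ties count half) and accuracy at a fixed threshold. The function names, the 0.5 threshold, and the labels and scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples whose thresholded score matches the label."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


# Hypothetical labels (1 = employed) and classifier scores
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
```

AUC depends only on how the scores rank the examples, whereas accuracy depends on the chosen threshold, which is one reason to report both.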

Summary

Introduction

In recent years, analysing large or big-data sources has become the focus of many studies related to data mining and knowledge discovery. The knowledge obtained discloses relationships between factors associated with employment and recognises their role. The tools and methodologies used in that analysis become a valuable means of empirically validating hypotheses and theoretical considerations in that domain. This study aims to analyse data from a large-scale nationwide survey of households in Ireland in order to identify employment factors empirically and to find how their values affect employment. A major component of this analysis is building machine learning classification models that fit the data. Classification is one of the most prominent and effective supervised learning methods, and it allows us to explore the role of demographic characteristics, education

Journal: International Journal of Engineering and Advanced Technology
Publication Date: Dec 30, 2019
Citations: 2
License type: cc-by
