Abstract

Data scientists are among the highest-paid and most in-demand employees of the 21st century, which makes it easy for them to switch jobs. In this paper, we follow the Cross-Industry Standard Process for Data Mining (CRISP-DM) approach and the data science life cycle process to analyze factors that predict whether a data scientist is looking for a new job. Specifically, we use machine learning techniques to analyze data from Kaggle.com. We find that the features with the highest impact on whether a data scientist wants to change jobs include the city development index, company size, and company type. When we examine the city development index more closely, we find evidence suggesting that employees move from cities with lower development indexes to cities with higher ones as they become more experienced. The predictive analysis system we use achieves an average accuracy above 78%.

Highlights

  • According to IDC Data Age 2025, the total amount of digital data created worldwide will rise from 163 zettabytes to 175 zettabytes in 2025 (Rethinking Data, 2020)

  • In this research, we follow the Cross-Industry Standard Process for Data Mining (CRISP-DM) approach and the data science life cycle process to analyze factors which predict whether a data scientist is looking for a new job or not

  • We ask the following research questions: RQ 1: Using machine learning techniques to analyze the data scientist dataset, what features have the highest individual correlations with data scientists wanting to look for new jobs?


Summary

Introduction

According to IDC Data Age 2025, the total amount of digital data created worldwide will rise from 163 zettabytes to 175 zettabytes in 2025 (Rethinking Data, 2020). This growth creates opportunities for organizations to analyze large volumes of structured and unstructured data to support business planning and decision making. In this research, we follow the Cross-Industry Standard Process for Data Mining (CRISP-DM) approach and the data science life cycle process to analyze factors that predict whether a data scientist is looking for a new job. We ask the following research questions: RQ 1: Using machine learning techniques to analyze the data scientist dataset, which features have the highest individual correlations with data scientists wanting to look for new jobs?
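To make RQ 1 concrete, the following is a minimal sketch of how individual feature correlations with a binary job-change target could be ranked. It is not the method used in the paper, which this excerpt does not describe in detail. The column names (city_development_index, training_hours, target), the helper names, and the six toy rows are all assumptions made for illustration and are not taken from the Kaggle dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    With a 0/1 target this equals the point-biserial correlation.
    Returns 0.0 when either sequence has zero variance.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(rows, target_key):
    """Rank numeric features by absolute correlation with the target."""
    feature_names = [key for key in rows[0] if key != target_key]
    target = [row[target_key] for row in rows]
    scores = {
        name: pearson([row[name] for row in rows], target)
        for name in feature_names
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy records; target = 1 means "looking for a new job".
rows = [
    {"city_development_index": 0.62, "training_hours": 40, "target": 1},
    {"city_development_index": 0.92, "training_hours": 35, "target": 0},
    {"city_development_index": 0.55, "training_hours": 50, "target": 1},
    {"city_development_index": 0.90, "training_hours": 45, "target": 0},
    {"city_development_index": 0.70, "training_hours": 20, "target": 1},
    {"city_development_index": 0.95, "training_hours": 60, "target": 0},
]

ranking = rank_features(rows, "target")
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

On this toy data the city development index ranks first with a negative correlation, but that reflects only how the rows were constructed. Categorical features such as company size or company type would need to be encoded numerically before a correlation of this kind could be computed.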

Related Work
Results and System
Discussion
