Abstract

High-tech industry employees are among the most talented groups in the workforce and are therefore difficult to recruit and retain. We analyze reviews submitted by employees of five technology companies. Following the Cross-Industry Standard Process for Data Mining (CRISP-DM) and the data science life cycle process, we apply machine learning techniques to these reviews. Our goal is to predict an overall measure of whether employees are satisfied, using other information from the reviews, such as employee attitudes towards upper management. We also use predictive analysis to determine which features are most helpful in determining an employee’s overall job satisfaction. Finally, we analyze which prediction algorithm provides the most accurate predictions. In the holdout sample, we correctly identify 97.4% of the true positives and 72.5% of the true negatives.

Highlights

  • The AUC is the area under the receiver operating characteristic (ROC) curve, which illustrates the tradeoff between false positives and true positives across different values of the cutoff threshold

  • A value of the AUC close to one means that one can identify a large percentage of the true positives, while suffering only a small number of false positives

  • The confusion matrix indicates the overall number of false negatives (FN), true negatives (TN), false positives (FP), and true positives (TP) in the holdout sample
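The quantities named in these highlights can be illustrated with a minimal sketch. The labels, scores, and 0.5 cutoff below are hypothetical toy values, not data or results from the study, and the function names are our own. The sketch counts TP, FP, TN, and FN at one cutoff, derives the true positive and true negative rates, and computes the AUC as the share of positive/negative pairs in which the positive example receives the higher score (ties count one half), which is equivalent to the area under the ROC curve.

```python
from typing import List, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) when predicting 1 for scores >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate).

    Assumes at least one positive and one negative example are present.
    """
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos: List[float] = [s for y, s in zip(y_true, scores) if y == 1]
    neg: List[float] = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical holdout sample: 1 = satisfied, 0 = not satisfied.
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.65, 0.4, 0.6, 0.3, 0.2]

tp, fp, tn, fn = confusion_counts(y_true, scores, threshold=0.5)
tpr, tnr = rates(tp, fp, tn, fn)
area = auc(y_true, scores)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"TPR={tpr:.3f} TNR={tnr:.3f} AUC={area:.3f}")
```

Raising the cutoff trades true positives for fewer false positives, which is the tradeoff the ROC curve traces. The AUC summarizes that tradeoff over all cutoffs at once.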


Summary

Introduction

Datasets from open sources tend to be anonymous, which gives employees more freedom to share their real feelings about their places of work without the risk of losing their jobs. Because the reviews are typically anonymous, they often contain a large amount of genuine, in-depth information. These datasets allow researchers to analyze employee opinions in depth, using various techniques. Reviews are submitted by various groups of employees, who differ by type of job (e.g., programmers or managers), status (e.g., current or former employees), rank, and whether they post anonymously. Such datasets allow more in-depth analysis of what employees think about the companies they work for, which can help companies improve the recruiting and retention of these employees. We use machine learning techniques to analyze tech industry employees’ reviews to find out whether employees’ overall job satisfaction (as measured by star ratings) can be predicted from other data available in their reviews, such as star ratings of senior management and written comments.
