An Analysis of Factors Influencing College Students' Satisfaction with Their Department Using Logistic Regression and Random Forest Methods

Chungwon Ha, Seunghee Lee

doi:10.26734/jfe.2022.12.02.01

Abstract

This study analyzes the major factors affecting college students' satisfaction with their departments, using machine learning methods, in order to provide baseline evidence for college career guidance and for policies and systems aimed at preventing dropout. To this end, data on 1,298 students at four-year colleges from the Korean Education & Employment Panel (KEEP) were analyzed with two machine learning methods: logistic regression and random forest. The main results are as follows. First, in the year of college admission, explanatory variables related to the high school enrollment period and to career plans after high school graduation, in addition to variables related to college life, accounted for a significant proportion of the top 10 variables by importance. In the period excluding the year of admission and the year immediately before graduation, variables related to learning in the major and to career activities were the important ones. In the year immediately before graduation, activity variables such as job preparation and education and training experience recorded high importance in both the logistic regression and the random forest results. Second, the agreement between the two methods on the top 10 variables by grade level was 63.3%. Third, unlike in logistic regression, in the random forest analysis the explanatory variables that survey respondents answered on multi-point scales appeared among the top 10 variables by importance in relatively many cases. The study is significant in that it applied two machine learning methods, rather than a single method, to educational panel data and compared the results by deriving the factors common to both.
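The abstract does not say how the agreement between the two methods was computed. As a minimal sketch of one plausible reading, the snippet below counts the variables that appear in both top-k lists for each period and divides by the total number of top-k slots. Under that reading, a figure of 63.3% would be consistent with, for example, 19 shared variables out of 30 (three periods with ten slots each), but this is an assumption, not something the abstract states. The function names `top_k` and `agreement_rate`, the use of absolute score values for ranking, and the toy scores are all hypothetical and are not taken from the study.

```python
def top_k(importance, k=10):
    """Return the k variable names with the largest absolute importance.

    `importance` maps a variable name to a score, e.g. a logistic
    regression coefficient or a random forest feature importance.
    Ranking by absolute value is an assumption made for this sketch.
    """
    ranked = sorted(importance, key=lambda name: abs(importance[name]), reverse=True)
    return ranked[:k]


def agreement_rate(periods, k=10):
    """Share of top-k slots filled by variables that both methods selected.

    `periods` is a list of (lr_scores, rf_scores) pairs, one per period.
    """
    shared = 0
    total = 0
    for lr_scores, rf_scores in periods:
        shared += len(set(top_k(lr_scores, k)) & set(top_k(rf_scores, k)))
        total += k
    return shared / total


# Toy data only: two periods, three made-up variables, k = 2.
toy_periods = [
    ({"a": 0.9, "b": -0.8, "c": 0.1}, {"a": 0.5, "b": 0.1, "c": 0.4}),
    ({"a": 0.2, "b": 0.7, "c": -0.6}, {"a": 0.1, "b": 0.6, "c": 0.3}),
]
print(agreement_rate(toy_periods, k=2))
```

In the toy data the two methods share one variable in the first period and two in the second, so three of the four top-2 slots agree.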
