A Classification Analysis of Depression Experience among Residents of Metropolitan Cities and Provinces

Dong Su Lee

doi:10.48033/jss.8.4.28

Abstract

This study examined the relative importance of personal factors and health-level variables in classifying whether residents of metropolitan cities and of provinces have experienced depression. The data came from the Korea Disease Control and Prevention Agency's 2021 Community Health Survey: 4,602 records from metropolitan cities and 19,545 records from provinces. The data were analyzed in R 4.3.0 for Windows using frequency analysis, t-tests, analysis of variance, and Random Forest, a machine learning technique. The results showed no overfitting between the training and test data, and the classification performance of the Random Forest model was about 94%. The importance of the variables associated with depression experience differed between metropolitan cities and provinces. These findings suggest that policies could be made more effective by addressing the causes of depression differently in the two regions.
