Abstract

Two popular types of data science analysis are spatial analysis and machine learning. The most popular spatial analysis method is geographically weighted regression (GWR), and the most popular machine learning method is random forest (classification and regression). This paper uses GWR and random forest regression (RFR) to identify the variables that significantly affect life expectancy and compares the two methods using the root mean squared error (RMSE). Understanding life expectancy and its determinants is essential because increases in life expectancy are linked to a country’s economic and social prosperity. Life expectancy serves as the dependent variable (Y), alongside several independent variables. The fitted GWR model and its significant independent variables differ at each observation location in the Sumatra region. The GWR model has an RMSE of 64.99, and three variables have a significant influence in most areas: X3 (percentage of households with proper sanitation), X5 (number of doctors), and X7 (average years of schooling). The RFR model has an RMSE of 84.04, and the same three variables (X3, X5, and X7) rank highest in variable importance.

Keywords: GWR, Life Expectancy, RFR, RMSE.
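The abstract compares the two models by RMSE, where a lower value means a closer fit. As a minimal sketch of how such a comparison could be computed, the Python below defines a hypothetical `rmse` helper and applies it to two sets of made-up predictions. The function name, variable names, and all numbers are illustrative assumptions and are not taken from the paper's data or code.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical life expectancy values in years (not from the paper).
observed = [68.0, 70.0, 72.0]
model_a_pred = [67.0, 70.0, 73.0]  # stand-in for one model's predictions
model_b_pred = [66.0, 70.0, 74.0]  # stand-in for the other model's predictions

rmse_a = rmse(observed, model_a_pred)
rmse_b = rmse(observed, model_b_pred)

# The model with the smaller RMSE fits these observations more closely.
better = "model A" if rmse_a < rmse_b else "model B"
print(f"RMSE A = {rmse_a:.3f}, RMSE B = {rmse_b:.3f}, better fit: {better}")
```

Under this reading, the reported values (64.99 for GWR and 84.04 for RFR) would favour the GWR model, since its RMSE is lower.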
