Abstract
The population density of Semarang City increases every year, which raises the demand for land on which to build housing. House prices in the city vary widely with a property's specifications, so buyers need reliable price predictions to find a suitable house. This study implements and compares the performance of Multiple Linear Regression (MLR) and Random Forest Regression (RFR) models for predicting house prices in Semarang City. The research follows CRISP-DM (Cross-Industry Standard Process for Data Mining) as its data mining process. The dataset consists of 9,533 records with 8 variables, obtained by web scraping. The data are preprocessed and then used to train the models. In the evaluation stage, the performance of the two models is measured with three metrics: R-squared (reported here as prediction accuracy), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). The MLR model achieved a prediction accuracy of 61.1% with a 75%:25% split between training and testing data, while the RFR model achieved 78.4% with a 90%:10% split. RFR is therefore the better-performing of the two models. The RFR model was deployed using the Streamlit web framework, and the final result is a website that the public can use to predict house prices in Semarang City according to their chosen criteria.
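As an illustration of the three evaluation metrics named in the abstract, the following is a minimal sketch in plain Python using their standard definitions. It is not the authors' code: the function names and the sample prices are hypothetical, chosen only to show how R-squared, MSE, and RMSE are computed from actual and predicted values.

```python
import math


def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE, in the target's units."""
    return math.sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical house prices (illustrative values, not data from the study).
actual = [500.0, 750.0, 1200.0, 900.0]
predicted = [550.0, 700.0, 1100.0, 950.0]

print("MSE :", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("R^2 :", r_squared(actual, predicted))
```

An R-squared value multiplied by 100 gives a percentage of the kind reported in the abstract, while RMSE is expressed in the same units as the house prices.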
More From: Pattimura Proceeding: Conference of Science and Technology