Using Machine Learning to Evaluate Real Estate Prices Using Location Big Data

Walter Coleman, Ben Johann, Natasha Foutz, Nicholas Pasternak, Heman Shakeri, Jaya Vellayan

doi:10.1109/sieds55548.2022.9799393


Publication Date: Apr 28, 2022

Citations: 2

Affiliation: University of Virginia

Abstract

As more buyers enter the real estate market, accurate valuation of residential and commercial properties has become crucial. Past research has typically relied on static real estate data (e.g., number of beds, baths, square footage), sometimes combined with demographic information, to predict property prices. In this investigation, we attempted to improve upon that work by exploring a different approach: we wanted to determine whether mobile location data could improve the predictive power of popular regression and tree-based models. To prepare the data for our models, we attached the mobility data to individual properties from the real estate data by aggregating the users observed within 500 meters of each property for each day of the week. We removed people who lived within 500 meters of each property, so each property's aggregated mobility data contained only non-resident census features. In addition to these dynamic census features, we included static census features, including the number of people in the area, the average proportion of people commuting, and the number of residents in the area. Finally, we tested multiple models to predict real estate prices. Our proposed model consists of two random forest models stacked by a ridge regression that uses the random forest outputs as predictors. The first random forest used static features only, and the second used dynamic features only. Comparing the model with and without the dynamic mobile location features, we find that including them yields a 3% lower mean squared error.
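
The aggregation step described in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how distances were computed or how residents were identified, so the haversine distance, the `homes` lookup of inferred home locations, the choice to count distinct device-days, and all function and variable names here are hypothetical.

```python
import math
from collections import defaultdict
from datetime import date

EARTH_RADIUS_M = 6_371_000.0
RADIUS_M = 500.0  # catchment radius stated in the abstract


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def weekday_visitor_counts(properties, pings, homes):
    """Count distinct non-resident device-days near each property, by weekday.

    properties: {property_id: (lat, lon)}
    pings:      iterable of (device_id, date, lat, lon)
    homes:      {device_id: (lat, lon)}, an assumed inferred-home lookup
    Returns {property_id: [Mon, Tue, ..., Sun]} counts.
    """
    seen = defaultdict(set)  # property_id -> {(device_id, date)}
    for device_id, day, lat, lon in pings:
        for pid, (plat, plon) in properties.items():
            # Skip pings outside the 500 m radius of this property.
            if haversine_m(lat, lon, plat, plon) > RADIUS_M:
                continue
            # Skip devices whose (assumed) home is within 500 m: residents.
            home = homes.get(device_id)
            if home is not None and haversine_m(home[0], home[1], plat, plon) <= RADIUS_M:
                continue
            seen[pid].add((device_id, day))

    counts = {pid: [0] * 7 for pid in properties}
    for pid, visits in seen.items():
        for _, day in visits:
            counts[pid][day.weekday()] += 1  # Monday is index 0
    return counts
```

Each property then carries seven dynamic features, one per day of the week, which could sit alongside the static census features as inputs to the second random forest. The nested loop is quadratic in pings and properties; a spatial index would be the natural replacement at the scale of real mobility data.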
