Abstract

The real estate market is exposed to many price fluctuations because prices correlate with many variables, some of which cannot be controlled or might even be unknown. Housing prices can rise rapidly (or, in some cases, drop just as fast), yet the numerous online listings where houses are sold or rented are unlikely to be updated that often. In some cases, individuals interested in selling a house (or apartment) might add it to an online listing and forget to update the price. In other cases, individuals might deliberately set a price below the market price in order to sell the home faster, for various reasons. In this paper, we aim to develop a machine learning application that identifies opportunities in the real estate market in real time, i.e., houses that are listed at a price substantially below the market price. This program can be useful for investors interested in the housing market. We have focused on a use case considering real estate assets located in the Salamanca district of Madrid (Spain) and listed on the most relevant Spanish online site for home sales and rentals. The application is formally implemented as a regression problem that tries to estimate the market price of a house given features retrieved from public online listings. To build this application, we have performed a feature engineering stage in order to discover relevant features that allow a high predictive performance to be attained. Several machine learning algorithms have been tested, including regression trees, k-nearest neighbors, support vector machines and neural networks, identifying the advantages and drawbacks of each of them.

Highlights

  • The real estate market is rapidly evolving

  • In the end, we have considered the use of four different machine learning techniques, which belong to different categories: kernel models, geometric models, ensembles of rule-based models and neural networks

  • Quality metrics for regression have been computed using the scikit-learn application programming interface (API)
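To make the comparison described in the highlights more concrete, the following is a minimal sketch of how four model families might be fitted and scored with scikit-learn. It is not the authors' pipeline. The synthetic data, the feature set (area, rooms, floor), the hyperparameters, the choice of a random forest as the tree-based ensemble, and the three metrics shown (mean absolute error, median absolute error and R²) are all assumptions made for illustration, since the page does not list the metrics actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, median_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_listings(n=400, seed=0):
    """Generate made-up listings; NOT real data from any listing site."""
    rng = np.random.default_rng(seed)
    area = rng.uniform(40, 250, n)   # square meters (hypothetical feature)
    rooms = rng.integers(1, 6, n)    # number of rooms (hypothetical feature)
    floor = rng.integers(0, 10, n)   # floor number (hypothetical feature)
    X = np.column_stack([area, rooms, floor])
    # Price in thousands of euros, from an invented linear rule plus noise.
    y = 5.0 * area + 20.0 * rooms + 5.0 * floor + rng.normal(0, 40, n)
    return X, y


def compare_models(X, y, seed=0):
    """Fit one model per family and return regression metrics on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Hyperparameters below are illustrative defaults, not tuned values.
    models = {
        "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)),
        "knn": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
        "forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        "mlp": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=seed),
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "mae": mean_absolute_error(y_test, pred),
            "medae": median_absolute_error(y_test, pred),
            "r2": r2_score(y_test, pred),
        }
    return scores


X, y = make_synthetic_listings()
scores = compare_models(X, y)
for name, metrics in scores.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

The distance-based and gradient-based models are wrapped in a scaling pipeline because they are sensitive to feature magnitudes, whereas the tree ensemble is not.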


Summary

Introduction

The real estate market is rapidly evolving. A recent report published by MSCI, Inc. (formerly Morgan Stanley Capital International) estimates the size of the professionally managed real estate investment market at $8.5 trillion in 2017, an increase of $1.1 trillion over the previous year [1]. Different market segments, such as high-end luxury condos, evolve at different paces. An example of such differences can be observed in the evolution of the price (measured in euros per square meter) of resale houses in four different Spanish regions: Barcelona (blue), Madrid (yellow), Palma de Mallorca (red) and Lugo (green). The main force determining the value of houses is demand. With such high variability and unpredictable factors (e.g., one neighborhood deemed better or more fashionable than others), it is likely that the price of some assets will deviate from their expected value. We aim to use machine learning techniques to identify such opportunities by determining whether the price of an asset is lower than its estimated value.
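The idea of flagging an asset whose price falls below its estimated value can be sketched in a few lines. This is an illustrative assumption about how such a rule might look, not the authors' implementation: the `Listing` type, the `find_opportunities` function and the 15% discount threshold are hypothetical, and the estimated price is assumed to come from some previously trained regression model.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: str
    asking_price: float     # euros, as published in the listing
    estimated_price: float  # euros, output of a regression model (assumed given)


def find_opportunities(listings, min_discount=0.15):
    """Return (listing, discount) pairs whose asking price is at least
    `min_discount` (as a fraction) below the estimated price, with the
    largest discounts first. The threshold is a hypothetical parameter."""
    flagged = []
    for item in listings:
        if item.estimated_price <= 0:
            continue  # skip listings without a usable estimate
        discount = (item.estimated_price - item.asking_price) / item.estimated_price
        if discount >= min_discount:
            flagged.append((item, discount))
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged


# Made-up example values, for illustration only.
sample = [
    Listing("A", asking_price=400_000, estimated_price=500_000),
    Listing("B", asking_price=480_000, estimated_price=500_000),
    Listing("C", asking_price=300_000, estimated_price=600_000),
]
for item, discount in find_opportunities(sample):
    print(item.listing_id, f"{discount:.0%} below estimate")
```

In practice the threshold would need to account for the prediction error of the model, since a listing can only be called an opportunity if its discount exceeds what the model's own uncertainty could explain.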

State of the Art
Source and Description
Data Cleansing
Exploratory Data Analysis
Machine Learning Proposal
Experimental Setup
Results and Findings
Which Model Performs Best?
How Much Time Do Models Require to Train and Run?
Conclusions
Data Statement