Abstract

Address matching is a crucial step in geocoding, yet it is also a bottleneck for geocoding accuracy, because perfect matches depend on precise input. Matches must still be established despite the inevitability of incorrect address inputs such as misspellings, abbreviations, informal and non-standard names, slang, or coded terms. This study therefore proposes an address geocoding system that uses machine learning to improve address matching for street-based addresses. Three machine learning methods are tested to identify the one with the highest accuracy. The performance of address matching with the machine learning models is compared with multiple text similarity metrics, which are commonly used for word matching. The results showed that extreme gradient boosting with optimal hyper-parameters was the most accurate machine learning method in the address matching process, and that it outperformed the similarity metrics whether training data or input data were used. The address matching process using machine learning achieved high accuracy and can be applied to any geocoding system to precisely convert addresses into geographic coordinates for various research and applications, including car navigation.

Highlights

  • Addresses are one of several ways in which people describe location in textual natural language

  • We evaluate the proposed address matching, in particular the performance of three machine learning models in the address matching process, by comparing them with multiple text similarity metrics that appear in previous studies on text matching [18]

  • Whereas the accuracy of address matching with or without similarity metrics is relatively low to obtain meaningful units of addresses to send them to the matching process



Introduction

Addresses are one of several ways in which people describe location in textual natural language. In urban areas, they are used to communicate and reference a spatial location through direct and indirect methods [2,3]. Since these addresses serve as a link to locate demographic, social, economic, or environmental attributes, a Geographic Information System (GIS) proves to be a useful tool across application domains. Geocoding supports an expanded view of addresses, including not just structured hierarchical definitions of locations but also building names, postal codes, and telephone area codes [3]. Geocoding is now widely used. We evaluate the proposed address matching, in particular the performance of three machine learning models in the address matching process, by comparing them with multiple text similarity metrics that appear in previous studies on text matching [18].
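To make the idea of a text similarity baseline concrete, the sketch below matches a noisy input address against a small list of reference addresses using normalized Levenshtein (edit) distance. This is only an illustration under stated assumptions: the excerpt does not name the specific similarity metrics the study used, and the function names, the 0.8 threshold, and the sample addresses are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts,
    deletes, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(query, references, threshold=0.8):
    """Return the most similar reference address, or None when no
    candidate reaches the (assumed, illustrative) threshold."""
    if not references:
        return None
    score, reference = max((similarity(query, r), r) for r in references)
    return reference if score >= threshold else None


# Hypothetical reference addresses, for illustration only.
REFERENCES = ["123 Main Street", "45 Oak Avenue"]
print(best_match("123 Mian Street", REFERENCES))
```

A fixed threshold on a single metric is the kind of baseline that a learned classifier would be compared against; a misspelling such as "Mian" still matches here, but abbreviations or informal names can easily fall below any one cut-off.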

General Steps for Geocoding
Geocoding Algorithm Using Machine Learning Techniques
Address Parsing
Address Matching Using Machine Learning
Random
Address Locating
Experimental Evaluation
Experimental Case 3
Method
Comparison of the Results of Three Experimental Cases
Findings
Conclusions