Abstract
Chinese address element segmentation is a fundamental step in geocoding, and its results directly affect the accuracy and certainty of geocoding. However, Chinese text lacks explicit word boundaries, and its grammatical and semantic features are complicated. Together with the diversity and complexity of Chinese address expressions, this makes the segmentation of Chinese address elements a substantial challenge. Therefore, this paper proposes a method of Chinese address element segmentation based on a bidirectional gated recurrent unit (Bi-GRU) neural network. The method uses the Bi-GRU neural network to generate tag features on top of Chinese word segmentation and then uses the Viterbi algorithm to perform tag inference, which yields the segmentation of Chinese address elements. The neural network model is trained and verified on point of interest (POI) address data and partial directory data from the Baidu map of Beijing. The results show that the method is superior to previous neural network models in terms of segmentation performance and efficiency.
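The tag inference step mentioned in the abstract can be illustrated with a minimal Viterbi sketch. It is not the paper's implementation. It assumes a BMES tag scheme (begin, middle, end, single) applied per character, hard transition constraints instead of learned transition scores, and hand-written log-scores in `EMISSIONS` standing in for the Bi-GRU output. The names `viterbi`, `segment`, `ALLOWED`, `TEXT`, and `EMISSIONS` are hypothetical.

```python
TAGS = ["B", "M", "E", "S"]
NEG_INF = float("-inf")

# Assumed BMES constraints: which tags may follow each tag.
ALLOWED = {"B": {"M", "E"}, "M": {"M", "E"}, "E": {"B", "S"}, "S": {"B", "S"}}
START = {"B", "S"}  # tags that may open a sequence
END = {"E", "S"}    # tags that may close a sequence


def viterbi(emissions):
    """Return the best valid tag path for a list of {tag: log-score} dicts."""
    if not emissions:
        return []
    # best[t][tag]: best score of any valid path that ends in `tag` at step t
    best = [{t: (emissions[0][t] if t in START else NEG_INF) for t in TAGS}]
    back = []  # back[t - 1][tag]: predecessor of `tag` on the best path
    for step in range(1, len(emissions)):
        scores, pointers = {}, {}
        for tag in TAGS:
            prev_score, prev_tag = max(
                (best[-1][prev], prev) for prev in TAGS if tag in ALLOWED[prev]
            )
            scores[tag] = prev_score + emissions[step][tag]
            pointers[tag] = prev_tag
        best.append(scores)
        back.append(pointers)
    # Pick the best closing tag, then follow the back-pointers.
    path = [max(END, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def segment(text, tags):
    """Cut `text` after every character tagged E or S."""
    elements, current = [], ""
    for char, tag in zip(text, tags):
        current += char
        if tag in END:
            elements.append(current)
            current = ""
    if current:
        elements.append(current)
    return elements


# Illustrative address fragment and made-up log-scores (not model output).
TEXT = "北京市海淀区"
EMISSIONS = [
    {"B": -0.1, "M": -3.0, "E": -3.0, "S": -2.5},
    {"B": -3.0, "M": -1.0, "E": -3.0, "S": -0.8},  # greedy choice would be S
    {"B": -3.0, "M": -3.0, "E": -0.1, "S": -2.5},
    {"B": -0.1, "M": -3.0, "E": -3.0, "S": -2.5},
    {"B": -3.0, "M": -0.2, "E": -2.0, "S": -3.0},
    {"B": -3.0, "M": -3.0, "E": -0.1, "S": -2.5},
]

if __name__ == "__main__":
    tags = viterbi(EMISSIONS)
    print(tags)
    print(segment(TEXT, tags))
```

In this toy input, picking the highest-scoring tag independently at each position would produce the sequence B S E B M E, which is invalid under the assumed constraints. Decoding over the whole sequence avoids this, which is the usual motivation for pairing a recurrent tagger with Viterbi inference.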
Highlights
With the rapid development of technologies such as the internet and big data and the emergence of location-based services [1], the public’s demand for location data is increasing rapidly.
The results of the segmentation of the address elements and the training times of the bidirectional gated recurrent unit (Bi-GRU), bidirectional long short-term memory (Bi-LSTM), GRU, and LSTM neural networks for different inputs are shown in Tables 7 and 8.
The bold font in the first row marks the shorter training time of the two bidirectional neural networks (Bi-GRU and Bi-LSTM) for each input model.
Summary
With the rapid development of technologies such as the internet and big data and the emergence of location-based services [1], the public’s demand for location data is increasing rapidly. Geocoding has a wide range of applications in urban spatial positioning and spatial analysis, such as disaster emergency response and disaster management [5], disease investigation and prevention [6], and crime scene location [7]. It converts textual addresses into spatial coordinates through address element segmentation, address standardization, address matching, and spatial positioning [8].