Abstract
Chinese Address Segmentation (CAS) is a crucial step that can greatly improve the accuracy and reliability of geocoding. It is challenging, however, because Chinese addresses lack obvious word boundaries and have complex grammatical and semantic features. To address this challenge, we propose a novel CAS method that starts from scratch, without relying on any prior knowledge about Chinese addresses. Instead, it dynamically grows its knowledge library by leveraging contextual information and comparing addresses with one another while dividing them into address elements. Our approach relies on neither Chinese-language nor address-element dictionaries, and it does not depend on address statistics. The knowledge library is extracted automatically and organized in a tree data structure. This allows our method to segment addresses from any area of China, including regions with intricate address expressions, such as the Inner Mongolia Autonomous Region. Experimental results demonstrate that our method achieves high precision in address segmentation.
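As an illustration only, the following Python sketch shows one way a tree-shaped knowledge library could grow from already-segmented addresses and then be used to split a new address. The abstract does not specify the actual algorithm, so the names (`KnowledgeTree`, `learn`, `segment`, `split_by_comparison`), the greedy longest-match rule, and the common-prefix comparison are all assumptions made for this sketch, not the authors' method.

```python
from dataclasses import dataclass, field
from os.path import commonprefix  # character-wise common prefix of strings


@dataclass
class ElementNode:
    """One address element; children are the elements seen directly under it."""
    children: dict = field(default_factory=dict)


class KnowledgeTree:
    """Hypothetical knowledge library: a tree of address elements, initially empty."""

    def __init__(self):
        self.root = ElementNode()

    def learn(self, elements):
        """Add one segmented address (a list of elements) as a root-to-leaf path."""
        node = self.root
        for element in elements:
            node = node.children.setdefault(element, ElementNode())

    def segment(self, address):
        """Greedy sketch: at each level take the longest known child that
        prefixes the remaining text; keep any unknown remainder as one piece."""
        node, rest, out = self.root, address, []
        while rest:
            matches = [e for e in node.children if rest.startswith(e)]
            if not matches:
                out.append(rest)
                break
            best = max(matches, key=len)
            out.append(best)
            rest = rest[len(best):]
            node = node.children[best]
        return out


def split_by_comparison(a, b):
    """Assumed comparison step: the shared prefix of two addresses hints at
    a boundary between the common part and the differing remainders."""
    shared = commonprefix([a, b])
    return shared, a[len(shared):], b[len(shared):]


tree = KnowledgeTree()
tree.learn(["内蒙古自治区", "呼和浩特市", "新城区"])
known = tree.segment("内蒙古自治区呼和浩特市新城区")
partly_new = tree.segment("内蒙古自治区呼和浩特市回民区")
compared = split_by_comparison("内蒙古自治区呼和浩特市新城区",
                               "内蒙古自治区呼和浩特市回民区")
```

In this sketch the tree starts empty and only contains what `learn` has been given, which mirrors the "starts from scratch" idea at a toy scale. A real system would also need to decide how boundaries found by comparison are fed back into the tree, which the abstract does not describe.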