Abstract

This paper proposes a Chinese address resolution method based on conditional random fields (CRFs). The method divides address resolution into two subproblems, address segmentation and address component annotation, and combines the segmentation labels with the component labels to form the annotated output sequences. Taking into account the composition and usage habits of Chinese addresses, features helpful to address resolution are defined. An address corpus and a corresponding feature template are then constructed, and a conditional model suited to non-standard Chinese addresses is obtained by training a CRF. Finally, the trained model is evaluated experimentally on a test set. The results show that the annotation accuracy of the CRF model exceeds 80%, which basically meets the requirements of address matching and gives the method a certain practical value.
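As an illustration of how segmentation and component annotation can be merged into a single output sequence, the following minimal sketch assigns each character one tag that encodes both its position in a segment and the segment's component type. The B/I prefix scheme, the component names (PROV, CITY, ROAD, NUM), the sample address, and the helper names `to_char_labels` and `char_features` are assumptions made for this example; the abstract does not specify the tag set or the feature template actually used.

```python
from typing import Dict, List, Tuple


def to_char_labels(segments: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (segment, component) pairs into per-character (char, tag) pairs.

    Each tag joins a position marker (B = first character of a segment,
    I = any later character) with the component type, so one label
    sequence carries both the segmentation and the annotation.
    """
    labeled = []
    for text, component in segments:
        for i, ch in enumerate(text):
            prefix = "B" if i == 0 else "I"
            labeled.append((ch, f"{prefix}-{component}"))
    return labeled


def char_features(chars: List[str], i: int) -> Dict[str, str]:
    """Hypothetical context-window features for the character at index i."""
    return {
        "cur": chars[i],
        "prev": chars[i - 1] if i > 0 else "<BOS>",
        "next": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
    }


# Hypothetical sample address, already segmented and annotated by hand.
sample = [("浙江省", "PROV"), ("杭州市", "CITY"), ("文三路", "ROAD"), ("90号", "NUM")]
pairs = to_char_labels(sample)
chars = [ch for ch, _ in pairs]
features = [char_features(chars, i) for i in range(len(chars))]
```

Sequences of such per-character features and tags are the kind of input a CRF trainer would consume; recovering the segments from a predicted tag sequence then amounts to starting a new segment at every B- tag.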

