Abstract

The automatic question answering system based on semantic similarity calculation consists of three modules: a word segmentation module, a question understanding module, and an FAQ database module. The word segmentation module uses Jieba, an open-source tool. The question understanding module is further divided into question classification, keyword extraction, and keyword expansion; question classification uses a hierarchical method based on self-learned rules. The FAQ database module comprises sentence similarity calculation and question matching. Sentence similarity calculation is based on the HowNet semantic dictionary, and the core algorithm is a rule-based design derived from the corpus. Because the modules depend heavily on one another, a more complete end-to-end test scheme is difficult to establish. We therefore implement the function of each module but concentrate the evaluation on sentence similarity calculation, which ultimately determines the accuracy of question matching. The test results show that when the segmented sentences are short and contain few parts of speech, the similarity scores are relatively concentrated, because neither sentence contains the parts of speech that the system specifies. The more completely a sentence covers the parts of speech specified by the system, the more accurate the similarity calculation.
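The effect of part-of-speech coverage on the similarity score can be illustrated with a minimal sketch. It is not the system's algorithm, which the abstract does not specify. It assumes the sentences are already segmented into (word, part-of-speech) pairs, as a segmenter such as Jieba can produce. The names `SPECIFIED_POS`, `exact_match`, `group_by_pos`, `directed_score`, and `sentence_similarity` are hypothetical, the set of specified parts of speech is an assumption, and `exact_match` is a simple stand-in for a HowNet-based word similarity.

```python
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag), e.g. ("软件", "n")

# Assumed set of parts of speech used in the comparison (noun, verb, adjective).
# The abstract does not list the set the system actually specifies.
SPECIFIED_POS = ("n", "v", "a")


def exact_match(w1: str, w2: str) -> float:
    """Stand-in word similarity: 1.0 for identical words, otherwise 0.0.

    A HowNet-based measure would return graded values instead.
    """
    return 1.0 if w1 == w2 else 0.0


def group_by_pos(tokens: List[Token]) -> Dict[str, List[str]]:
    """Collect the words of each specified part of speech; ignore the rest."""
    groups: Dict[str, List[str]] = {pos: [] for pos in SPECIFIED_POS}
    for word, pos in tokens:
        if pos in groups:
            groups[pos].append(word)
    return groups


def directed_score(src: List[str], dst: List[str],
                   word_sim: Callable[[str, str], float]) -> float:
    """Average, over the words in src, of the best match found in dst."""
    return sum(max(word_sim(a, b) for b in dst) for a in src) / len(src)


def sentence_similarity(s1: List[Token], s2: List[Token],
                        word_sim: Callable[[str, str], float] = exact_match) -> float:
    """Average the per-part-of-speech similarity of two segmented sentences."""
    g1, g2 = group_by_pos(s1), group_by_pos(s2)
    scores: List[float] = []
    for pos in SPECIFIED_POS:
        a, b = g1[pos], g2[pos]
        if not a and not b:
            continue  # this part of speech is absent from both sentences
        if not a or not b:
            scores.append(0.0)  # present in only one sentence: no match
            continue
        scores.append((directed_score(a, b, word_sim)
                       + directed_score(b, a, word_sim)) / 2)
    # If neither sentence contains any specified part of speech, there is
    # nothing to compare, so every such pair receives the same score.
    return sum(scores) / len(scores) if scores else 0.0


q1 = [("如何", "r"), ("安装", "v"), ("软件", "n")]
q2 = [("怎么", "r"), ("安装", "v"), ("程序", "n")]
short1 = [("你", "r"), ("呢", "y")]
short2 = [("他", "r"), ("吗", "y")]

print(sentence_similarity(q1, q2))          # 0.5: verb matches, noun does not
print(sentence_similarity(short1, short2))  # 0.0: no specified part of speech
```

In this sketch, short sentences made up only of pronouns and particles all collapse to the same score, which mirrors the concentration of similarity judgments reported for sentences that lack the specified parts of speech. Replacing `exact_match` with a graded word similarity would separate "软件" and "程序" more finely without changing the structure.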
