Construction of a Semi-Automated Model for FAQ Retrieval via Short Message Service

Amit Agarwal, Ankush Mittal, Bhumika Gupta, Gaurav Bhatt

doi:10.1145/2838706.2838717

Abstract

Mobile phones are currently one of the most widespread media for communicating information of any kind to the general public. As one of the fastest-spreading technologies, reaching even the remotest areas, this highly sought-after resource is finding applications in areas such as healthcare, education, banking and internet crime. The Short Message Service (SMS) on mobile phones can therefore serve as an efficient tool for retrieving answers to Frequently Asked Questions (FAQs) in multiple domains. This use of text messages is practical only if the limitations caused by the large amount of noise in SMS text can be overcome. The solution proposed in this paper denoises the text using a similarity measure that aggregates results from prefix matching, suffix matching and a similarity ratio. To refine these results further, supervised machine learning using the Naive Bayes theorem on an N-gram Markov model is applied: a training database of FAQs from various domains is used to compute the probabilities of consecutive occurrence of word bigrams. The corrected query is then matched against the FAQ corpus using set operations such as intersection and minus to retrieve the questions closest to it. To demonstrate its accuracy, the proposed algorithm was evaluated on a set of queries collected from mobile phone users, and the results were compared with those of existing methodologies.
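
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The equal weighting of the prefix, suffix and ratio terms, the add-one smoothing of the bigram probabilities, the use of `difflib.SequenceMatcher` for the similarity ratio, the intersection-minus-difference retrieval score, the three-question corpus and every function name are assumptions made purely for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy FAQ corpus (not from the paper).
FAQS = [
    "how to open a bank account",
    "what is the interest rate on savings",
    "how to block a lost card",
]

def common_prefix_len(a, b):
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def similarity(token, word):
    """Aggregate prefix match, suffix match and a similarity ratio.
    The equal weighting is an assumption, not the paper's formula."""
    prefix = common_prefix_len(token, word)
    suffix = common_prefix_len(token[::-1], word[::-1])
    ratio = SequenceMatcher(None, token, word).ratio()
    longest = max(len(token), len(word))
    return (prefix + suffix) / (2 * longest) + ratio

def bigram_counts(questions):
    """Count unigrams and word bigrams, with a start marker per question."""
    unigrams, bigrams = Counter(), Counter()
    for q in questions:
        words = ["<s>"] + q.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """P(word | prev) with add-one smoothing (an assumed choice)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def denoise(sms, questions, top_k=3):
    """Replace each noisy token by the vocabulary word that maximises
    similarity times bigram probability given the previous correction."""
    unigrams, bigrams = bigram_counts(questions)
    vocab = sorted(w for w in unigrams if w != "<s>")
    prev, corrected = "<s>", []
    for token in sms.lower().split():
        # Shortlist by string similarity, then rerank with the bigram model.
        shortlist = sorted(vocab, key=lambda w: similarity(token, w),
                           reverse=True)[:top_k]
        best = max(shortlist, key=lambda w: similarity(token, w)
                   * bigram_prob(prev, w, unigrams, bigrams, len(vocab)))
        corrected.append(best)
        prev = best
    return corrected

def retrieve(corrected, questions):
    """Pick the question with the largest |Q & W| - |Q - W|, where Q is
    the set of query words and W the set of question words (assumed score)."""
    query = set(corrected)
    def score(question):
        words = set(question.lower().split())
        return len(query & words) - len(query - words)
    return max(questions, key=score)

corrected = denoise("hw to opn a bnk acount", FAQS)
print(corrected)
print(retrieve(corrected, FAQS))
```

In this sketch the string similarity proposes a shortlist of candidate words for each noisy token, and the bigram probability breaks ties between candidates that look alike but fit the preceding word differently.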

Full Text