Abstract

Community Question Answering (CQA) services, such as Yahoo! Answers, WikiAnswers, and Quora, have recently proliferated on the Internet. A large number of questions in CQA ask for information about a certain place (e.g., a city). Answering such local questions requires some local knowledge; it is therefore likely to be beneficial to treat them differently from global questions in tasks such as answer retrieval and answerer recommendation. In this paper, we address the problem of automatically identifying local questions in CQA through machine learning. The challenge is that manually labelling questions as local or global for training would be costly. Realising that many local questions can be found reliably in a few location-related categories (e.g., “Travel”), we propose to build local/global question classifiers in the framework of PU learning (i.e., learning from positive and unlabelled examples), and thus remove the need for manual labelling. In addition to standard text features of questions, we also make use of locality features extracted by the geo-parsing tool Yahoo! Placemaker. Our experiments on real-world datasets (collected from Yahoo! Answers and WikiAnswers) show that the probability estimation approach to PU learning outperforms S-EM (spy EM) and Biased-SVM for this task. Furthermore, we demonstrate that the spatial scope of a local question can be inferred accurately even if the question does not mention any place name. This is particularly helpful in a mobile environment, as users would be able to ask local questions via their GPS-equipped mobile phones without explicitly mentioning their current location and intended search radius.
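
As an illustration of the probability estimation approach to PU learning mentioned above, the following is a minimal sketch. It assumes the approach is the estimator commonly attributed to Elkan and Noto, which the abstract does not state explicitly. A classifier is first trained to separate labelled positives (here, questions from location-related categories) from unlabelled questions, giving a score g(x) that approximates p(labelled | x). The function names and the example scores are hypothetical and do not come from the paper.

```python
def estimate_label_frequency(held_out_positive_scores):
    """Estimate c = p(labelled | positive) as the mean classifier score
    g(x) over held-out labelled positive examples."""
    if not held_out_positive_scores:
        raise ValueError("need at least one held-out positive score")
    return sum(held_out_positive_scores) / len(held_out_positive_scores)


def positive_probability(score, c):
    """Convert g(x) = p(labelled | x) into an estimate of p(positive | x)
    by dividing by c, then clip the result to the range [0, 1]."""
    if c <= 0:
        raise ValueError("label frequency c must be positive")
    return min(1.0, max(0.0, score / c))


# Hypothetical scores g(x) for held-out questions known to be local.
held_out_scores = [0.4, 0.6, 0.5]
c = estimate_label_frequency(held_out_scores)

# Hypothetical score for an unlabelled question; it is treated as local
# when the corrected probability is at least 0.5.
is_local = positive_probability(0.3, c) >= 0.5
```

The only quantity estimated beyond the ordinary classifier is the constant c, which is why this family of methods needs no manually labelled negative (global) questions.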
