Abstract

Harnessing spontaneous contributions of citizens on Social Media and networking sites is a major feature of the next-generation, citizen-led e-Participation paradigm. However, extracting information of interest from Social Media streams is a challenging task and requires support from domain-specific language resources such as lexica. This work describes our efforts to develop a Knowledge Extraction and Management component that employs a lexicon to extract information related to public services from Social Media content or streams, as part of a holistic technology infrastructure for citizen-led e-Participation. Our approach consists of three basic steps: (1) acquisition and refinement of public service catalogues, (2) organization of the public service names into a lexicon based on different semantic similarity measures, and (3) development of a dictionary-based Named Entity Recognizer (NER), or "spotter", based on the lexicon. We evaluate the performance of the NER solution, supported by contextual information generated by two well-known general-purpose NER tools (DBpedia Spotlight and Alchemy), on a dataset of tweets. Results show that our strategy for domain-specific information extraction from Social Media is effective. We conclude with a scenario showing how our approach could be scaled up to extract other types of information from citizen discussions on Social Media.

Keywords: e-Participation; Citizen-led e-Participation; Information extraction (IE); Natural Language Processing (NLP); Public services; e-Government
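
As an illustration of step (3), a dictionary-based spotter can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the component described in this work: the lexicon entries, the names `LEXICON`, `build_patterns` and `spot`, and the use of case-insensitive whole-phrase matching are all hypothetical choices made for the example.

```python
import re

# Hypothetical lexicon: canonical public service name -> surface variants.
# In the approach described above, such groupings would come from the
# catalogue and similarity steps; these entries are invented for illustration.
LEXICON = {
    "passport renewal": ["passport renewal", "renew passport", "renew my passport"],
    "waste collection": ["waste collection", "bin collection", "rubbish collection"],
}


def build_patterns(lexicon):
    """Compile one case-insensitive, whole-phrase regex per surface variant."""
    patterns = []
    for canonical, variants in lexicon.items():
        for variant in variants:
            regex = re.compile(r"\b" + re.escape(variant) + r"\b", re.IGNORECASE)
            patterns.append((canonical, regex))
    return patterns


def spot(text, patterns):
    """Return (start, end, canonical_name) for every lexicon match, by position."""
    hits = []
    for canonical, regex in patterns:
        for match in regex.finditer(text):
            hits.append((match.start(), match.end(), canonical))
    return sorted(hits)


PATTERNS = build_patterns(LEXICON)

if __name__ == "__main__":
    tweet = "Why is bin collection late again? Also need to renew my passport."
    for start, end, name in spot(tweet, PATTERNS):
        print(f"{tweet[start:end]!r} -> {name}")
```

Mapping every surface variant to a canonical service name keeps the spotter simple; a fuller system would presumably add the contextual signals from general-purpose NER tools mentioned in the abstract.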

