Abstract
In this paper we propose a strategy for using messages posted on a blogging platform for real-time sensing of traffic-related information. Specifically, we use data from a Portuguese-language blog managed by a Brazilian daily newspaper in its online edition. We propose a framework based on two modules to infer the location and traffic condition from unstructured, non-georeferenced short posts in Portuguese. The first module performs named-entity recognition (NER). It automatically recognizes three classes of named entities (NEs) in the input post: LOCATION, STATUS and DATE. Here, a bootstrapping approach is used to expand the initial list of locations, identifying new locations as well as spelling variants and typographical errors of known locations. The second module performs relation extraction (RE). It extracts binary and ternary relations between these entities to obtain relevant traffic information. In our experiments, the NER module achieved an F-measure of 96%, while the RE module achieved 87%. The results also show that our bootstrapping approach identifies 1,058 new locations when 10,000 short posts are analyzed.
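To make the two-module idea concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a small hypothetical gazetteer (`KNOWN_LOCATIONS`) and status lexicon (`STATUS_TERMS`), and it substitutes plain fuzzy string matching from Python's standard `difflib` module for the bootstrapping approach, purely to illustrate how a misspelled location might be mapped to a known one and paired with a status to form a binary (LOCATION, STATUS) relation. All names, the example post and the similarity cutoff are illustrative assumptions.

```python
import difflib
import re

# Hypothetical seed gazetteer and status lexicon (assumptions, not from the paper).
KNOWN_LOCATIONS = ["Avenida Paulista", "Marginal Pinheiros", "Avenida Brasil"]
STATUS_TERMS = ["congestionado", "lento", "livre", "parado"]


def find_location(post, known=KNOWN_LOCATIONS, cutoff=0.85):
    """Return (matched_text, canonical_location) for the best fuzzy match, or None.

    Slides a window of the same token length as each known location over the
    post and keeps the candidate with the highest similarity ratio.
    """
    tokens = post.split()
    best = None
    for name in known:
        n = len(name.split())
        for i in range(len(tokens) - n + 1):
            candidate = " ".join(tokens[i:i + n]).strip(".,;:!?")
            ratio = difflib.SequenceMatcher(
                None, candidate.lower(), name.lower()
            ).ratio()
            if ratio >= cutoff and (best is None or ratio > best[0]):
                best = (ratio, candidate, name)
    return None if best is None else (best[1], best[2])


def find_status(post, terms=STATUS_TERMS):
    """Return the first status term that appears in the post, or None."""
    for word in re.findall(r"\w+", post.lower()):
        if word in terms:
            return word
    return None


def extract_relation(post):
    """Pair a canonical location with a status, or return None if either is missing."""
    location = find_location(post)
    status = find_status(post)
    if location is None or status is None:
        return None
    return (location[1], status)


if __name__ == "__main__":
    # "Paullista" is a deliberate typo of a known location.
    print(extract_relation("Avenida Paullista congestionado agora"))
```

A fixed similarity cutoff is the simplest possible design choice here; it can only recover variants of locations that are already listed, whereas the bootstrapping approach described above is also meant to discover locations that are entirely new.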