Abstract

Social media stores a significant amount of information that can be used to extract specific knowledge. The topics that arise there cover many aspects of everyday life, including urban problems. In this work, we demonstrate how to use social media texts on housing and utility problems, such as litter on the streets, graffiti on a public building, or noisy neighbours. Our aim is to develop a machine learning approach that automatically filters such citizen messages and classifies them into several categories. To achieve this, we solve a classification problem with an almost unlimited number of negative categories using the One-Class approach, and we combine data from several sources to construct a suitable text embedding by merging the outputs of a guided topic model and a pretrained deep neural BERT model. Comparison with statistics taken from the official site indicates that the distribution of posts across problem categories is similar for the districts of Saint-Petersburg.
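To illustrate the one-class filtering idea mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes a toy bag-of-words vector in place of the combined topic-model and BERT embedding, and a simple centroid-similarity rule in place of whatever One-Class model the paper uses. The names `embed`, `cosine`, `CentroidOneClass`, and the example messages are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for the real embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class CentroidOneClass:
    """Learns only from positive (on-topic) messages.

    A message is accepted when its similarity to the centroid of the
    training messages reaches a threshold derived from the training data,
    so no examples of the (practically unlimited) negative classes are needed.
    """

    def fit(self, texts, keep=0.9):
        vectors = [embed(t) for t in texts]
        self.centroid = Counter()
        for v in vectors:
            self.centroid.update(v)
        sims = sorted(cosine(v, self.centroid) for v in vectors)
        # Pick the threshold so that at least `keep` of the training
        # messages are accepted.
        self.threshold = sims[int((1 - keep) * len(sims))]
        return self

    def predict(self, text):
        return cosine(embed(text), self.centroid) >= self.threshold


# Hypothetical on-topic complaints used as the only training data.
train = [
    "litter and garbage on the street near the house",
    "graffiti on the wall of a public building",
    "noisy neighbours in the house at night",
    "garbage not collected on the street for a week",
    "broken street light near the public building",
]
model = CentroidOneClass().fit(train)

print(model.predict("garbage on the street near the building"))
print(model.predict("great football match yesterday evening"))
```

A bag-of-words centroid is easily fooled by common words, which is one reason a richer embedding, such as the combination described in the abstract, would be preferable in practice. Messages accepted by the filter could then be passed to an ordinary multi-class classifier for the problem categories.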

