Abstract

The rapid development of social media has generated a huge amount of user-generated content (UGC), which plays an important role in information sharing and fast transmission. In recent years, analyzing and gathering live social media content has attracted much research attention. The main challenges are the short, conversational nature of the text, the heterogeneity of microblog content, and the incrementally growing size of live social streams. Most existing methods search on textual information alone and ignore both the visual content and the correlation among the heterogeneous data. In this paper, we propose a microblog brand identification framework consisting of an offline relevance detection step and an online rectification step. In the first step, we train visual and textual content relevance detectors to determine the degree of relevance between a microblog and a predefined brand. To gather as many potentially brand-related microblogs as possible, we propose a max-aggregation strategy to determine the brand relevance of each microblog. In the second step, we construct a microblog similarity graph from annotated microblogs, previously classified microblogs, and testing microblogs. An edge filtering step is then applied to the graph to remove weak relations between microblogs. Finally, a graph-based regularization model is proposed to filter out noisy microblogs and optimize the classification results. Experimental comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed approach. Further evaluation shows that the proposed method, which exploits multimedia information, performs substantially better than methods that use only a single type of information.
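The two steps summarized above can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: the function names, the fixed similarity threshold, the weight `alpha`, and the use of a generic label-propagation-style update as a stand-in for the graph-based regularization model. It should not be read as the authors' method.

```python
from typing import List

Matrix = List[List[float]]


def max_aggregate(visual: List[float], textual: List[float]) -> List[float]:
    """Offline step (assumed form): keep the larger detector score per microblog."""
    return [max(v, t) for v, t in zip(visual, textual)]


def filter_edges(sim: Matrix, threshold: float) -> Matrix:
    """Drop self-loops and edges whose similarity is below a hypothetical threshold."""
    n = len(sim)
    return [
        [sim[i][j] if i != j and sim[i][j] >= threshold else 0.0 for j in range(n)]
        for i in range(n)
    ]


def propagate(initial: List[float], graph: Matrix,
              alpha: float = 0.5, iters: int = 50) -> List[float]:
    """Online step (stand-in): iterate f <- alpha * P f + (1 - alpha) * initial,
    where P is the row-normalized filtered graph."""
    n = len(initial)
    row_sums = [sum(row) for row in graph]
    scores = list(initial)
    for _ in range(iters):
        updated = []
        for i in range(n):
            if row_sums[i] > 0:
                neighbour = sum(graph[i][j] * scores[j] for j in range(n)) / row_sums[i]
            else:
                neighbour = scores[i]  # an isolated node keeps its own score
            updated.append(alpha * neighbour + (1 - alpha) * initial[i])
        scores = updated
    return scores
```

In this sketch, `max_aggregate` favors recall by letting either modality flag a microblog as brand related, while `propagate` pulls each score toward the scores of its strongly connected neighbours, so an isolated high score with no supporting neighbours is not reinforced.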
