Abstract

Social network events bear on the national economy and people’s livelihoods, so the timely perception and processing of massive social network event data is becoming increasingly important for public opinion analysis. Fully exploiting the access characteristics of event information when storing and managing social network events is of great significance for accurate, real-time query analysis. We propose a social network event storage management method based on microblog text clustering and hot/cold data classification. First, for microblog text data, we construct a keyword provenance graph, using information entropy to measure the weight of the edges between keyword nodes. Then, we cluster events using provenance-based community partition (PCP) with local modularity to improve event clustering accuracy. In addition, we further filter noisy data via incremental clustering, enable hot/cold classification and dynamic migration of event data, and compress cold data to save space on a hybrid storage architecture. Experimental results show that clustering purity can exceed 93% and that query time can be reduced by more than 70% using the clustering and hybrid storage policy.
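
The purity figure reported above is a standard external clustering metric: each cluster is credited with the size of its most frequent ground-truth label, and the credits are summed and divided by the total number of items. The abstract does not say how the authors computed it, so the following is only a minimal sketch of the textbook definition. The function name `clustering_purity` and the toy event labels are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def clustering_purity(cluster_ids, true_labels):
    """Fraction of items whose true label is the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("purity is undefined for an empty input")

    # Count ground-truth labels separately inside each assigned cluster.
    labels_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        labels_by_cluster[cluster][label] += 1

    # Credit each cluster with the size of its dominant label.
    majority_total = sum(max(counts.values()) for counts in labels_by_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data (not from the paper): six posts, two clusters, two events.
clusters = [0, 0, 0, 1, 1, 1]
events = ["quake", "quake", "flood", "flood", "flood", "flood"]
purity = clustering_purity(clusters, events)
print(f"purity = {purity:.3f}")
```

Purity rises trivially as the number of clusters grows (one item per cluster gives a purity of 1.0), so it is usually most informative when read alongside the number of clusters produced.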
