Abstract

Background/Objectives: As big data is used more widely across fields, the analysis of social big data from social media is growing rapidly. This study proposes a method that applies text clustering to analyze, by related topic, texts extracted from social big data through text mining.

Methods/Statistical analysis: R was used for data collection and analysis, and social big data was collected from Twitter. Clustering models applicable to related-topic analysis of Twitter text were compared, one was selected, and text clustering was performed. The text clustering workflow involves generating a corpus, grouping similar entities from the term-document matrix with sparse words removed, and analyzing the result through a cluster dendrogram.

Findings: Text clustering eases the difficulty of analyzing word associations and topics with text mining methods such as word clouds. In particular, for related-topic analysis of social big data, a hierarchical clustering model based on cosine similarity was more suitable than a non-hierarchical model for identifying which terms in tweets are associated with each other. In addition, the cluster dendrogram was found to be effective for analyzing text context, because the visualization repeatedly groups similar texts into clusters.

Improvements/Applications: This study can be used to confirm the ideas and opinions of various participants using social big data, and to analyze more precisely the complex relationship between social phenomena and the prediction of social problems.
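The workflow summarized in the abstract (term-document matrix, sparse-term removal, cosine-based hierarchical clustering) can be illustrated with a minimal sketch. The study itself used R; the Python code below is not the authors' implementation. The sample tweets, the function names, the 0.6 sparsity threshold, and the choice of average linkage are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical sample tweets; the study's actual Twitter data is not reproduced here.
TWEETS = [
    "big data analysis with text mining",
    "text mining and big data clustering",
    "social media opinion on big data",
    "social media opinion trends",
    "clustering text with dendrogram",
]

def term_document_matrix(docs):
    """Return {term: [count in doc 0, count in doc 1, ...]}."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in vocab}

def remove_sparse_terms(tdm, max_sparsity):
    """Keep terms that are absent from at most max_sparsity of the documents."""
    return {
        term: row
        for term, row in tdm.items()
        if sum(1 for v in row if v == 0) / len(row) <= max_sparsity
    }

def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def hierarchical_clustering(tdm):
    """Average-linkage agglomerative clustering of terms.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram draws.
    """
    clusters = [(term,) for term in tdm]
    merges = []

    def linkage(a, b):
        total = sum(cosine_distance(tdm[x], tdm[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        pairs = [
            (linkage(a, b), a, b)
            for i, a in enumerate(clusters)
            for b in clusters[i + 1:]
        ]
        dist, a, b = min(pairs)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, dist))
    return merges

tdm = remove_sparse_terms(term_document_matrix(TWEETS), max_sparsity=0.6)
merges = hierarchical_clustering(tdm)
for a, b, dist in merges:
    print(f"{dist:.3f}  {' '.join(a)}  +  {' '.join(b)}")
```

Each printed row is one merge step, closest pairs first. Terms that occur in exactly the same tweets merge at a distance near zero, which is how a dendrogram exposes word associations that a word cloud cannot show.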
