Abstract

To improve Internet public opinion monitoring at colleges, this paper designs an efficient Internet public opinion system for college students. The system uses a distributed web crawler to collect structured information from news websites, social networking sites, bulletin board systems (BBS), and blogs that are widely visited by college students. To speed up text clustering, the Single-Pass algorithm is parallelized with Spark, and the parallel version is used to find hot topics in the collected texts. Distributed processing significantly improves clustering efficiency. Simulation results indicate that the proposed system achieves timely and accurate public opinion monitoring.
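For readers unfamiliar with Single-Pass clustering, the following is a minimal sequential sketch of the general technique, not the paper's implementation. The whitespace tokenizer, the term-frequency vectors, the cosine similarity measure, the summed-count centroid, and the threshold value of 0.3 are all illustrative assumptions; the abstract does not specify them, nor does it describe how the algorithm is partitioned across Spark workers.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector (assumption: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(texts, threshold=0.3):
    """Assign each text, in arrival order, to the most similar existing
    cluster, or open a new cluster if no similarity reaches the threshold.
    The threshold is a hypothetical value chosen for illustration."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, text in enumerate(texts):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Centroid kept as summed term counts (an assumption).
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters


if __name__ == "__main__":
    docs = [
        "campus exam schedule released",
        "exam schedule campus update",
        "football match tonight",
    ]
    for cluster in single_pass(docs):
        print(cluster["members"])
```

Each document is compared against every existing cluster, so the cost grows with the number of clusters. That comparison step is the natural candidate for distribution, although the abstract does not say which part of the algorithm the Spark version parallelizes.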
