Abstract

To process large volumes of network data and discover hot news according to users' interests and topics, this work studies hot news discovery algorithms and proposes a two-layer text clustering model that combines a density-based clustering strategy with the Single-pass strategy. Because the volume of network data is very large, the DBSCAN algorithm is first used to cluster the data from each single crawl into small-scale micro-clusters. The Single-pass strategy then performs incremental clustering on these micro-clusters to form topic clusters. For hot news discovery, the attention that the media and users pay to a topic is combined in a model, from which a heat quantification formula is derived. Building on this research, a network hot topic detection model is designed and implemented using web crawler, news discovery, and hot news discovery techniques. A comparison of the two-layer model with the traditional Single-pass strategy verifies the feasibility of the two-layer model, and the discovery of network hot news is realized.
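The two-layer idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dbscan`, `single_pass`, `two_layer`), the bag-of-words vectors, the cosine distance, the thresholds, the use of centroids to represent micro-clusters, and the choice to drop DBSCAN noise points are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Mean vector of a group of sparse vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)  # Counter.update adds the weights term by term
    return {term: w / len(vectors) for term, w in total.items()}

def dbscan(vectors, eps, min_pts):
    """Layer 1: plain DBSCAN on cosine distance. Returns labels, -1 = noise."""
    n = len(vectors)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n)
                if 1.0 - cosine(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels

def single_pass(items, threshold):
    """Layer 2: join the most similar topic, or open a new one."""
    topics, assignment = [], []
    for vec in items:
        best, best_sim = None, 0.0
        for idx, topic in enumerate(topics):
            sim = cosine(vec, topic["center"])
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= threshold:
            topics[best]["members"].append(vec)
            topics[best]["center"] = centroid(topics[best]["members"])
        else:
            topics.append({"members": [vec], "center": vec})
            best = len(topics) - 1
        assignment.append(best)
    return assignment

def two_layer(batches, eps=0.5, min_pts=2, threshold=0.5):
    """Cluster each crawl batch with DBSCAN, then merge the micro-cluster
    centroids into topics with Single-pass. Noise points are dropped here
    (an assumption of this sketch)."""
    micro = []
    for batch in batches:
        labels = dbscan(batch, eps, min_pts)
        for c in sorted(set(labels) - {-1}):
            micro.append(centroid([v for v, l in zip(batch, labels) if l == c]))
    return micro, single_pass(micro, threshold)

# Toy data: two crawl batches of hypothetical term-frequency vectors.
batch1 = [{"flood": 2, "rain": 1}, {"flood": 1, "rain": 2},
          {"election": 2, "vote": 1}, {"election": 1, "vote": 2}]
batch2 = [{"flood": 3, "rain": 3}, {"flood": 2, "rain": 2, "river": 1}]
micro_clusters, topic_ids = two_layer([batch1, batch2])
```

In this sketch, Single-pass compares each new item only against one centroid per existing topic, and it runs over micro-clusters rather than individual documents, so the incremental layer handles far fewer items than a Single-pass run over the raw crawl would.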
