Abstract

With the rapid development of the Internet, the volume of information available online has grown explosively, and cloud computing and big data analysis technologies built on Internet information have risen accordingly. Web pages, however, contain not only important information but also noise that is irrelevant to the subject of the page. This noise seriously affects the accuracy of information extraction, so web page information extraction technology has emerged in response and has become a research hotspot. The quality of web page text information directly affects the accuracy of later information processing and decision-making. If the information in web pages captured from the Internet can be evaluated accurately, and the extracted pages classified according to their characteristics, both the efficiency of information processing and the practical value of the information decision-making system can be improved. Motivated by practical application requirements and user-friendly operation, this paper studies the information visualization of web design based on big data. Specifically, the system designed in this paper improves the traditional template-based web information extraction method, establishes a web information extraction rule scheme combined with templates, and supports the selection of web information extraction rules and the generation of templates in a visual environment. Finally, a visualization based on t-SNE is used to verify the effectiveness of the web page information visualization algorithm designed in this paper.

