Abstract

Web content mining covers the classification, clustering, and attribute analysis of large numbers of text documents and multimedia files on the web. Specific tasks include retrieving data through Internet search engine tools and the structured processing and analysis of web data. Web log analysis currently raises security concerns, and we conduct experiments to investigate its security performance. From these experiments we draw the following conclusions. (1) Web log mining, one of the most critical parts of the web mining field, can apply efficient data mining algorithms to systematically extract logs from web servers, determine the main access types or interests of users, and then, to a certain extent, analyze users' access settings, preferences, and behavior based on the discovered usage patterns. (2) On both the test set and the mixed test set, the curve value of deep mining is very stable and remains at 0.95, whereas the curve values of the fuzzy statistics method and the quantitative statistics method stay within the interval 0.90-0.95. The results also show that the data mining method has the highest identification accuracy and the best security performance. (3) Web usage analysis requires data abstraction for pattern discovery. This abstraction is achieved through data preprocessing, which covers the different formats of web server log files and the way web server log data is prepared for web usage analysis.
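As an illustration of the preprocessing step described in conclusion (3), the following is a minimal sketch and not the method used in the paper. It assumes that the server writes logs in the Common Log Format, that a user can be approximated by the client host, and that cleaning means dropping unparseable lines, failed requests, and static resources. The names `parse_line`, `preprocess`, `pages_per_user`, `STATIC_SUFFIXES`, and `SAMPLE_LOG` are hypothetical and introduced only for this example.

```python
import re
from collections import Counter, defaultdict

# Assumed input format (Common Log Format):
# host ident authuser [date] "method path protocol" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: requests for these file types say little about user interests.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def preprocess(lines):
    """Data cleaning: keep only successful requests for non-static pages."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # unparseable line
        if not record["status"].startswith("2"):
            continue  # failed or redirected request
        if record["path"].lower().endswith(STATIC_SUFFIXES):
            continue  # static resource
        records.append(record)
    return records


def pages_per_user(records):
    """Simple usage pattern: how often each host requested each page."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["host"]][record["path"]] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /news/sports HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2023:13:55:37 +0000] "GET /static/site.css HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:13:56:01 +0000] "GET /news/sports HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2023:13:57:12 +0000] "GET /missing HTTP/1.1" 404 0',
    'malformed line',
]

if __name__ == "__main__":
    cleaned = preprocess(SAMPLE_LOG)
    for host, pages in pages_per_user(cleaned).items():
        print(host, pages.most_common(1))
```

Identifying users by host is a simplification; real preprocessing pipelines often also use the user agent and session timeouts, which this sketch leaves out.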

