Abstract

A web crawler follows hyperlinks on the Sina Weibo platform to collect text containing university English internet terms. To process the microblog text, this paper proposes enriching the text feature representation with multilingual expressions produced by machine translation and applying the enriched features to clustering to improve the clustering results. K-means, DBSCAN, and hierarchical clustering are compared using internal and external evaluation indexes, and the comparison determines which clustering algorithm is used to analyze changes in university English internet terms. Taking the English internet term “hand” as an example, the heat of this term fluctuates from January to June 2023, with peaks in January and February and a peak heat of about 10,500 in February. Spatially, the heat of “hand” is high in the eastern and southern regions and low in the western and northern regions; in February, however, as people return to their hometowns for the Lunar New Year, the northern region surpasses the eastern and southern regions with a heat level of 11086.77.
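
As a rough illustration of what comparing clusterings by internal and external evaluation indexes can look like, the sketch below computes one index of each kind in plain Python: the mean silhouette coefficient (internal, needs only the points and their cluster labels) and the Rand index (external, needs reference labels). The abstract does not say which indexes the paper uses, so both choices are assumptions, and the points, reference labels, and candidate labelings are hypothetical toy data rather than anything taken from the study.

```python
from itertools import combinations
from math import dist


def rand_index(labels_true, labels_pred):
    """External index: share of point pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def silhouette(points, labels):
    """Internal index: mean silhouette coefficient (assumes at least 2 clusters)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # By convention a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
reference = [0, 0, 0, 1, 1, 1]    # assumed ground-truth grouping
candidate_a = [0, 0, 0, 1, 1, 1]  # labeling from one hypothetical algorithm
candidate_b = [0, 0, 1, 1, 1, 1]  # labeling from another hypothetical algorithm

for name, labels in [("candidate_a", candidate_a), ("candidate_b", candidate_b)]:
    print(name, silhouette(points, labels), rand_index(reference, labels))
```

On this toy data, the labeling that scores higher on both indexes would be the one preferred; in practice the same comparison would be run on the outputs of K-means, DBSCAN, and hierarchical clustering over the real text features.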


