Big data has brought problems such as information overload and information cocoons, so a key question in current research is how to resolve the semantic fuzziness of online reviews and use those reviews to improve the accuracy of personalized recommendation algorithms. Drawing on the ability of the probabilistic linguistic term set (PLTS) to handle fuzzy information and on historical online hotel review data, this paper proposes a collaborative filtering recommendation algorithm for hotels. First, the text of online hotel reviews is collected with a web crawler and processed with jieba word segmentation and TF-IDF weighting. Second, a set of hotel evaluation attributes is constructed, and sentiment analysis of the review statements is carried out using the HowNet sentiment dictionary together with manual annotation. The PLTS is used to classify the data and derive statistics, and the maximum deviation method is used to determine the weight of each attribute. Then, the cosine similarity formula is fused with the modified cosine similarity formula to calculate similarity and construct the decision matrix. Finally, the hotel recommendation results are generated by combining the decision matrix with the historical data on users' hotel selections. Review data for 10 hotels in Macau were collected from the official Ctrip website, and the proposed model was applied to process and analyze the data, producing a ranked list of hotel recommendations. To validate the accuracy and effectiveness of the approach, the recommendation results were compared with those produced by other algorithms.
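The similarity step can be illustrated with a minimal sketch. The abstract does not state how the cosine and modified cosine formulas are fused, so the convex combination below, the parameter `alpha`, the per-attribute `means` used for centering, and all function names are assumptions made for illustration, not the paper's actual formulation.

```python
import math


def cosine(u, v):
    """Plain cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def modified_cosine(u, v, means):
    """Cosine similarity after subtracting a mean from each component.

    Assumption: `means` holds one average score per attribute, so the
    centering removes attribute-level bias before comparing the vectors.
    """
    cu = [a - m for a, m in zip(u, means)]
    cv = [b - m for b, m in zip(v, means)]
    return cosine(cu, cv)


def fused_similarity(u, v, means, alpha=0.5):
    """Hypothetical fusion: a weighted average of the two similarities.

    alpha = 1 gives plain cosine, alpha = 0 gives modified cosine.
    """
    return alpha * cosine(u, v) + (1 - alpha) * modified_cosine(u, v, means)


# Toy attribute scores for two hotels (made-up numbers, not from the paper).
hotel_a = [1.0, 2.0, 3.0]
hotel_b = [3.0, 2.0, 1.0]
attribute_means = [2.0, 2.0, 2.0]
similarity = fused_similarity(hotel_a, hotel_b, attribute_means, alpha=0.5)
```

Plain cosine rates these two toy hotels as fairly similar because all scores are positive, while the centered version sees them as opposites. A fused score sits between the two, which is one plausible motivation for combining the formulas.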