Abstract

With the amount of online information growing continuously, it is increasingly important for online stores to recommend products precisely based on users' preferences. Product reviews can be of great help in this recommendation task. However, most recommendation platforms only classify reviews as positive or negative through sentiment analysis, without considering the actual demands of users, which reduces the effectiveness of the classification. To counter this issue, we propose a new model that integrates a heterogeneous neural network and a text pretraining model, and we compare it with other models on a travel type classification task. The model combines a pretrained text model, Bidirectional Encoder Representations from Transformers (BERT), with a heterogeneous graph attention network (HGAN). First, we fine-tune BERT on a dataset of 1.4 million hotel reviews from the Ctrip website to obtain fine representations of trip-related words. Second, we propose a similarity fuzzy-matching method to obtain the main topic of each review. Third, we construct a heterogeneous neural network and apply an attention mechanism to it to mine users' travel preferences. Finally, the classification is performed based on each user's preference. In Section 5 of this study, we report an experiment that compares our model with five others. The results show that our model reaches an accuracy of 70%, which is higher than that of the other five models.

Highlights

  • Online reviews are becoming important references for customers to obtain information and make decisions

  • Research on online review classification focuses on text sentiment analysis [3], topic classification, and review usefulness analysis [4–7]. Most studies apply these techniques to the field of hotel management [8–10], but rarely mine users’ preferences from the content of reviews or classify reviews according to the actual demands of users

  • The main work of this study is as follows: (1) A hotel review corpus is established. The heterogeneous information network is constructed with travel types, review texts, and topic words as nodes, and the Bidirectional Encoder Representations from Transformers (BERT)-heterogeneous graph attention network (HGAN) model is built by combining BERT with heterogeneous information network methods

Summary

Introduction

Online reviews are becoming important references for customers to obtain information and make decisions. Research on online review classification focuses on text sentiment analysis [3], topic classification, and review usefulness analysis [4–7]. Most studies apply these techniques to the field of hotel management [8–10], but rarely mine users’ preferences from the content of reviews or classify reviews according to the actual demands of users. We address the problem of labeling massive numbers of reviews by obtaining well-learned word representations from a pretrained model, which avoids the time otherwise spent labeling the reviews of a large-scale dataset [11, 12].

The main work of this study is as follows: (1) A hotel review corpus is established. The heterogeneous information network is constructed with travel types, review texts, and topic words as nodes, and the Bidirectional Encoder Representations from Transformers (BERT)-heterogeneous graph attention network (HGAN) model is built by combining BERT with heterogeneous information network methods. (2) The hotel review content is predefined into seven topic categories: location, catering, service, room, price, sanitation, and facilities. A fuzzy matching principle is proposed to identify the review topics and build the edges of the heterogeneous network. A graph convolutional network (GCN) is adopted to complete the feature mapping of the different nodes. Combined with the attention mechanism, attention is calculated from two perspectives: the attention of different review texts to topic words, and the attention of users of different travel types to different topic words. This yields the user preference characteristics. Then, Chinese hotel reviews are classified according to the user preferences of different travel types.
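The fuzzy matching and attention steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cosine-similarity criterion, the 0.8 threshold, the softmax scoring, and all function names are assumptions made for illustration, and the word and topic vectors are expected to come from some embedding model (such as the fine-tuned BERT) that is not shown here.

```python
import math

# The seven predefined review topics named in the text.
TOPICS = ["location", "catering", "service", "room",
          "price", "sanitation", "facilities"]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_topics(word_vecs, topic_vecs, threshold=0.8):
    """Assumed fuzzy-matching rule: a review matches a topic when any of its
    word vectors is at least `threshold` similar to the topic vector.
    Returns {topic: best similarity} for the matched topics."""
    matched = {}
    for topic, topic_vec in topic_vecs.items():
        best = max((cosine(w, topic_vec) for w in word_vecs), default=0.0)
        if best >= threshold:
            matched[topic] = best
    return matched


def build_review_topic_edges(reviews, topic_vecs, threshold=0.8):
    """Build (review_id, topic, similarity) edges between review nodes and
    topic-word nodes. `reviews` maps a review id to its list of word vectors."""
    edges = []
    for review_id, word_vecs in reviews.items():
        for topic, sim in match_topics(word_vecs, topic_vecs, threshold).items():
            edges.append((review_id, topic, sim))
    return edges


def attention_weights(scores):
    """Softmax over {topic: score}; an assumed stand-in for the attention
    of a review (or a travel type) over topic words."""
    if not scores:
        return {}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


# Toy two-dimensional vectors, purely illustrative (covering two of TOPICS).
toy_topics = {"location": [1.0, 0.0], "price": [0.0, 1.0]}
toy_reviews = {"r1": [[0.9, 0.1], [0.2, 0.1]], "r2": [[0.1, 0.95]]}
toy_edges = build_review_topic_edges(toy_reviews, toy_topics)
```

In this sketch the threshold controls how loose the matching is: lowering it attaches more topic nodes to each review, at the cost of noisier edges. The attention weights over a node's matched topics sum to one, so they can be read as a preference distribution over topics.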

Related Work
The Establishment of the BERT-HGAN Model
Dataset Analysis
Fuzzy Matching of Review-Related Topic Words
Establishment of the Heterogeneous Graph Neural Network
User Preference Characteristic Extraction Based on the Heterogeneous
Experiment and Result Analysis
Analysis of User Preferences of Different Travel Types
Findings
Analysis of Classification Results
