Abstract

The number of visitors to Lombok, one of the famous tourist destinations in Indonesia, increased from 400,595 in 2020 to 1,376,295 in 2022. Although the government supports the hotel industry, fake reviews are a significant problem that can damage hotel reputations and mislead tourists. This study uses BERT and Sublinear_TF feature extraction techniques to analyze fake reviews from three main areas: Gili Trawangan, Senggigi, and Kuta. BERT detects fake reviews by understanding the context of words, while Sublinear_TF emphasizes more informative words by reducing the weight of irrelevant common words. The results showed that the more extensive and diverse dataset from Gili Trawangan had the best classification results. The combination of BERT and Random Forest achieved the highest accuracy of 0.84. Overall, BERT excels in Gili Trawangan with an accuracy of 0.79 for SVM and 0.84 for Random Forest. In contrast, smaller and more homogeneous datasets such as Senggigi and Kuta have lower accuracy. In addition, Sublinear_TF performed well on Gili Trawangan with an accuracy of 0.82 using SVM and 0.83 using Random Forest; however, its performance declined in Senggigi and Kuta. BERT and Sublinear_TF techniques are more effective on large and diverse datasets such as Gili Trawangan. Sublinear_TF is better for varied data but less effective on more homogeneous datasets, while BERT with Random Forest showed the highest accuracy due to its ability to capture broader language context. This suggests that the size and variety of the dataset highly influence the success of fake review classification techniques.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.