Abstract
Comments left by e-commerce users generally contain opinions about positive or negative experiences with online shops. These statements, whether brief or lengthy and even when expressed indirectly, influence other potential customers. As a result, a product sold at an online store ends up rated as either "recommended" or "non-recommended". Detecting positive and negative opinions manually, however, takes considerable time because of the large volume of data. Opinion mining, which applies data mining techniques, can therefore be used to automate the detection of positive and negative comments. One of the main problems in opinion mining is that the data are limited while the number of attributes is large. In this study, we propose applying Pearson correlation (PC) based feature selection to optimize opinion mining. The experimental results show that applying PC improves the performance of opinion mining systems for three classifiers, namely Logistic Regression, Naïve Bayes and Support Vector Machine, yielding accuracies of 98.80%, 87.87% and 98.12%, respectively.
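As an illustration of Pearson correlation based feature selection, the sketch below scores each attribute by the absolute value of its correlation with the class label and keeps the highest-scoring ones. The abstract does not state the ranking rule, threshold, or data representation, so the top-k rule, the function names `pearson` and `select_features`, and the toy term counts are all assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no linear relationship with the label.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, k):
    """Return the indices of the k columns with the largest |r| against labels.

    Assumption: top-k by absolute correlation; the paper may use another rule.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by column index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Hypothetical toy data: counts of 4 terms in 6 comments.
rows = [
    [2, 0, 1, 1],
    [1, 0, 0, 1],
    [3, 1, 1, 1],
    [0, 2, 1, 1],
    [0, 3, 0, 1],
    [1, 2, 1, 1],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = "recommended", 0 = "non-recommended"

selected = select_features(rows, labels, k=2)
print(selected)
```

In this toy data the third column is uncorrelated with the label and the fourth is constant, so both are discarded. The reduced columns would then be passed to a classifier such as Logistic Regression, Naïve Bayes or Support Vector Machine.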
Highlights
Comments left by e-commerce users generally contain opinions about positive or negative experiences with online shops
In this study, we propose applying Pearson correlation (PC) based feature selection to optimize opinion mining
Naïve Bayes Method with Ensemble Features and Feature Selection http//poseidon.csd.auth.gr http://clopinet.com/isabelle/Projec
Summary
Few algorithms are designed specifically for stemming Indonesian. The Porter algorithm, for example, takes relatively less time than stemming with the Nazief and Adriani algorithm, but stemming with the Porter algorithm is less accurate than stemming with the Nazief and Adriani algorithm. The Nazief and Adriani algorithm is a stemming algorithm for Indonesian text with better accuracy than the other algorithms. [...] is an important step for the next stage, namely reducing the attributes that have little influence on the classification of the input data. [...] denotes the number of independent attributes, while the symbol j denotes the number of records in the dataset. Naive Bayes is a machine learning method based on probability calculations. Its basic concept is Bayes' theorem: classification is performed by computing probability values.
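To make the Naive Bayes step concrete, the following is a minimal sketch of classification by Bayes' theorem: each class is scored by its log prior plus the log likelihood of every token, and the class with the highest score is chosen. The multinomial word-count model, Laplace (add-one) smoothing, the function names `train` and `predict`, and the toy English comments are assumptions for this example, not details taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels):
    """Count documents per class and tokens per class."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict(tokens, model):
    """Return the class maximizing log P(class) + sum of log P(token | class)."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_tokens = sum(token_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Laplace smoothing (an assumption) avoids zero probabilities.
            likelihood = (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, already tokenized and stemmed comments.
docs = [
    ["good", "fast", "delivery"],
    ["good", "quality"],
    ["bad", "slow", "delivery"],
    ["bad", "quality", "broken"],
]
labels = ["recommended", "recommended", "non-recommended", "non-recommended"]

model = train(docs, labels)
print(predict(["good", "delivery"], model))
print(predict(["bad", "slow"], model))
```

Working in log space keeps the product of many small probabilities from underflowing, which matters when comments contain many tokens.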
More From: Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)