Abstract

In Indonesia, many local websites, such as Irama Nusantara, hold valuable information about music and culture. Although these sites are rich in data, the information remains underused. This research applies query expansion techniques, supported by data mining methods, to the analysis of data from the Irama Nusantara website. Data were collected from the website by crawling, yielding 5404 entries covering audio, images, and text. The analysis used Natural Language Processing (NLP) techniques, beginning with a preprocessing stage. Next, the K-Means algorithm was applied for clustering, and the Term Frequency-Inverse Document Frequency (TF-IDF) method was used for term weighting. Classification models were built with Support Vector Machine (SVM) and Naive Bayes for comparison. The analysis shows that query expansion significantly improves the accuracy of information retrieval on the Irama Nusantara website. In the evaluation, SVM achieved better accuracy and precision than Naive Bayes. In addition, Principal Component Analysis (PCA) shows that 70-95% of the variance in the data can be explained by the resulting principal components, which indicates the efficiency of the applied method. The research yields an effective analysis model for using data from the Irama Nusantara website, provides deeper insight into the patterns and trends in the analyzed data, and contributes to the development of information technology in the field of culture in Indonesia.
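The abstract names TF-IDF term weighting and query expansion but does not say how the expansion terms are chosen. The following is a minimal sketch of one simple possibility, assuming expansion terms are taken from the highest-weighted terms of documents that already match the query. The function names (`tokenize`, `tf_idf`, `expand_query`), the whitespace tokenizer, and the `ln(N / df)` form of IDF are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumes tf = count / document length and idf = ln(N / df),
    which is one common TF-IDF variant among several.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def expand_query(query, docs, k=2):
    """Append the k highest-weighted terms from documents matching the query.

    Hypothetical expansion rule: sum TF-IDF weights over matching documents
    and keep the top k terms not already present in the query.
    """
    q_terms = tokenize(query)
    scores = Counter()
    for doc_weights in tf_idf(docs):
        if any(t in doc_weights for t in q_terms):
            for term, value in doc_weights.items():
                if term not in q_terms:
                    scores[term] += value
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return q_terms + [t for t, _ in ranked[:k]]
```

Calling `expand_query("keroncong", records)` on a list of crawled text records would return the original query term followed by up to two related terms. In a fuller pipeline the weighted vectors would also feed the clustering and classification steps the abstract mentions.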
