Abstract

Text mining research still relies predominantly on text in English, Arabic, Chinese, and other languages, while work on Indonesian text remains very limited. Indonesian researchers therefore need good tools for conducting text mining research on Indonesian text. Text mining requires pre-processing steps such as removing the ‘@’ notation and ‘http’ links, removing Indonesian stopwords, normalizing acronyms, slang words, and emoticons, and Indonesian stemming. The GATA Framework for text mining is one option for conducting text mining research in Indonesian and has already been used by several researchers. Several data mining process methods are well known, including KDD, CRISP-DM, and SEMMA, all three of which are quite reliable. CRISP-DM, which consists of Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, is widely used in text mining research and can be combined with text pre-processing. With the growing volume of text mining research in Indonesian, pre-processing for Indonesian is very important. The GATA Framework is an option for a pre-processing tool that can be combined with RapidMiner, as seen from the excellent FUPRS results.
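To make the pre-processing steps listed above concrete, the following is a minimal Python sketch of a few of them: link removal, ‘@’ removal (assumed here to mean @mentions), slang normalization, and stopword removal. It is not the GATA Framework's API. The function name `preprocess` and the tiny `SLANG` and `STOPWORDS` tables are hypothetical and exist only for illustration. Emoticon normalization and Indonesian stemming are omitted, because they need larger dictionaries or a dedicated stemmer.

```python
import re

# Toy lookup tables, assumed for illustration only; real lists are far larger.
SLANG = {"gk": "tidak", "yg": "yang", "bgt": "banget"}
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}


def preprocess(text: str) -> list[str]:
    """Return cleaned tokens from a raw Indonesian text such as a tweet."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove 'http' links
    text = re.sub(r"@\w+", " ", text)          # remove '@' mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits and punctuation
    tokens = text.split()
    tokens = [SLANG.get(t, t) for t in tokens]  # normalize slang words
    return [t for t in tokens if t not in STOPWORDS]  # remove stopwords


if __name__ == "__main__":
    print(preprocess("@budi Filmnya gk bagus bgt http://t.co/abc yg ini"))
```

Slang is normalized before stopwords are removed, so that a slang form such as "yg" is first mapped to "yang" and then filtered out as a stopword.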
