Abstract

Technological developments are causing the volume of distributed data to grow every day, and this data can be mined and processed into the text or information that users need. Preprocessing is the part of text mining that prepares raw text, and it is divided into several stages: case folding, symbol removal, slang word conversion, stopword removal, stemming, and tokenization. The news items used here are raw data taken from the XML file produced by Google Alerts, which is then input into a system developed using the PHP programming language and a MySQL database. The data processing method in this research is Electronic Data Processing. The system is expected to help with data preprocessing, a process that takes a long time, especially when a large sample of data is needed. The results show that the system's crawling process takes only 0.0079004486401876 minutes (about 0.47 seconds) for 20 data records, and that the data cleaning (preprocessing) process takes only 0.012900729974111 minutes (about 0.77 seconds) for 88 data records. In other words, the system prepares data effectively and efficiently for the next stage of processing.
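
The preprocessing stages named above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' PHP implementation. The lookup tables `SLANG`, `STOPWORDS`, and `SUFFIXES`, and all function names, are hypothetical. The suffix-stripping stemmer is a naive placeholder for a real stemming algorithm. Tokenization is applied before slang conversion here so that the later stages can work on individual words, which is an assumption about ordering rather than something the abstract specifies.

```python
import re

# Hypothetical lookup tables, for illustration only; a real system would
# load much larger, language-specific lists (e.g. from database tables).
SLANG = {"u": "you", "gr8": "great", "thx": "thanks"}
STOPWORDS = {"the", "a", "is", "are", "for", "you"}
SUFFIXES = ("ing", "ed", "s")


def case_fold(text):
    """Case folding: convert all characters to lowercase."""
    return text.lower()


def remove_symbols(text):
    """Symbol removal: replace anything that is not a lowercase letter,
    digit, or whitespace with a space (expects case-folded input)."""
    return re.sub(r"[^a-z0-9\s]", " ", text)


def tokenize(text):
    """Tokenization: split the text on runs of whitespace."""
    return text.split()


def convert_slang(tokens):
    """Slang word conversion: map known slang tokens to standard words."""
    return [SLANG.get(token, token) for token in tokens]


def remove_stopwords(tokens):
    """Stopword removal: drop tokens that appear in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def stem(token):
    """Naive placeholder stemmer: strip one known suffix if at least
    three characters remain. Not a real stemming algorithm."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run all stages in sequence and return the cleaned tokens."""
    text = case_fold(text)
    text = remove_symbols(text)
    tokens = tokenize(text)
    tokens = convert_slang(tokens)
    tokens = remove_stopwords(tokens)
    return [stem(token) for token in tokens]


if __name__ == "__main__":
    print(preprocess("Thx, the servers are CRASHING!"))
```

Keeping each stage as a separate function makes it possible to time or replace one stage (for example, swapping in a proper stemmer) without touching the others.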

