Abstract

Data mining uses data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. As internet use grows, news is increasingly published online, and various data mining techniques for classification have been used to handle this volume of news. In this paper we present an intelligent system based on a hybrid algorithm (HMM, SVM and CART) for e-news classification. The system extracts online news and then assigns each item a category and subcategory. The system involves four main stages, including a) keyword extraction and b) implementation of the hybrid algorithm (HMM, SVM and CART). Data for experimentation were collected from online newspapers such as The Hindu, Hindustan Times and Times of India. The experimental results are reported by news category and subcategory: Entertainment (Bollywood 100%, Hollywood 90%), Sports (Cricket 90%, Football 90%, Hockey 78%) and Matrimonial (Hindu 100%, Muslim 80%). We also compare the results of the hybrid algorithm with those of the individual HMM and SVM algorithms and conclude that the hybrid algorithm gives better results than either HMM or SVM alone.

Highlights

  • Data mining, also known as knowledge discovery, is the computer-aided process of identifying hidden patterns by digging through and analyzing enormous sets of data and extracting their meaning.

  • Many researchers have worked on developing automated text classification systems. In the early days, classification and indexing of online news was entirely manual, with the rule base generated by human experts.

  • We use a hybrid algorithm (HMM, SVM and CART), an automated intelligent system that produces results quickly, with less effort and with a high accuracy rate.


Summary

INTRODUCTION

Data mining, also known as knowledge discovery, is the computer-aided process of identifying hidden patterns by digging through and analyzing enormous sets of data and extracting their meaning. In this paper we consider the classification of online news with a hybrid algorithm (HMM, SVM and CART). Many researchers have worked on developing automated text classification systems. In the early days, classification and indexing of online news was entirely manual, with the rule base generated by human experts. This was a very time-consuming process that required more effort and gave lower accuracy. We use a hybrid algorithm (HMM, SVM and CART), an automated intelligent system that produces results quickly, with less effort and with a high accuracy rate.
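
As a rough illustration of the two stages named in the abstract, the sketch below extracts keywords from a news item and then combines the labels proposed by several classifiers. It is a minimal sketch under stated assumptions, not the system described in this paper: the stopword list, the function names and the use of a simple majority vote are all hypothetical, since this summary does not say how the outputs of HMM, SVM and CART are combined.

```python
import re
from collections import Counter

# Hypothetical stopword list; the list used in the paper is not given.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
             "for", "is", "was", "by", "with", "at", "after"}


def extract_keywords(text, top_n=10):
    """Return up to top_n most frequent non-stopword tokens in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def majority_vote(predictions):
    """Combine (category, subcategory) labels from several classifiers.

    Assumption: a plain majority vote, with ties resolved in favour of
    the earliest prediction in the list.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Example: labels that three trained models (standing in for HMM, SVM
# and CART) might propose for one news item.
votes = [("Sports", "Cricket"), ("Sports", "Football"), ("Sports", "Cricket")]
final_label = majority_vote(votes)
```

In a full system, the keywords returned by `extract_keywords` would be turned into features for each of the three models, and `majority_vote` would then be applied to their predicted labels.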

RELATED WORK
Result
Create Text File
Creation of Knowledge base
Hidden Markov Model
Support Vector Machine
Evaluation
Experimental Results
