Abstract

The rapid growth of the World Wide Web (WWW) demands automated assistance for web page classification (WPC) and categorisation. WPC is a supervised learning problem and a hard topic in the area of data mining. In this work, a method for WPC is proposed. The method consists of feature extraction, information learning and classification phases. In the feature extraction phase, features are computed from each page and used to capture its informative content. The information learning phase uses a decision tree algorithm to extract rules from the computed features. Based on the extracted rules, the classification phase uses an artificial neural network combined with a group search algorithm with firefly (ANN-GSOFF) to improve WPC. The performance of this technique has been evaluated on the WebKB dataset using precision, recall and accuracy. The overall performance of the proposed technique is 67.42% recall, 74.77% accuracy and 83.12% precision.
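
The three reported measures can be illustrated with a minimal sketch. It assumes a one-vs-rest reading of precision and recall for a single class of interest and plain label agreement for accuracy; the abstract does not state how the measures were averaged across classes. The function name `evaluate` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def evaluate(y_true, y_pred, positive):
    """Return (precision, recall, accuracy).

    Precision and recall treat `positive` as the class of interest
    (one-vs-rest, an assumption); accuracy is the fraction of pages
    whose predicted label equals the true label.
    """
    # Count (is_truly_positive, is_predicted_positive) pairs.
    pairs = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = pairs[(True, True)]    # positive pages predicted as positive
    fp = pairs[(False, True)]   # other pages predicted as positive
    fn = pairs[(True, False)]   # positive pages predicted as something else

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return precision, recall, accuracy


# Toy labels for illustration only; these are not results from the paper.
y_true = ["course", "course", "faculty", "student", "course", "student"]
y_pred = ["course", "faculty", "faculty", "course", "course", "student"]
precision, recall, accuracy = evaluate(y_true, y_pred, positive="course")
```

Because precision ignores false negatives and recall ignores false positives, a classifier can score higher on precision than on recall, as in the figures reported above, when it labels pages conservatively.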

