Abstract

The ubiquity of the Internet has brought about an increasing number of multi-formatted Web documents. Although images make up a large and important part of these documents, there has been little research on analyzing and understanding them. Many Web images carry important information, but others do not. If the images in a Web document could be classified by whether or not they carry such information, the result would be very useful for the analysis and multi-formatting of Web documents. In this paper we introduce machine-learning-based methods for classifying Web images as either eliminable or non-eliminable. For this research, we identified 16 rich features specific to Web images and ran experiments using Bayesian and decision tree methods. F-measures of 87.09% and 82.72%, respectively, were achieved for the two methods. In particular, experiments comparing the effects of feature groups showed that the features selected in this study are very useful for Web image classification.

© (2003) COPYRIGHT SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.
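For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how it can be computed for a two-class labeling task. It is not the authors' evaluation code: the function name `f_measure`, the choice of "eliminable" as the positive class, the balanced weighting (`beta=1.0`), and the toy labels are all assumptions made for illustration, since the abstract does not specify them.

```python
def f_measure(true_labels, predicted_labels, positive="eliminable", beta=1.0):
    """Compute the F-measure for one class (hypothetical helper, not from the paper)."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives for the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision or recall is zero (or undefined), so report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels, invented purely for illustration.
truth = ["eliminable", "eliminable", "non-eliminable", "eliminable", "non-eliminable"]
guess = ["eliminable", "non-eliminable", "non-eliminable", "eliminable", "eliminable"]
score = f_measure(truth, guess)
print(f"F-measure: {score:.4f}")
```

With balanced weighting the measure is the harmonic mean of precision and recall, so a classifier scores well only when it both finds most images of the positive class and rarely mislabels images of the other class.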
