Abstract

The Internet contains potentially harmful or inappropriate web pages that parents may not wish their children to access. One possible solution is an Internet content filter that blocks prohibited content. This paper proposes a system called SMART SAINT, which delivers high-accuracy classifiers by combining active semi-supervised learning with feature selection. Specifically, the system uses co-testing as the active semi-supervised learning method and Bi-Normal Separation (BNS) as the feature selection method. We empirically evaluate the core implementation of the system on a real-world dataset of approximately 10,000 web pages, and the results indicate that the combination is highly effective.
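
For readers unfamiliar with BNS, the sketch below illustrates the commonly cited definition of the score: the absolute difference between the inverse standard normal CDF of a feature's true positive rate and that of its false positive rate. It is not taken from the SMART SAINT implementation. The function names `bns_score` and `top_k_features`, the count-based inputs, and the clipping bound `eps` are all assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sigma 1


def bns_score(tp, fp, pos, neg, eps=0.0005):
    """Bi-Normal Separation score for one feature (e.g. one word).

    tp  -- positive-class pages that contain the feature
    fp  -- negative-class pages that contain the feature
    pos -- total positive-class pages
    neg -- total negative-class pages
    eps -- clipping bound (an assumption): keeps the rates strictly
           inside (0, 1) so the inverse CDF stays finite
    """
    if pos <= 0 or neg <= 0:
        raise ValueError("both classes need at least one page")
    tpr = min(max(tp / pos, eps), 1.0 - eps)
    fpr = min(max(fp / neg, eps), 1.0 - eps)
    return abs(_STD_NORMAL.inv_cdf(tpr) - _STD_NORMAL.inv_cdf(fpr))


def top_k_features(counts, pos, neg, k):
    """Return the k feature names with the highest BNS score.

    counts maps a feature name to its (tp, fp) pair.
    """
    scored = {name: bns_score(tp, fp, pos, neg)
              for name, (tp, fp) in counts.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]
```

A feature that appears equally often in both classes scores zero, while one concentrated in either class scores high, so ranking by this value keeps the terms that best separate blocked pages from allowed ones.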
