Abstract

The quality of the relevant features discovered in text documents to describe user preferences is hard to guarantee because of the large number of terms and data patterns. Most existing popular text mining and classification methods have adopted term-based approaches, and most of their problems stem from polysemy and synonymy. Over the years, the hypothesis has repeatedly been held that pattern-based methods should perform better than term-based ones, but how to use large-scale patterns effectively remains a hard problem in text mining. In this paper, robustness is used to characterize how a model behaves when its training set is distorted or its application environment is altered: a model is robust if it still provides satisfactory performance after its training set has been altered or changed. To address this challenge, this paper presents a new model for weighted feature discovery. It discovers both positive and negative patterns in text documents as higher-level features and deploys them over low-level features (terms). It also classifies terms into categories and updates term weights based on their specificity and their distributions in patterns. Substantial experiments using this model on RCV1, TREC topics and Reuters-21578 demonstrate that the proposed model significantly outperforms both state-of-the-art term-based methods and pattern-based methods.
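The deployment and re-weighting steps described above can be illustrated with a minimal sketch. The abstract does not give the model's formulas, so everything here is an assumption made for illustration: the even split of a pattern's support over its terms, the `specificity` score, the `hi`/`lo` thresholds, the additive update rule, and all function names are hypothetical rather than the paper's definitions.

```python
from collections import defaultdict


def deploy_patterns(pos_patterns, neg_patterns):
    """Deploy pattern support onto terms (assumed rule, not the paper's).

    Each pattern is a (set_of_terms, support) pair. Its support is split
    evenly over its terms; positive patterns add weight, negative ones
    subtract it.
    """
    weights = defaultdict(float)
    for sign, patterns in ((1.0, pos_patterns), (-1.0, neg_patterns)):
        for terms, support in patterns:
            for term in terms:
                weights[term] += sign * support / len(terms)
    return dict(weights)


def specificity(term, pos_docs, neg_docs):
    """Hypothetical specificity in [-1, 1]: the share of positive documents
    containing the term minus the share of negative documents containing it."""
    pos_share = sum(term in doc for doc in pos_docs) / len(pos_docs)
    neg_share = sum(term in doc for doc in neg_docs) / len(neg_docs)
    return pos_share - neg_share


def classify_and_reweight(weights, pos_docs, neg_docs, hi=0.5, lo=-0.5, boost=1.0):
    """Assign each term a category and shift the weights of specific terms.

    Thresholds and the additive update are illustrative assumptions.
    """
    categories, updated = {}, {}
    for term, weight in weights.items():
        s = specificity(term, pos_docs, neg_docs)
        if s >= hi:
            categories[term] = "positive"
            updated[term] = weight + boost * s
        elif s <= lo:
            categories[term] = "negative"
            updated[term] = weight + boost * s
        else:
            categories[term] = "general"
            updated[term] = weight
    return categories, updated
```

With toy data, a term that appears only in positive documents and patterns ends up with a raised weight, a term shared by both classes stays "general", and a term specific to negative documents is pushed further down.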
