Abstract

The training data matrix used to classify text documents into multiple categories has a large number of dimensions, while the number of manually classified training documents is relatively small. Suitable dimensionality reduction techniques are therefore required to develop a classifier. This article describes a two-step supervised feature extraction method that takes advantage of projections of terms into document and category spaces. We propose several enhancements that make the method more efficient and faster than the version presented in our former paper. We also introduce an adjustment score that makes it possible to correct defective targets and helps to identify improper training examples that bias the extracted features.

