Supervised two-step feature extraction for structured representation of text data

Ondřej Háva, Miroslav Skrbek, Pavel Kordík

doi:10.1016/j.simpat.2012.11.003

Journal: Simulation Modelling Practice and Theory
Publication Date: Dec 31, 2012
Citations: 6

Affiliation: Czech Technical University in Prague

Abstract

The training data matrix used to classify text documents into multiple categories has a large number of dimensions, while the number of manually classified training documents is relatively small. Suitable dimensionality reduction techniques are therefore required to develop a classifier. This article describes a two-step supervised feature extraction method that takes advantage of projections of terms into document and category spaces. We propose several enhancements that make the method more efficient and faster than the version presented in our earlier paper. We also introduce an adjustment score that makes it possible to correct defective targets or helps to identify improper training examples that bias the extracted features.
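
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a supervised two-step reduction of this general kind might look, not the authors' method. It assumes a document-by-term weight matrix `X` and a 0/1 document-by-category matrix `Y`; the function name `two_step_features`, the use of a truncated SVD in the second step, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def two_step_features(X, Y, n_features):
    """Illustrative supervised reduction (an assumption, not the paper's algorithm).

    X: (n_docs, n_terms) term weights, Y: (n_docs, n_categories) 0/1 labels.
    n_features must not exceed min(n_terms, n_categories).
    """
    # Step 1 (assumed): describe every term by its total weight in each
    # category, i.e. project terms into the category space.
    term_category = X.T @ Y  # (n_terms, n_categories)

    # Step 2 (assumed): compress the term profiles with a truncated SVD and
    # keep the leading left singular vectors as a term-to-feature projection.
    U, _, _ = np.linalg.svd(term_category, full_matrices=False)
    projection = U[:, :n_features]  # (n_terms, n_features)

    # Documents are represented by their coordinates in the reduced space.
    return X @ projection, projection


# Toy data: 4 documents, 6 terms, 2 categories (hypothetical values).
X = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 1, 1, 2, 0],
], dtype=float)
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

features, projection = two_step_features(X, Y, n_features=2)
print(features.shape)  # six term dimensions reduced to two features per document
```

In this sketch the number of extracted features is bounded by the number of categories, which is what keeps the representation small when labelled documents are scarce; the paper's actual projections and its adjustment score may differ substantially.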
