Abstract

In recent years, neural networks (NNs) have achieved remarkable performance gains in text classification, owing to their ability to encode discriminative features when label information is incorporated into training. Inspired by this success, we propose a pseudo-supervised neural network approach for text clustering. The network is trained in a supervised fashion on pseudo-labels, which are the cluster labels obtained by pre-clustering unsupervised document representations. To improve the quality of the pseudo-labels, a consensus analysis is used to select the training samples for the network. Experimental results demonstrate that the proposed approach significantly improves clustering performance.
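The consensus-based selection of training samples can be illustrated with a minimal sketch. The abstract does not specify how the consensus analysis is carried out, so the rule below is an assumption made purely for illustration: two pre-clusterings of the same documents are compared, each cluster of the first is matched to the cluster of the second that it overlaps most, and only documents inside that overlap are kept as pseudo-labeled training samples. The function name `consensus_select` and the toy label lists are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_select(labels_a, labels_b):
    """Keep samples on which two pre-clusterings agree (illustrative rule only).

    Each cluster in labels_a is matched to the cluster in labels_b that it
    overlaps most; samples falling outside that overlap are dropped.
    Returns the kept sample indices and their pseudo-labels (taken from labels_a).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")

    # Count co-occurrences: overlap[a][b] = number of samples labeled a and b.
    overlap = defaultdict(Counter)
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1

    # Assumed matching rule: majority overlap. Two clusters of labels_a may
    # map to the same cluster of labels_b; a one-to-one assignment would
    # need an extra matching step that is omitted here.
    best_match = {a: counts.most_common(1)[0][0] for a, counts in overlap.items()}

    selected = [
        i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if best_match[a] == b
    ]
    pseudo_labels = [labels_a[i] for i in selected]
    return selected, pseudo_labels


# Toy example: two pre-clusterings of six documents (hypothetical labels).
labels_a = [0, 0, 0, 1, 1, 1]
labels_b = [1, 1, 0, 0, 0, 1]
selected, pseudo_labels = consensus_select(labels_a, labels_b)
print(selected)       # indices of documents kept for supervised training
print(pseudo_labels)  # their pseudo-labels
```

The selected indices and pseudo-labels would then serve as the training set for the supervised network, which can afterwards assign cluster labels to all documents, including those dropped by the selection step.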

