Abstract

Document clustering plays an important role in text mining and machine learning. Recent studies indicate that a hybrid representation benefits document clustering tasks. However, adaptively adjusting representation learning so that the individual representations of a document reinforce one another remains an open problem. In this study, we propose a deep document clustering model, DCAHR, that improves document clustering performance by adaptively learning a hybrid representation. Specifically, an adaptive representation enhanced network (AREN) uses a composite Gated Linear Unit (GLU) mechanism to capture the information that is consistent between the semantic and structure representations, and uses that information to adjust the learning of each representation, yielding the corresponding enhanced representations. A hybrid representation is then obtained by normalizing the two enhanced representations. Furthermore, an adaptive joint objective function is developed to learn the document partitioning and supervise the entire model by adaptively learning the weight of each objective function. Experimental results on nine commonly used real-world datasets demonstrate the outstanding performance of the proposed DCAHR model.
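To make the gating idea concrete, the following is a minimal sketch of a single Gated Linear Unit applied to two document views, followed by normalization. Only the standard GLU form, GLU(x) = (Wx + b) * sigmoid(Vx + c), is taken as given. Everything else is an assumption made for illustration and is not the AREN architecture itself: the abstract does not specify how the "composite" GLU is built, how the shared signal is fed back into each view, or how the normalized views are combined. The names `glu`, `hybrid_representation`, and `random_params` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(x, W, b):
    # Affine map: W has one row per output dimension.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def glu(x, W, b, V, c):
    # Standard gated linear unit: (W x + b) * sigmoid(V x + c), elementwise.
    values = linear(x, W, b)
    gates = linear(x, V, c)
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def random_params(d_in, d_out, rng):
    # Hypothetical initializer: small Gaussian weights, zero bias.
    W = [[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    return W, [0.0] * d_out


def hybrid_representation(semantic, structure, shared_params):
    # ASSUMPTION: the shared ("consistent") signal is one GLU over the
    # concatenation of both views; the paper's composite GLU may differ.
    shared = glu(semantic + structure, *shared_params)
    # ASSUMPTION: each view is enhanced by adding the shared signal.
    sem_enh = [s + h for s, h in zip(semantic, shared)]
    str_enh = [s + h for s, h in zip(structure, shared)]
    # ASSUMPTION: the hybrid is the concatenation of the L2-normalized views.
    return l2_normalize(sem_enh) + l2_normalize(str_enh)


rng = random.Random(0)
d = 4  # toy embedding size shared by both views
semantic = [rng.gauss(0.0, 1.0) for _ in range(d)]
structure = [rng.gauss(0.0, 1.0) for _ in range(d)]
W, b = random_params(2 * d, d, rng)
V, c = random_params(2 * d, d, rng)
hybrid = hybrid_representation(semantic, structure, (W, b, V, c))
print(len(hybrid))
```

In this sketch the sigmoid gate decides, per dimension, how much of the jointly computed signal reaches each view, which is the general sense in which a GLU can act adaptively. In a trained model the parameters would be learned together with the clustering objective rather than drawn at random.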
