Abstract

The hierarchical Dirichlet process (HDP) is an unsupervised method that has been widely used for topic extraction and document clustering. One advantage of HDP is its inherent mechanism for determining the total number of clusters or topics. However, HDP has three weaknesses: (1) it has no mechanism for using known labels or incorporating expert knowledge into the learning procedure, which prevents users from directing the learning and makes the final results incomprehensible; (2) it cannot detect the categories expected by applications without expert guidance; and (3) it does not automatically adjust the model parameters and structure in a changing environment. To address these weaknesses, this paper proposes an incremental learning method with partial supervision for HDP, which enables the topic model (initially guided by partial knowledge) to adapt incrementally to the latest available information. An important contribution of this work is the application of granular computing to HDP for partial supervision and incremental learning, which results in a more controllable and interpretable model structure. These enhancements provide a more flexible approach to model learning with expert guidance and hence result in better prediction accuracy and interpretability.
