Abstract
We propose hard and fuzzy diagonal co-clustering algorithms, built upon double K-means, to address the problem of document-term co-clustering. At each iteration, the proposed algorithms seek a diagonal block structure in the data by minimizing a criterion based on both the within-class variance and the centroid effect. In addition to being easy to interpret and effective on sparse binary and continuous data, the proposed algorithms, Hard Diagonal Double K-means (DDKM) and Fuzzy Diagonal Double K-means (F-DDKM), are also faster than other state-of-the-art clustering algorithms. We evaluate our contribution on synthetic data sets and on real data sets commonly used in document clustering.