Abstract

Stable clustering is a highly desirable property for a dynamic knowledge base of cases indexed through a domain ontology, both because cases are constantly added to the knowledge base and because the ontology structure may be revised from time to time. An approach is developed for the stable hierarchical clustering of cases based on their semantic integration with the domain ontology. Two variants of dimension reduction are compared: principal component extraction and unsupervised feature selection. The maximum eigenvalue of the matrix of cophenetic correlations is proposed as the stability criterion. Stability is studied by resampling with continuous weights drawn from a lognormal distribution. The results show that clustering stability decreases as the number of principal components increases. The optimal number of components can be chosen as a trade-off between stability and the percentage of explained variance. Unsupervised selection significantly reduces the subset of concepts. Drawing on IT consulting practice, dendrograms of cases were constructed and interpreted for both the initial semantic data matrix and the principal components.
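
The stability criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the lognormal weights are applied to cases through a weighted PCA, that average linkage with Euclidean distance is used, and that the maximum eigenvalue is divided by the number of resamples so that it is at most 1. The function names `weighted_pca_scores` and `stability` and all parameter defaults are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet


def weighted_pca_scores(X, w, k):
    """Project cases onto the top-k principal components of a
    case-weighted covariance matrix (an assumed way to apply the weights)."""
    w = w / w.sum()                      # normalise weights to sum to 1
    mu = w @ X                           # weighted mean of each concept
    Xc = X - mu                          # centre the data
    cov = (Xc * w[:, None]).T @ Xc       # weighted covariance matrix
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]
    return Xc @ top


def stability(X, k, n_resamples=30, sigma=1.0, seed=0):
    """Maximum eigenvalue of the cophenetic correlation matrix,
    divided by the number of resamples (assumed normalisation)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coph = []
    for _ in range(n_resamples):
        # continuous resampling weights from a lognormal distribution
        w = rng.lognormal(mean=0.0, sigma=sigma, size=n)
        scores = weighted_pca_scores(X, w, k)
        Z = linkage(scores, method="average")   # hierarchical clustering
        coph.append(cophenet(Z))                # condensed cophenetic distances
    # correlations between the cophenetic distance vectors of all resamples
    R = np.corrcoef(np.vstack(coph))
    return np.linalg.eigvalsh(R)[-1] / n_resamples
```

Under these assumptions, a value near 1 means the dendrograms from all resamples agree closely. Evaluating `stability(X, k)` over a range of `k`, alongside the explained variance for each `k`, gives the two quantities the abstract weighs against each other.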

