Categorical Clustering Research Articles

Attribute independence is a major assumption in the limited research that has been conducted on similarity analysis for categorical data, especially in unsupervised learning. In real-world data sources, however, attributes are more or less associated with each other through certain coupling relationships. Accordingly, recent works on attribute dependency aggregation have used the co-occurrence of attribute values to explore attribute coupling, but they present only a local picture of categorical data similarity. This is inadequate for deep analysis, and the computational complexity grows exponentially as the data scale increases. This paper proposes an efficient data-driven similarity learning approach that generates a coupled attribute similarity measure for nominal objects with attribute couplings, capturing a global picture of attribute similarity. It involves the frequency-based intra-coupled similarity within an attribute and the inter-coupled similarity based on value co-occurrences between attributes, as well as their integration at the object level. In particular, four measures are designed for the inter-coupled similarity; each calculates the similarity between two categorical values by considering their relationships with other attributes in terms of the power set, universal set, joint set, or intersection set. The theoretical analysis shows that the measure based on the intersection set has equivalent accuracy and superior efficiency, particularly for large-scale data sets. In intensive experiments on 13 UCI data sets, data structure analysis and clustering algorithms that incorporate the coupled dissimilarity metric achieve a significant performance improvement over state-of-the-art measures and algorithms, which is confirmed by statistical analysis.
The experimental results show that the proposed coupled attribute similarity is generic and can effectively and efficiently capture the intrinsic and global interactions within and between attributes, especially for large-scale categorical data sets. In addition, two new coupled categorical clustering algorithms, CROCK and CLIMBO, are proposed, and both outperform the original algorithms in terms of clustering quality on UCI data sets and bibliographic data.
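
The two ingredients named in the abstract above can be illustrated with a small Python sketch. The abstract does not give the formulas, so the ones used here are assumptions chosen only to show the idea: the intra-coupled similarity is computed from value frequencies alone, and the intersection-set variant of the inter-coupled similarity sums, over the values of a second attribute that co-occur with both values being compared, the smaller of the two conditional probabilities. The function names are hypothetical, and the integration at the object level is not shown.

```python
from collections import Counter, defaultdict


def intra_similarity(column, x, y):
    """Frequency-based similarity of values x and y within one attribute.

    Assumed form: f(x) * f(y) / (f(x) + f(y) + f(x) * f(y)),
    where f counts how many objects take the value.
    """
    counts = Counter(column)
    fx, fy = counts[x], counts[y]
    if fx == 0 or fy == 0:
        return 0.0  # a value that never occurs shares nothing
    return (fx * fy) / (fx + fy + fx * fy)


def inter_similarity(col_j, col_k, x, y):
    """Similarity of values x and y of attribute j, seen through attribute k.

    Assumed intersection-set form: for every value w of attribute k that
    co-occurs with both x and y, add min(P(w | x), P(w | y)).
    """
    cooc = defaultdict(Counter)  # value of j -> counts of co-occurring k values
    for vj, vk in zip(col_j, col_k):
        cooc[vj][vk] += 1
    nx, ny = sum(cooc[x].values()), sum(cooc[y].values())
    shared = set(cooc[x]) & set(cooc[y])  # empty if x or y never occurs
    return sum(min(cooc[x][w] / nx, cooc[y][w] / ny) for w in shared)


# Toy data (made up for illustration): two nominal attributes, four objects.
colour = ["red", "red", "blue", "blue"]
shape = ["round", "square", "round", "round"]
intra = intra_similarity(colour, "red", "blue")
inter = inter_similarity(colour, shape, "red", "blue")
```

Only the values shared by both co-occurrence sets are visited, which is the intuition behind the efficiency claim for the intersection set: values that co-occur with just one of the two would contribute nothing to the sum.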

Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in total), testing their categorization performance and their ability to account for the IT representational geometry. The models include well-known neuroscientific object-recognition models (e.g. HMAX, VisNet) along with several models from computer vision (e.g. SIFT, GIST, self-similarity features, and a deep convolutional neural network). We compared the representational dissimilarity matrices (RDMs) of the model representations with the RDMs obtained from human IT (measured with fMRI) and monkey IT (measured with cell recording) for the same set of stimuli (not used in training the models). Better performing models were more similar to IT in that they showed greater clustering of representational patterns by category. In addition, better performing models also more strongly resembled IT in terms of their within-category representational dissimilarities. Representational geometries were significantly correlated between IT and many of the models. However, the categorical clustering observed in IT was largely unexplained by the unsupervised models. The deep convolutional network, which was trained by supervision with over a million category-labeled images, reached the highest categorization performance and also best explained IT, although it did not fully explain the IT data. Combining the features of this model with appropriate weights and adding linear combinations that maximize the margin between animate and inanimate objects and between faces and other objects yielded a representation that fully explained our IT data. 
Overall, our results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT.
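
The RDM comparison described in the abstract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's pipeline: dissimilarity is taken as 1 minus the Pearson correlation between two stimulus response patterns, and two RDMs are compared by the Pearson correlation of their upper triangles. The abstract does not say which dissimilarity or comparison statistic was used, and the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def rdm_upper(patterns):
    """Upper triangle of the RDM: 1 - r for every pair of stimuli.

    `patterns` holds one response pattern (list of features) per stimulus.
    Pairs are ordered (0,1), (0,2), ..., following itertools.combinations.
    """
    pairs = combinations(range(len(patterns)), 2)
    return [1.0 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def rdm_agreement(patterns_a, patterns_b):
    """Correlate two representations' RDMs computed on the same stimuli."""
    return pearson(rdm_upper(patterns_a), rdm_upper(patterns_b))


# Toy response patterns (made up): 4 stimuli x 4 features.
model = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [1, 0, 1, 0]]
dissimilarities = rdm_upper(model)
self_agreement = rdm_agreement(model, model)
```

Because only pairwise dissimilarities are compared, the two representations may have different numbers of features (model units versus voxels or cells), as long as both are measured on the same stimuli.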

Articles published on Categorical Clustering

Urdu ligature recognition using multi-level agglomerative hierarchical clustering

Mapping forest fires by nonparametric clustering analysis

Order or chaos? Understanding career mobility using categorical clustering and information theory

A Fast and Efficient Method for Training Categorical Radial Basis Function Networks

Erratum to "Categorical time series clustering: Case study of Korean pro-baseball data"

Neural Categorization of Vibrotactile Frequency in Flutter and Vibration Stimulations: An fMRI Study.

Clustering of categorical time series data: A case study of pro-baseball data

Revealing structures in narratives: A mixed-methods approach to studying interdisciplinary handoff in critical care

Cluster structures on strata of flag varieties

Evaluation of Modified Categorical Data Fuzzy Clustering Algorithm on the Wisconsin Breast Cancer Dataset

Categories in the pigeon brain: A reverse engineering approach.

Soft subspace clustering of categorical data with probabilistic distance

Space Structure and Clustering of Categorical Data.

Improving the Decision Value of Hierarchical Text Clustering Using Term Overlap Detection

Coupled attribute similarity learning on categorical data.

Evaluation of selected approaches to clustering categorical variables

Adaptation of Periurban Cattle Production Systems to Environmental Changes: Feeding Strategies of Herdsmen in Southern Benin

Art Video Games

Deep supervised, but not unsupervised, models may explain IT cortical representation.

Maintenance of youth-like processing protects against false memory in later adulthood
