Instance-level Constraints Research Articles

An integrated framework for density-based cluster analysis, outlier detection, and data visualization is introduced in this article. The main module consists of an algorithm to compute hierarchical estimates of the level sets of a density, following Hartigan’s classic model of density-contour clusters and trees. This algorithm generalizes and improves on existing density-based clustering techniques in several respects. It yields a complete clustering hierarchy composed of all possible density-based clusters following the nonparametric model adopted, for an infinite range of density thresholds. The resulting hierarchy can be easily processed to provide multiple ways of visualizing and exploring the data. It can also be further postprocessed so that: (i) a normalized score of “outlierness” can be assigned to each data object, which unifies both the global and local perspectives of outliers into a single definition; and (ii) a “flat” (i.e., nonhierarchical) clustering solution composed of clusters extracted from local cuts through the cluster tree (possibly corresponding to different density thresholds) can be obtained, either in an unsupervised or in a semisupervised way. In the unsupervised scenario, the algorithm corresponding to this postprocessing module provides a globally optimal solution to the formal problem of maximizing the overall stability of the extracted clusters. If partially labeled objects or instance-level constraints are provided by the user, the algorithm can solve the problem by considering both constraint violations/satisfactions and cluster stability criteria. An asymptotic complexity analysis, in terms of both running time and memory space, is described. Experiments are reported that involve a variety of synthetic and real datasets, including comparisons with state-of-the-art density-based clustering and (global and local) outlier detection methods.
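To make the notion of constraint violations/satisfactions concrete, the following is a minimal sketch, not the article's algorithm. It assumes instance-level constraints are given as must-link and cannot-link index pairs and scores a flat labeling by the fraction of pairs it satisfies. The function name `constraint_satisfaction` and the toy data are hypothetical, introduced only for illustration.

```python
from typing import Iterable, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data objects


def constraint_satisfaction(
    labels: Sequence[int],
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
) -> float:
    """Return the fraction of pairwise constraints a flat labeling satisfies.

    Illustrative assumption: every object has exactly one cluster label.
    """
    satisfied = 0
    total = 0
    # A must-link pair is satisfied when both objects share a cluster.
    for i, j in must_link:
        total += 1
        satisfied += labels[i] == labels[j]
    # A cannot-link pair is satisfied when the objects are separated.
    for i, j in cannot_link:
        total += 1
        satisfied += labels[i] != labels[j]
    # With no constraints, nothing is violated.
    return satisfied / total if total else 1.0


# Toy example: five objects in three clusters.
labels = [0, 0, 1, 1, 2]
must_link = [(0, 1), (2, 4)]
cannot_link = [(0, 2), (3, 4)]
score = constraint_satisfaction(labels, must_link, cannot_link)
print(score)  # 0.75: only the must-link pair (2, 4) is violated
```

A semisupervised selection step could, under these assumptions, compare candidate flat solutions by such a score and fall back on a stability criterion to break ties; how the article actually combines the two criteria is not specified in this abstract.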

In music information retrieval (MIR), an important research topic that has attracted much attention recently is the utilization of user-assigned tags, artist-related style, and mood labels, which can be extracted from music listening web sites such as Last.fm (http://www.last.fm/) and All Music Guide (http://www.allmusic.com/). A fundamental research problem in the area is how to understand the relationships among artists/songs and these different pieces of information. Co-clustering is the problem of simultaneously clustering two types of data (e.g., documents and words, or webpages and URLs). We can naturally bring this idea to the situation at hand and consider clustering artists and tags together, artists and styles together, or artists and mood labels together. Once such co-clustering has been successfully completed, one can identify co-existing clusters of artists and tags, styles, or mood labels (T/S/M). For simplicity, we use the acronym T/S/M to refer to tag(s), style(s), or mood(s) for the rest of the paper. When dealing with tags, it is worth noticing that some tags are more specific versions of others. This naturally suggests that the tags could be organized in hierarchical clusters. Such hierarchical organizations exist for styles and mood labels, so we will consider hierarchical co-clustering of artists and T/S/M. In this paper, we systematically study the application of hierarchical co-clustering (HCC) methods for organizing music data. There are two standard strategies for hierarchical clustering. One is the divisive strategy, in which we attempt to divide the input data set into smaller groups recursively, and the other is the agglomerative strategy, in which we attempt to combine initially separate data points into larger groups by finding the most closely related pair at each iteration. We will compare these two strategies against each other.
We apply a previously known divisive hierarchical co-clustering method and a novel agglomerative hierarchical co-clustering method. In addition, we demonstrate that these two methods are capable of incorporating instance-level constraints to achieve better performance. We perform experiments to show that these two hierarchical co-clustering methods can be effectively deployed for organizing music data and that they achieve reasonable clustering performance compared with other clustering methods. A case study is also conducted to show that HCC provides a new method for quantifying artist similarity.
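The agglomerative strategy described above (start from singletons, repeatedly merge the most closely related pair) can be sketched as follows. This is a simplified illustration on one type of data with single-linkage Euclidean distance, both of which are assumptions made here; it is not the co-clustering method of the paper, and the name `agglomerate` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Closeness is single linkage (assumed here): the smallest Euclidean
    distance between any member of one cluster and any member of the other.
    Returns the final clusters (lists of point indices) and the merge history.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []
    while len(clusters) > n_clusters:
        # Find the pair of clusters (a < b) with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: min(
                dist(points[i], points[j])
                for i in clusters[ab[0]]
                for j in clusters[ab[1]]
            ),
        )
        merges.append((list(clusters[a]), list(clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because b > a
    return clusters, merges


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
three, _ = agglomerate(points, 3)
two, history = agglomerate(points, 2)
print(three)  # [[0, 1], [2, 3], [4]]
print(two)    # [[0, 1, 2, 3], [4]]
```

The merge history is the hierarchy: reading it in order reproduces the tree from the leaves upward, whereas a divisive method would build the same kind of tree from the root downward by recursive splitting.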

Articles published on Instance-level Constraints

Expert-driven trace clustering with instance-level constraints

Evolutionary Active Constrained Clustering for Obstructive Sleep Apnea Analysis

Robust semi-supervised clustering with polyhedral and circular uncertainty

On the Use of Fuzzy Constraints in Semisupervised Clustering

Music Clustering With Features From Different Information Sources

Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection

Semi-Supervised Affinity Propagation with Soft Instance-Level Constraints.

Manifold Learning for Multivariate Variable-Length Sequences With an Application to Similarity Search.

Evolutionary K-Means with pair-wise constraints

CEVCLUS: evidential clustering with instance-level constraints for relational data

Constrained spectral embedding for K-way data clustering

Scatter/Gather Clustering: Flexibly Incorporating User Feedback to Steer Clustering Results.

Redistricting Using Constrained Polygonal Clustering

Constraint projections for semi-supervised affinity propagation

Hierarchical Co-Clustering: A New Way to Organize the Music Data

Spectral clustering: A semi-supervised approach

Effective semi-supervised document clustering via active learning with instance-level constraints

Learning Assignment Order of Instances for the Constrained K-Means Clustering Algorithm

Fuzzy Clustering and Aggregation of Relational Data With Instance-Level Constraints
