In the recent past, different kernelized versions of c-means (hard and fuzzy) clustering algorithms have been proposed. Here, we focus on kernel clustering of only object data, X = {x_1, ..., x_n} ⊂ R^p. We first raise a basic question: Should we really cluster any given object data in the kernel space? The answer is NO! Here is our line of argument: 1) The objective of any clustering algorithm is to find natural subgroups in X, where the subgroups are defined by a measure of similarity between the vectors in X. 2) If we transform the data X into Y in another space by a nonlinear transformation and try to find clusters in Y, then such clusters can be useful if and only if Y helps us to find the same clusters that are present in X, because that is our objective. 3) If Y maintains the same structure/topology as that of X, then the use of Y may not give any advantage. 4) On the other hand, if Y changes the structure (i.e., imposes a new structure) on the data and that change makes the extraction of the desired clusters present in X easier, then clustering of Y is useful. 5) But when Y imposes new (nonexistent) structures, the clustering algorithm may find very strange clusters with no relation to the actual clusters present in X. 6) Thus, when we try to cluster in a transformed space, the issue is to know whether it could help us to find the clusters present in X. To get any benefit from kernel clustering (or clustering in any other transformed space), we need to answer this question first; otherwise, we may find completely irrelevant clusters without knowing it, thereby making kernel clustering useless. 7) This issue is a philosophical one and depends neither on the choice of clustering algorithm nor on the particular transformation (kernel function) used. 8) Except for 2-D/3-D data, we do not know of any way to answer the question in 6), and for 2-D/3-D data, since we can look at the data, we do not need kernel clustering. Therefore, there is no benefit from kernel clustering. We demonstrate and justify our claims using both synthetic and real datasets, with visual assessment as well as with normalized mutual information, adjusted Rand index, and cluster instability. We propose to use Sammon's nonlinear projection method to get a crude visual representation of the data in the kernel space. We discuss the issue of how to choose appropriate parameters of the kernel function, but we could not provide a solution to this problem. Finally, we discuss how the kernel parameters and the algorithmic parameters interact.
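The most concrete methodological proposal in the abstract is to inspect the kernel-space structure visually via Sammon's nonlinear projection. The sketch below illustrates one way this can be done: pairwise distances between the implicit feature-space images φ(x_i) are obtained with the kernel trick, ||φ(x_i) − φ(x_j)||² = K(x_i, x_i) + K(x_j, x_j) − 2K(x_i, x_j), and the resulting distance matrix is mapped to 2-D by minimizing Sammon's stress. This is a minimal illustrative sketch, not the authors' implementation: the choice of an RBF kernel, the value of gamma, the toy data, and the optimizer settings are all assumptions.

```python
# Minimal sketch (assumed, not the authors' code): kernel-space distances via the
# kernel trick, followed by a 2-D Sammon projection for visual inspection.
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics.pairwise import rbf_kernel

def kernel_space_distances(X, gamma=1.0):
    """Pairwise Euclidean distances between phi(x_i) in the RBF-induced feature space."""
    K = rbf_kernel(X, gamma=gamma)                      # gamma is an illustrative assumption
    diag = np.diag(K)
    D2 = diag[:, None] + diag[None, :] - 2.0 * K        # ||phi(x_i) - phi(x_j)||^2
    return np.sqrt(np.maximum(D2, 0.0))

def sammon(D, n_components=2, seed=0, max_iter=500):
    """Sammon's nonlinear mapping of a precomputed distance matrix D to n_components dimensions."""
    n = D.shape[0]
    iu = np.triu_indices(n, k=1)
    Dstar = np.maximum(D[iu], 1e-12)                    # kernel-space distances (upper triangle)
    c = Dstar.sum()
    rng = np.random.default_rng(seed)
    y0 = 1e-2 * rng.standard_normal((n, n_components))  # small random initial layout

    def stress_and_grad(y_flat):
        Y = y_flat.reshape(n, n_components)
        diff = Y[:, None, :] - Y[None, :, :]            # pairwise differences, shape (n, n, p)
        d = np.sqrt((diff ** 2).sum(-1))                # low-dimensional distances
        d_iu = np.maximum(d[iu], 1e-12)
        stress = ((Dstar - d_iu) ** 2 / Dstar).sum() / c
        W = np.zeros_like(d)
        W[iu] = (Dstar - d_iu) / (Dstar * d_iu)
        W = W + W.T
        grad = (-2.0 / c) * (W[:, :, None] * diff).sum(axis=1)
        return stress, grad.ravel()

    res = minimize(stress_and_grad, y0.ravel(), jac=True,
                   method="L-BFGS-B", options={"maxiter": max_iter})
    return res.x.reshape(n, n_components)

if __name__ == "__main__":
    # Toy usage: two well-separated Gaussian clusters in R^4, projected to 2-D
    # via their kernel-space distances; the result can be scatter-plotted.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 0.3, (50, 4)), rng.normal(2.0, 0.3, (50, 4))])
    Y2 = sammon(kernel_space_distances(X, gamma=0.5))
    print(Y2.shape)                                      # (100, 2)
```

Because the distances are computed entirely from kernel evaluations, no explicit feature map is needed, so the same projection can be drawn for any positive-definite kernel and parameter setting one wishes to inspect.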