Robust Clustering Technique Research Articles

Because of the severity of extreme flood events, recent efforts have focused on developing reliable methods for design flood estimation. Historical streamflow series are the most reliable source of information for such estimation; however, they have temporal and spatial limitations that may be minimized by means of regional flood frequency analysis (RFFA). Several studies have emphasized that the identification of hydrologically homogeneous regions is the most important and challenging step in an RFFA. This study aims to identify state‐of‐the‐art clustering techniques (e.g., K‐means, partition around medoids, fuzzy C‐means, K‐harmonic means, and genetic K‐means) with potential to form hydrologically homogeneous regions for flood regionalization in Southern Brazil. The applicability of several probability density functions, such as generalized extreme value, generalized logistic, generalized normal, and Pearson type 3, was evaluated based on the regions formed. Among all the 15 possible combinations of the aforementioned clustering techniques and the Euclidean, Mahalanobis, and Manhattan distance measures, the five best were selected. Several physiographic and climatological attributes of the watersheds were chosen to derive multiple regression equations for all the combinations. The accuracy of the equations was quantified with respect to the adjusted coefficient of determination, root mean square error, and Nash–Sutcliffe coefficient, and a cross‐validation procedure was applied to check their reliability. It was concluded that reliable results were obtained when using robust clustering techniques based on fuzzy logic (e.g., K‐harmonic means), which have not been commonly used in RFFA. Furthermore, the probability density functions were capable of representing the regional annual maximum streamflows. Drainage area, main river length, and mean altitude of the watershed were the most recurrent attributes for modelling mean annual maximum streamflow. Finally, an integration of all the five best combinations stands out as a robust, reliable, and simple tool for estimation of design floods.
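
The three accuracy measures named in the abstract above can be illustrated with a minimal sketch. It is not the study's code: the function names are hypothetical, and the adjusted coefficient of determination is computed here under the assumption that R² is taken as 1 − SSE/SST on the fitted data.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient: 1 minus SSE over variance about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def adjusted_r2(observed, simulated, n_predictors):
    """Adjusted R^2, assuming R^2 = 1 - SSE/SST (an assumption of this sketch)."""
    n = len(observed)
    r2 = nash_sutcliffe(observed, simulated)  # same formula under the assumption above
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Illustrative numbers only, not data from the study.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.5, 2.5, 3.5, 4.5, 5.5]
print(rmse(obs, sim), nash_sutcliffe(obs, sim), adjusted_r2(obs, sim, 1))
```

A Nash–Sutcliffe coefficient of 1 indicates a perfect match, while values at or below 0 indicate that the observed mean predicts as well as or better than the regression equation.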

Finding meaningful clustering patterns in data can be very challenging when the clusters have arbitrary shapes, different sizes, or different densities, and especially when the data set contains a high percentage (e.g., 80%) of noise. Unfortunately, most existing clustering techniques cannot properly handle this situation, and their performance often deteriorates dramatically. In this paper, a purposefully designed clustering algorithm called Density-Based Multiscale Analysis for Clustering (DBMAC)-II is proposed, which is an improved version of the latest strong-noise clustering algorithm DBMAC. DBMAC assumes that all clusters are homogeneous and does not work well when the data set contains clusters of varying densities. DBMAC-II overcomes this limitation by executing the multiscale analysis iteratively and can conduct strong noise-robust clustering without any strict assumption on the shapes and densities of clusters. In DBMAC-II, as in DBMAC, each data point or object is mapped into a feature space using its $r$-neighborhood statistics with different $r$ (radius) values. In general, the higher the value of the $r$-neighborhood statistics, the more likely the object is to be considered a “clustered” object. Instead of trying to find a single optimal $r$ value, a set of radius values appropriate for separating “clustered” objects from “noisy” objects is identified using a formal statistical test for multimodality; this step is referred to as multiscale analysis. For clusters with varying densities, multiscale analysis is applied iteratively to extract the clusters with the highest density from the current data set. Moreover, a statistical uniformity test for measuring clustering tendency is used as the self-adaptive stopping criterion of the iteration. Comprehensive experimental studies on a series of challenging benchmark data sets demonstrate that DBMAC-II not only is superior to classical density-based clustering approaches, including DBSCAN, OPTICS, and HDBSCAN, but also consistently outperforms the latest strong-noise robust clustering techniques, such as Skinny-dip.
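
The mapping of each object to $r$-neighborhood statistics over several radii can be sketched as follows. This is not DBMAC-II itself: the abstract does not specify the statistic, so the sketch assumes it is a plain count of other points within distance $r$, and the function name `r_neighborhood_counts` is hypothetical. The multimodality and uniformity tests are not shown.

```python
import math


def r_neighborhood_counts(points, radii):
    """Map each point to a feature vector of neighbor counts, one entry per radius.

    Assumption of this sketch: the r-neighborhood statistic is the number of
    other points lying within Euclidean distance r of the point.
    """
    features = []
    for i, p in enumerate(points):
        row = []
        for r in radii:
            count = sum(
                1
                for j, q in enumerate(points)
                if j != i and math.dist(p, q) <= r
            )
            row.append(count)
        features.append(row)
    return features


# Three nearby points and one isolated point (illustrative data only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
for point, feature in zip(points, r_neighborhood_counts(points, [1.0, 2.0])):
    print(point, feature)
```

In this toy data the isolated point has a count of zero at both radii, which matches the abstract's intuition that low neighborhood statistics suggest a "noisy" object.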

Articles published on Robust Clustering Technique

Robust clustering based on trimming

Clustering and Financial Performance Analysis of Indonesian Coal Mining Industry Stock Prices

Semi-Supervised Clustering-Based DANA Algorithm for Data Gathering and Disease Detection in Healthcare Wireless Sensor Networks (WSN).

Development of a Computerized Diagnostic System for Brain MRI Tumor Scanning Using a Robust Information Clustering Technique

Improved space breakdown method - A robust clustering technique for spike sorting.

A Kemeny Distance-Based Robust Fuzzy Clustering for Preference Data

A fuzzy clustering technique for enhancing the convergence performance by using improved Fuzzy c-means and Particle Swarm Optimization algorithms

Capturing the multidimensionality of motivation in physical education: A self-organizing maps approach to profiling students

A Robust Tensor-Based Submodule Clustering for Imaging Data Using $l_{1/2}$ Regularization and Simultaneous Noise Recovery via Sparse and Low Rank Decomposition Approach.

Clustering Accelerometer Activity Patterns from the UK Biobank Cohort.

Robust clustering for assessing the spatiotemporal variability of groundwater quantity and quality

Noise robust image clustering based on reweighted low rank tensor approximation and $l_{1/2}$ regularization

DeepPipes: Learning 3D pipelines reconstruction from point clouds

Detection of Auction Fraud in Commercial Sites

Distributed robust data clustering in wireless sensor networks using diffusion moth flame optimization

Application of self-organizing map and fuzzy c-mean techniques for rockburst clustering in deep underground projects

Artificial intelligence for identifying hydrologically homogeneous regions: A state‐of‐the‐art regional flood frequency analysis

Effective Kernel-Based Fuzzy Clustering Systems in Analyzing Cancer Database

High Dimensional Cluster Analysis Using Path Lengths

Density-Based Multiscale Analysis for Clustering in Strong Noise Settings With Varying Densities
