Abstract
The identification of significant underlying data patterns, such as image composition and spatial arrangements, is fundamental in remote sensing tasks, so an effective approach to information extraction is crucial. The affinity propagation (AP) algorithm is a novel and powerful technique capable of handling unusual data containing both categorical and numerical attributes. However, AP has limitations related to the choice of the initial preference parameter, the occurrence of oscillations, and the processing of large data sets. This paper evaluates the clustering performance of the AP algorithm, taking into account the influence of the preference parameter and the damping factor. The study considered the AP algorithm, adaptive AP, and partition AP. According to the experiments, the choice of preference and damping greatly influences the quality and the final number of clusters.
Highlights
Data clustering is one of the fundamental tasks in remote sensing, used for information extraction and classification purposes (Dermoudy et al 2009)
Each point C_i ∈ C is transformed into a point C_i' ∈ R^k with coordinates {Y_im, m ∈ {1,...,k}}; affinity propagation clustering is then applied to the transformed set of points C' to calculate the evidence, and each point C_i is assigned to the group to which its corresponding transformed point C_i' is assigned
According to the results using simulated data (Table 2), the minimum preference value for the affinity propagation (AP) algorithm yielded 5 clusters in 376.912 seconds, while the median preference yielded 7 clusters in 126.177 seconds
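The dependence of the cluster count on the preference value can be explored with a minimal NumPy sketch of AP message passing. This is not the authors' implementation: the function name `affinity_propagation`, the simulated three-blob data, the negative squared Euclidean similarity, and the default damping of 0.9 are all assumptions chosen for illustration, and the sketch makes no attempt to reproduce the timings or cluster counts in Table 2.

```python
import numpy as np

def affinity_propagation(S, preference, damping=0.9, max_iter=200, convergence_iter=15):
    """Illustrative AP sketch: return (exemplar_indices, labels) for similarity matrix S."""
    S = np.array(S, dtype=float)          # work on a copy
    n = S.shape[0]
    np.fill_diagonal(S, preference)       # preference goes on the diagonal s(k, k)
    R = np.zeros((n, n))                  # responsibilities r(i, k)
    A = np.zeros((n, n))                  # availabilities a(i, k)
    rows = np.arange(n)
    previous, stable = None, 0
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')], then damped.
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k))), then damped.
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()    # a(k,k) is not clipped at zero
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new

        # Stop once the exemplar set has been unchanged for convergence_iter rounds.
        current = tuple(np.flatnonzero((A + R).diagonal() > 0))
        stable = stable + 1 if current == previous else 0
        previous = current
        if current and stable >= convergence_iter:
            break
    exemplars = np.array(previous, dtype=int)
    if exemplars.size == 0:               # no exemplar emerged
        return exemplars, np.full(n, -1)
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels

# Hypothetical simulated data: three Gaussian blobs in the plane.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
off_diag = S[~np.eye(len(X), dtype=bool)]

for name, pref in [("minimum", off_diag.min()), ("median", np.median(off_diag))]:
    exemplars, labels = affinity_propagation(S, pref)
    print(name, "preference ->", len(exemplars), "clusters")
```

Lower preferences generally favour fewer exemplars. At the other extreme, a preference of 0 combined with strictly negative off-diagonal similarities makes every point its own exemplar in this sketch. The `damping` argument controls how strongly each update is blended with the previous messages, which is the usual remedy for the oscillations mentioned in the abstract.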
Summary
Data clustering is one of the fundamental tasks in remote sensing, used for information extraction and classification purposes (Dermoudy et al 2009). In remote sensing, clustering algorithms can be applied in unsupervised classification to divide multispectral and hyperspectral spaces for the extraction of patterns associated with land-cover classes (Dey et al 2010; Chehdi et al 2014). These algorithms can also be used as a pre-processing step before performing any classification task (Dermoudy et al 2009). Real-world applications are complex: most datasets are mixed, containing both numeric and categorical attributes, which causes the Euclidean distance function to fail in judging the similarity between two data points (Zhang and Gu 2014).
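To make the mixed-attribute problem concrete, the sketch below computes a Gower-style dissimilarity, one common way to compare records that mix numeric and categorical fields. It is an illustrative substitute, not necessarily the measure used in the cited works: the function name `mixed_dissimilarity`, the equal weighting of attributes, and the toy elevation and land-cover values are all assumptions.

```python
import numpy as np

def mixed_dissimilarity(numeric, categorical):
    """Gower-style pairwise dissimilarity in [0, 1] (illustrative sketch)."""
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)
    # Range-normalise so each numeric attribute contributes a value in [0, 1].
    span = numeric.max(axis=0) - numeric.min(axis=0)
    span[span == 0] = 1.0                 # constant columns contribute 0
    num_part = np.abs(numeric[:, None, :] - numeric[None, :, :]) / span
    # Categorical attributes contribute 0 on a match and 1 on a mismatch.
    cat_part = (categorical[:, None, :] != categorical[None, :, :]).astype(float)
    total = num_part.sum(axis=2) + cat_part.sum(axis=2)
    return total / (numeric.shape[1] + categorical.shape[1])

# Hypothetical records: one numeric attribute and one categorical attribute.
elevation = [[0.0], [10.0], [0.0]]
land_cover = [["forest"], ["forest"], ["water"]]
D = mixed_dissimilarity(elevation, land_cover)
print(D)
```

A similarity matrix for a clustering method such as AP could then be taken as `-D`, with the preference placed on its diagonal.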