Abstract

Clustering is an important stage in many data mining applications. In this process, data elements are grouped according to their similarities. One of the best-known clustering algorithms is k-means, which takes the number of clusters as an input parameter and runs iteratively. Like many other image processing applications, remote sensing image processing applications usually need a clustering stage. With the development of multispectral sensor and laser technologies, remote sensing images provide more information about the environment. In the dataset used in this paper, infrared (IR) values and digital surface maps (DSM) are supplied in addition to the red (R), green (G), and blue (B) color values of the pixels. However, remote sensing images are very large (6000 $\times$ 6000 pixels for each image in the dataset used). Clustering such large images directly on their multiple attributes takes too much time. Several studies in the literature accelerate the k-means algorithm. One of them is the normalized distance value (NDV)-based fast k-means algorithm, which benefits from the speed of the histogram-based approach while using the multiple attributes of the pixels. In this paper, we evaluate the effects of these attributes on the correctness of the clustering process under different color space transformations and distance measurements. We report the results as peak signal-to-noise ratio and structural similarity index values, computed separately against two different types of reference data (the source images and the ground-truth images). Finally, we report accuracy-based results to evaluate both the success of the clustering outputs and the reliability of the NDV-based measurement methods presented in this paper.

Highlights

  • In machine vision systems, extracting meaningful information from digital images obtained by spectral sensors is an important and central task for researchers.

  • The results show that the method gives more accurate clustering results than the gray-level histogram.

  • We evaluate the effects of the attributes of remote sensing image pixels and of the distance norms on k-means clustering performance, using different combinations of attributes and color transformations to compute the normalized distance value (NDV).

Summary

Introduction

In machine vision systems, extracting meaningful information from digital images obtained by spectral sensors is an important and central task for researchers. Remote sensing images are usually very large and contain an enormous number of pixels. For this reason, clustering remote sensing images in an acceptable time requires a fast approach such as the histogram-based k-means method. A reference attribute vector, such as the mean attribute vector of the image to be clustered, is chosen as a center, and a distance value (for example, the Euclidean distance) is calculated from each element to that center separately. We evaluate the effects of the attributes of remote sensing image pixels and of the distance norms (distance measurements) on k-means clustering performance, using different combinations of attributes and color transformations to compute the NDVs. To evaluate the clustering performance, we first use peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) measurements with two different reference data for the resulting clusters [7]. The accuracy results then make it possible to assess the reliability of the NDV-based success measurement methods, in addition to providing another measure of the success of the clustering outputs.
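
The NDV idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of 256 histogram bins, the min-max normalization, and the evenly spaced initial centers are all assumptions made for illustration. Each pixel's attribute vector (for example R, G, B, IR, DSM) is reduced to a single normalized distance from the mean attribute vector, and k-means then runs on the histogram of those values rather than on every pixel.

```python
import numpy as np


def normalized_distance_values(pixels, bins=256):
    """Reduce each attribute vector to an integer NDV in [0, bins - 1].

    pixels: array of shape (n_pixels, n_attributes).
    Assumption: the reference center is the mean attribute vector and
    the distance norm is Euclidean.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    center = pixels.mean(axis=0)
    dist = np.linalg.norm(pixels - center, axis=1)
    span = dist.max() - dist.min()
    if span == 0:
        # All pixels are equally far from the center.
        return np.zeros(len(pixels), dtype=np.int64)
    scaled = (dist - dist.min()) / span * (bins - 1)
    return np.rint(scaled).astype(np.int64)


def histogram_kmeans(ndv, k, bins=256, iters=100):
    """1-D k-means over histogram levels, weighted by pixel counts.

    Returns (labels per pixel, cluster centers). The cost per iteration
    depends on `bins`, not on the number of pixels.
    """
    hist = np.bincount(ndv, minlength=bins).astype(np.float64)
    levels = np.arange(bins, dtype=np.float64)
    occupied = np.nonzero(hist)[0]
    # Assumed initialization: centers spread evenly over the occupied range.
    centers = np.linspace(occupied[0], occupied[-1], k)
    for _ in range(iters):
        assign = np.abs(levels[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            mask = assign == j
            weight = hist[mask].sum()
            if weight > 0:
                # Weighted mean of the levels assigned to cluster j.
                new_centers[j] = (hist[mask] * levels[mask]).sum() / weight
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    assign = np.abs(levels[:, None] - centers[None, :]).argmin(axis=1)
    # Look up each pixel's cluster through its NDV level.
    return assign[ndv], centers
```

For a 6000 $\times$ 6000 image, the attribute array would have 36 million rows, but the iterative part of the sketch only touches the 256 histogram levels, which is where the speed of the histogram-based approach comes from. Pixels with different attributes but the same distance from the center receive the same NDV, so the choice of attributes, color transformation, and distance norm directly affects how well clusters separate.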

Background
Color transformation
Gray-level color transformation
Methods
Comparison metrics
Accuracy
Clustering results
Conclusion