Abstract
The delineation of precipitation regions aims to identify homogeneous zones in which the characteristics of the process are statistically similar. The regionalization process has three main components: (i) delineation of regions using clustering algorithms, (ii) determination of the optimal number of regions using cluster validity indices (CVIs), and (iii) validation of the regions for homogeneity using the L-moment ratio test. The choice of the number of clusters significantly affects the homogeneity of the regions. The objective of this study is to investigate how well various CVIs identify the optimal number of clusters, that is, the number that maximizes the homogeneity of the precipitation regions. The k-means clustering algorithm is adopted to delineate the regions using location-based attributes for two large areas of Canada, namely the Prairies and the Great Lakes-St Lawrence lowlands (GL-SL) region. Seasonal precipitation data for 55 years (1951–2005) are derived from the high-resolution ANUSPLIN gridded point data for Canada. The results indicate that both the optimal number of clusters and the regional homogeneity depend on the CVI adopted. Of the 42 cluster validity indices considered, 15 outperform the others in identifying homogeneous precipitation regions. The Dunn, Det_ratio and $\mathrm{Trace}(W^{-1}B)$ indices are found to be the best for all seasons in both regions.
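The first two components above (clustering, then choosing the number of clusters with a CVI) can be illustrated with a minimal sketch. It is not the study's implementation: it uses toy 2-D coordinates instead of the location-based attributes, only the Dunn index out of the 42 CVIs, and it omits the L-moment homogeneity test entirely. The function names (`farthest_point_init`, `kmeans`, `dunn_index`, `best_k_by_dunn`) are hypothetical, and the deterministic farthest-point initialization is an assumption made so the example is reproducible. The Dunn index is taken here in its common form: the smallest distance between points in different clusters divided by the largest cluster diameter, with larger values indicating compact, well-separated clusters.

```python
import math


def farthest_point_init(points, k):
    """Deterministic initial centers (an assumption, not the paper's choice):
    start at the first point, then repeatedly add the point farthest from
    the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def dunn_index(points, labels):
    """Minimum between-cluster distance divided by maximum cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    if len(groups) < 2:
        raise ValueError("Dunn index needs at least two non-empty clusters")
    max_diameter = max(math.dist(a, b) for g in groups for a in g for b in g)
    min_separation = min(
        math.dist(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return math.inf if max_diameter == 0 else min_separation / max_diameter


def best_k_by_dunn(points, k_values):
    """Score each candidate k and return the k with the largest Dunn index."""
    scores = {k: dunn_index(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


# Toy data (illustrative only): three compact 3x3 grids of "stations".
offsets = (-0.5, 0.0, 0.5)
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    for dx in offsets
    for dy in offsets
]
best_k, scores = best_k_by_dunn(points, range(2, 6))
print(best_k, scores)
```

On this toy layout the Dunn index peaks at the number of grids actually present. In a regionalization workflow, each candidate partition would additionally be checked for homogeneity before a value of k is accepted, which is the step this sketch leaves out.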
Highlights
In hydro-climatology studies, a reliable estimate of precipitation is useful in planning, design, and management of urban water infrastructure, integrated watershed management, and analysis of extremes
The main objective of this study is to evaluate how well cluster validity indices improve the homogeneity of the delineated precipitation regions
The Digital Elevation Model over the study regions is obtained from the Canadian Digital Elevation Data (CDED), which can be accessed from the Canadian GeoBase website
Summary
In hydro-climatology studies, a reliable estimate of precipitation is useful in planning, design, and management of urban water infrastructure, integrated watershed management, and analysis of extremes. The classical statistical methods [1,2] used to model complex precipitation processes fail to capture the spatial statistics of the region [3]. These methods inherently assume that the process is Gaussian, which does not hold in many practical applications [4]. In contrast, data mining methods such as clustering, Principal Component Analysis, and multisite bootstrap are able to capture the complex characteristics of hydroclimatic variables such as precipitation without any underlying assumptions about the process [5].