Analysis of Optimal Data Preprocessing Methods for Delineating Precipitation Regions

Ug-Gi Kim, Chae-Young Lee, Myoung-Jin Um, Won-Sik Ahn

doi:10.9798/kosham.2012.12.5.233

Abstract

This study examined which data preprocessing method is best suited to cluster analysis for delineating precipitation regions in South Korea. Geographic and meteorological data from 75 stations operated by the Korea Meteorological Administration were used. Four preprocessing methods were applied: general normalization, modified normalization, standardization, and factor analysis. The preprocessed data were clustered with the K-means method, and the efficiency of each preprocessing method was evaluated with two cluster validity measures, the Dunn index and the Silhouette index. The number of clusters was increased from 3 to 9 in steps of one. In every case, the data preprocessed by factor analysis showed the best efficiency for cluster analysis. However, the analysis was somewhat insufficient for determining the optimal number of clusters.
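The workflow described in the abstract (preprocess, cluster with K-means, score with the Dunn and Silhouette indices for 3 to 9 clusters) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the station data are synthetic, the function names are hypothetical, and min-max scaling only stands in for a normalization step because the abstract does not define "general" or "modified" normalization. Factor analysis is omitted.

```python
import math
import random

def standardize(rows):
    """Z-score each column: (x - mean) / standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def min_max(rows):
    """Scale each column to [0, 1] (assumed stand-in for normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(x - lo) / sp for x, lo, sp in zip(r, lows, spans)] for r in rows]

def kmeans(rows, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    centers = random.Random(seed).sample(rows, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

def dunn_index(rows, labels):
    """Smallest between-cluster distance / largest within-cluster diameter."""
    pairs = [(math.dist(a, b), la == lb)
             for i, (a, la) in enumerate(zip(rows, labels))
             for b, lb in list(zip(rows, labels))[i + 1:]]
    inter = min(d for d, same in pairs if not same)
    intra = max(d for d, same in pairs if same)
    return inter / intra

def silhouette_index(rows, labels):
    """Mean of (b - a) / max(a, b) over all points; singletons score 0."""
    clusters = {}
    for r, l in zip(rows, labels):
        clusters.setdefault(l, []).append(r)
    scores = []
    for r, l in zip(rows, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(r, o) for o in own) / (len(own) - 1)
        b = min(sum(math.dist(r, o) for o in c) / len(c)
                for m, c in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Synthetic "stations": [elevation in m, annual precipitation in mm].
rng = random.Random(42)
blobs = [(50.0, 1200.0), (400.0, 1500.0), (800.0, 1000.0)]
stations = [[cx + rng.gauss(0, 10), cy + rng.gauss(0, 20)]
            for cx, cy in blobs for _ in range(10)]

# Score every preprocessing method for cluster counts 3 through 9.
results = {}
for name, prep in (("standardization", standardize), ("min-max", min_max)):
    data = prep(stations)
    for k in range(3, 10):
        labels = kmeans(data, k)
        results[(name, k)] = (dunn_index(data, labels),
                              silhouette_index(data, labels))

for (name, k), (dunn, sil) in results.items():
    print(f"{name:16s} k={k}  Dunn={dunn:.3f}  Silhouette={sil:.3f}")
```

For both indices a higher value indicates more compact and better separated clusters, so the preprocessing methods can be compared row by row at each cluster count. Because K-means depends on its initial centers, a real comparison would likely repeat each run with several seeds.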
