An Improved EM Algorithm Using a New Initial Value Selection Method

Sung-Soo Kim, Jee-Hye Kang

doi:10.5391/jkiis.2003.13.4.416

Abstract

This paper proposes a new method for choosing the initial values of the Expectation-Maximization (EM) algorithm, a clustering technique widely used in recognition-related areas of systems engineering. Conventionally, the initial values are chosen at random, which sometimes leads to convergence to an undesired local optimum. K-means clustering is now widely used to obtain better initial values and to compensate for the weakness of random selection, but it does not resolve the underlying problem: EM initialized with K-means can still converge to a local optimum. To address this, we propose an initialization method that partitions the data region uniformly and derives the initial values from the statistical characteristics of the data in each partition, and we apply it to the EM algorithm. The proposed method strengthens the characteristics of EM, substantially reducing the number of iterations and avoiding convergence to local optima. Its advantage is demonstrated by comparing simulation results of EM using the proposed initialization against those of conventional EM.
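The idea of partition-based initialization can be illustrated with a minimal sketch. The abstract does not specify the partitioning rule or which statistics are used, so everything below is an assumption made for illustration: the data are one-dimensional, "uniform partitioning" is read as equal-width bins over the data range, and each bin's sample fraction, mean, and variance serve as the initial weight, mean, and variance of one Gaussian component. The function names `partition_init` and `em_gmm_1d` are hypothetical and do not come from the paper.

```python
import math
import random


def partition_init(data, k):
    """Initial [weight, mean, variance] per component from k equal-width bins.

    Assumption: equal-width bins over the data range stand in for the
    paper's uniform partitioning; the actual rule may differ.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 if all points coincide
    global_mean = sum(data) / n
    global_var = sum((x - global_mean) ** 2 for x in data) / n or 1.0
    bins = [[] for _ in range(k)]
    for x in data:
        # Clamp so that x == hi lands in the last bin.
        bins[min(int((x - lo) / width), k - 1)].append(x)
    params = []
    for j, b in enumerate(bins):
        if len(b) >= 2:
            mean = sum(b) / len(b)
            var = sum((x - mean) ** 2 for x in b) / len(b)
        else:
            # Sparse bin: use the point (or bin center) and the global variance.
            mean = b[0] if b else lo + (j + 0.5) * width
            var = global_var
        var = max(var, 1e-6 * global_var)  # keep the variance strictly positive
        params.append([max(len(b), 1) / n, mean, var])
    total = sum(p[0] for p in params)
    for p in params:
        p[0] /= total  # weights must sum to 1
    return params


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, params, max_iter=200, tol=1e-6):
    """Standard EM for a 1-D Gaussian mixture; returns (params, iterations)."""
    n = len(data)
    prev_ll = -math.inf
    for it in range(1, max_iter + 1):
        # E-step: responsibilities under the current parameters.
        resp, ll = [], 0.0
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in params]
            s = sum(dens) or 1e-300
            ll += math.log(s)
            resp.append([d / s for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        new_params = []
        for j in range(len(params)):
            nj = sum(r[j] for r in resp) or 1e-300
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            new_params.append([nj / n, mean, max(var, 1e-9)])
        params = new_params
        if abs(ll - prev_ll) < tol:  # log-likelihood has stopped improving
            break
        prev_ll = ll
    return params, it


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(200)]
    sample += [random.gauss(10.0, 1.0) for _ in range(200)]
    fitted, iterations = em_gmm_1d(sample, partition_init(sample, 2))
    print(iterations, fitted)
```

Because the starting components are spread across the data range rather than drawn at random, EM begins from a deterministic and roughly representative configuration. Whether this reproduces the iteration savings reported in the paper depends on details the abstract does not give.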
