Text Region Extraction from Web Images Using Variance Maps

In-Sook Jung, Il-Seok Oh

doi:10.5392/jkca.2009.9.9.068

Abstract

A variance map exploits the fact that text regions show strong color or intensity variation relative to their surroundings. It is therefore well suited to extracting text regions from Web images, whose resolution is often low or inconsistent because of repeated format conversion. Previous variance-map methods, however, work at a single level: they use a global variance map that applies a fixed mask once over the whole input image. As a result, their extraction performance is unstable when the text is very small or very large, when stroke colors have a gradation effect, or when the slant, orientation, position, and color of the text are complex.

This paper proposes a method that robustly extracts text regions from complex color Web images using two-level variance maps. A global and a local variance map are applied at the respective levels, and the two work hierarchically. The first level aims at high recall: it applies global horizontal and vertical color variance maps, with a fixed mask size chosen so that both sufficiently large and small characters are captured, to find the approximate locations of text regions (candidate text regions). The second level then applies an intensity variance map, with a new local mask size determined adaptively, to each connected-component region from the first level and extracts the final text regions. This second pass removes much of the background remaining in the candidate regions, which raises precision. Applied to 400 Web images, the method extracted text regions relatively stably even against complex backgrounds.
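The two-level procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the mask sizes, the thresholds, how the horizontal and vertical maps are combined, or how the local mask size is adapted. The function names, the parameters `global_size`, `t1`, and `t2`, the summing of the two directional variances, and the "quarter of the shorter box side" rule are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def window_variance(channel, size, axis):
    """Variance over a 1-D window of length `size` centred on each pixel."""
    pad = size // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, size - 1 - pad)
    padded = np.pad(channel.astype(np.float64), pad_width, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return windows.var(axis=-1)


def color_variance_map(image, size):
    """Horizontal plus vertical window variance, summed over color channels.

    Summing the two directions is an assumption; the abstract only says that
    horizontal and vertical color variance maps are used.
    """
    total = np.zeros(image.shape[:2])
    for c in range(image.shape[2]):
        total += window_variance(image[..., c], size, axis=1)  # horizontal
        total += window_variance(image[..., c], size, axis=0)  # vertical
    return total


def connected_boxes(mask):
    """Bounding boxes (top, bottom, left, right) of 4-connected regions."""
    height, width = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for y, x in zip(*np.nonzero(mask)):
        if seen[y, x]:
            continue
        seen[y, x] = True
        queue = deque([(y, x)])
        top, bottom, left, right = y, y, x, x
        while queue:
            cy, cx = queue.popleft()
            top, bottom = min(top, cy), max(bottom, cy)
            left, right = min(left, cx), max(right, cx)
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if 0 <= ny < height and 0 <= nx < width and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        boxes.append((top, bottom + 1, left, right + 1))
    return boxes


def two_level_text_mask(image, global_size=9, t1=100.0, t2=100.0):
    """Level 1: global color variance; level 2: local intensity variance."""
    # Level 1: candidate text pixels from the global color variance map.
    candidates = color_variance_map(image, global_size) > t1
    gray = image.astype(np.float64).mean(axis=2)
    result = np.zeros(candidates.shape, dtype=bool)
    for top, bottom, left, right in connected_boxes(candidates):
        # Hypothetical adaptive rule: a quarter of the shorter box side.
        local_size = max(3, min(bottom - top, right - left) // 4)
        region = gray[top:bottom, left:right]
        local_var = (window_variance(region, local_size, axis=1)
                     + window_variance(region, local_size, axis=0))
        # Level 2 keeps only candidate pixels that also pass the local test.
        result[top:bottom, left:right] |= (local_var > t2) & candidates[top:bottom, left:right]
    return result
```

Calling `two_level_text_mask(image)` on an `H x W x 3` array returns a boolean `H x W` mask. Because the second level only ever removes pixels from the first-level candidates, it can raise precision but not recall, which matches the division of labor the abstract describes.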

Journal: The Journal of the Korea Contents Association
