Abstract

Segmenting characters in an image is a classic yet challenging task in computer vision. Correctly determining the boundaries of adhesive characters with various scales and shapes is essential for character segmentation, especially for separating handwritten characters. Nevertheless, few works in the literature achieve satisfactory performance. In this article, by leveraging the ability of deep neural networks, we propose a two-stage character segmentation network with two-stream attention and edge refinement (TSER) to tackle this problem. TSER first locates every character by object detection, then extracts the corresponding contours. In the process, a novel two-stream attention mechanism (TSAM) is proposed to make the network focus more on the discrepancy of character boundaries. Furthermore, a novel anchor generation method is used to dynamically generate anchors on different feature levels to improve the model's sensitivity to the shapes and scales of characters. Finally, a cascaded edge refinement network is used to obtain the contour of each character. To demonstrate the efficiency and generalization ability of our model, we compared TSER with traditional algorithms and other deep learning models on two commonly used datasets in different segmentation tasks. The comparative results indicate that TSER reaches state-of-the-art performance.

Highlights

  • Character segmentation is a vital step in the traditional optical text recognition process

  • We propose a novel character segmentation network with two-stream attention and edge refinement (TSER) that segments characters from text lines in images

  • TSER has four major contributions: 1) A novel two-stage segmentation network focusing on character segmentation task was proposed


Summary

INTRODUCTION

Character segmentation is a vital step in the traditional optical text recognition process. We propose a two-stage character segmentation network that can accurately segment characters from text line images with random noise under various conditions: normal spacing, subtle spacing, adhesive characters, partially overlapping characters, characters with deflection angles, and characters with different scales and shapes. A two-stream attention mechanism is proposed to guide the feature selection process of the model. This attention mechanism helps distinguish the boundaries of adhesive characters and helps find every character instance. A guided anchoring method is applied to produce sparse and appropriate anchors instead of the dense and redundant ones in a traditional region proposal network. This module can reduce computational cost and generate anchors that fit various character shapes and scales.
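The contrast between dense and sparse anchors can be illustrated with a minimal sketch. It is not the TSER implementation, whose details are not given in this summary; it only follows the general idea of guided anchoring, assuming a per-cell location score and a per-cell predicted width and height are already available. The function names (`dense_anchors`, `sparse_anchors`), the `threshold` parameter, and the toy inputs are all hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def dense_anchors(height: int, width: int, stride: int,
                  sizes: Sequence[float], ratios: Sequence[float]) -> List[Box]:
    """RPN-style tiling: every size/ratio pair at every feature-map cell."""
    boxes: List[Box] = []
    for row in range(height):
        for col in range(width):
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
            for size in sizes:
                for ratio in ratios:  # ratio is height / width
                    w = size / ratio ** 0.5
                    h = size * ratio ** 0.5
                    boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def sparse_anchors(loc_prob: Sequence[Sequence[float]],
                   shape_wh: Sequence[Sequence[Tuple[float, float]]],
                   stride: int, threshold: float = 0.5) -> List[Box]:
    """Keep one anchor per cell whose location score reaches the threshold.

    Assumption: loc_prob and shape_wh stand in for the outputs of learned
    prediction heads; here they are plain nested lists.
    """
    boxes: List[Box] = []
    for row, prob_row in enumerate(loc_prob):
        for col, prob in enumerate(prob_row):
            if prob < threshold:
                continue  # skip cells unlikely to contain a character center
            w, h = shape_wh[row][col]
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


# Toy 2x3 feature map with stride 8 (made-up values for illustration).
loc_prob = [[0.1, 0.9, 0.2],
            [0.7, 0.05, 0.3]]
shape_wh = [[(10.0, 20.0)] * 3 for _ in range(2)]

dense = dense_anchors(2, 3, 8, sizes=[16, 32], ratios=[0.5, 1.0, 2.0])
sparse = sparse_anchors(loc_prob, shape_wh, stride=8)
print(len(dense), len(sparse))
```

On this toy input the dense tiling yields one anchor per cell, size, and ratio, while the sparse variant keeps only the cells with a high location score, each with a single predicted shape. That is the source of the reduced computational cost the paragraph above refers to.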

RELATED WORK
PROBLEM DEFINITION
OVERALL NETWORK ARCHITECTURE
Calculating area of Bc
OTHER REMEDIES
MODEL TRAINING
EXPERIMENTS
Method
ABLATION STUDY
TIME COST Training
CONCLUSIONS