Impact of online handwriting recognition performance on text categorization

Sebastián Peña Saldarriaga, Christian Viard-Gaudin, Emmanuel Morin

doi:10.1007/s10032-009-0108-6

Abstract

Today, there is an increasing demand for efficient archival and retrieval methods for online handwritten data. For such tasks, text categorization is of particular interest. The textual data available in online documents can be extracted through online handwriting recognition; however, this process introduces errors into the resulting text. This work reports experiments on the categorization of online handwritten documents based on their textual content. We analyze the effect of word recognition errors on categorization performance by comparing the performance of a categorization system on texts obtained through online handwriting recognition with its performance on the same texts available as ground truth. Two well-known categorization algorithms (kNN and SVM) are compared. For this study, a subset of the Reuters-21578 corpus consisting of more than 2,000 handwritten documents was collected. Results show that the loss in classification rate is not significant, and that the loss in precision is significant only for recall values of 60–80%, depending on the noise level.
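The comparison described in the abstract (the same classifier evaluated on ground-truth text and on error-laden text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy documents, the function names, the plain term-frequency kNN with cosine similarity, and in particular the `corrupt` noise model, which replaces words with an out-of-vocabulary token instead of running a real handwriting recognizer. The SVM side of the comparison is not shown.

```python
import math
import random
from collections import Counter

def bag_of_words(text):
    """Term-frequency vector of a whitespace-tokenized, lowercased text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, text, k=3):
    """Majority label among the k training documents most similar to text."""
    query = bag_of_words(text)
    ranked = sorted(train,
                    key=lambda item: cosine(bag_of_words(item[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def corrupt(text, word_error_rate, rng):
    """Hypothetical noise model: each word is independently replaced by an
    out-of-vocabulary token with probability word_error_rate."""
    return " ".join("<err>" if rng.random() < word_error_rate else word
                    for word in text.split())

def accuracy(train, test):
    """Fraction of test documents whose predicted label is correct."""
    hits = sum(knn_predict(train, text) == label for text, label in test)
    return hits / len(test)

# Toy documents invented for this sketch (not from Reuters-21578).
TRAIN = [
    ("wheat harvest grain export tonnes", "grain"),
    ("corn grain shipment farmers crop", "grain"),
    ("grain prices wheat crop report", "grain"),
    ("crude oil barrel opec output", "crude"),
    ("oil prices crude refinery barrel", "crude"),
    ("opec crude production oil quota", "crude"),
]
TEST = [
    ("wheat crop export report", "grain"),
    ("opec oil output quota", "crude"),
]

# Evaluate the same classifier on clean text and on increasingly noisy text.
rng = random.Random(0)
for rate in (0.0, 0.2, 0.5):
    noisy_test = [(corrupt(text, rate, rng), label) for text, label in TEST]
    print(f"word error rate {rate:.1f}: accuracy {accuracy(TRAIN, noisy_test):.2f}")
```

Because the classifier and training set are held fixed, any drop in accuracy between the clean run and a noisy run is attributable to the simulated recognition errors alone, which is the kind of isolation the study's design aims for.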

Journal: International Journal on Document Analysis and Recognition (IJDAR)
Publication Date: Jan 16, 2010
Citations: 44

Similar Papers

On-line handwritten text categorization
Sebastián Peña Saldarriaga ... Emmanuel Morin
18 Jan 2009

Categorization of On-Line Handwritten Documents
Sebastián Peña Saldarriaga ... Emmanuel Morin
01 Sep 2008

A hybrid approach for text categorization by using χ² statistic, principal component analysis and particle swarm optimization

Scientific Research and Essays | VOL. 8
04 Oct 2013

Using top n Recognition Candidates to Categorize On-line Handwritten Documents
Sebastián Peña Saldarriaga ... Christian Viard-Gaudin
01 Jan 2009
