Abstract

Document analysis tasks for which representative labeled training samples are available have been largely solved. The next frontier is coping with hitherto unseen formats, unusual typefaces, idiosyncratic handwriting and imperfect image acquisition. Adaptive and style-constrained classification methods can overcome some expected variability, but human intervention will remain necessary in many tasks. Interactive pattern recognition includes data exploration and active learning as well as access to stored documents. The principle of “green interaction” is to make use of every intervention to reduce the likelihood that the automated system will make the same mistake again and again. Some of these techniques may pop up in forthcoming personal camera-based memex-like applications that will have a far broader range of input documents and scene text than the current, successful but highly specialized, systems for patents, postal addresses, bank checks and books.


