Abstract

The bulk of the documents that affect our lives are digital or born digital. Our laborious investigations of layout, script, font and graphics are turning into mere exercises with little influence on pursuits outside the Document Analysis and Recognition (DAR) community. Recent performance improvements on such tasks, even if based on deep learning and AI, are as much the result of advances in computer hardware as of breakthroughs in document research. It is time to automate tasks beyond transcription. This Commentary addresses our mission, our approach to some technical issues, and the role of AI in DAR. Opportunities for a wider role for document analysis include more pervasive application of statistical decision theory; integrated genre analysis, summarization, interpretation and information extraction; bolder goals in content analysis; and alternative modalities, induced by the open-source movement, for sharing research results. Importantly, expanding the scope of our research incurs increased responsibility for retaining human prerogatives in critical decision making and preserving essential human skills like good writing and discriminative reading.
