Visual language processing (VLP) of ancient manuscripts: Converting collections to windows on the past

Mohamed Cheriet, Reza Farrahi Moghaddam, Rachid Hedjam

doi:10.1109/ieeegcc.2013.6705813

Abstract

Ancient manuscripts are a primary carrier of cultural heritage, and they are being intensively digitized worldwide to ensure their preservation and, ultimately, wide access to their content. Critical to this research process are the legibility of the documents in image form and access to live texts. Several state-of-the-art methods have been proposed to address the challenges of processing these manuscripts. However, the huge amount of data involved, together with the high cost and scarcity of human expert feedback and reference data, calls for fundamental approaches that encompass all these aspects in an objective and tractable manner. In this paper, we propose one such approach: a novel framework for the computational pattern analysis of ancient manuscripts that is data-driven, multilevel, self-sustaining, and learning-based, and that takes advantage of the large quantities of unprocessed data available. Unlike many approaches, which fast-forward to the processing and analysis of feature vectors, our framework takes a new perspective on the task by starting from ground zero of the problem: the definition of objects. In addition, it leverages data-driven mining of relations among objects to discover hidden but persistent links between them. The problem is addressed at three main levels. At the lowest level, that of images, the framework tackles automatic, data-driven enhancement and restoration of document images using spatial, spectral, sparse, and graph-based representations of visual objects. At the second level, that of transliteration, directed graphical models, hidden Markov models (HMMs), undirected random fields, and spatial relation models are used to extract the live text of manuscript images, which reduces dependency on human experts. Finally, the highest level, that of network analysis of the relations among objects (from patches and words to manuscripts and writers), involves the search for "social networks" linking manuscripts. Considering this approach under the umbrella of Visual Language Processing (VLP), we hope that it will be further enriched by the research community through new insights and approaches contributed at the various levels.
