Abstract

As information and communications technology has advanced, interest has grown in digitally archiving books and other materials that have never been archived in this way. This benefits researchers, teachers, students and the general public, giving them easy access to useful historical information. The digital archiving of old newspapers is a work in progress, but there are obstacles. Optical character recognition (OCR), the main method used to convert scanned materials to text, struggles with typefaces from 1850, for example, so full text search of such material is not currently possible. Professor Kazuki Joe, Department of Information and Computer Sciences, Nara Women's University, Japan, leads a team of researchers who are working to make full text searches possible for early-modern books, magazines and newspapers. This is an especially difficult task because the team is working with Japanese texts, and the early-modern writing style in Japan differs from that of today. The researchers therefore first focused on the automatic conversion of letterpress book images into text and then realised the need for automatic translation of early-modern literary texts into present-day colloquial language.

