Abstract

Historical documents are records or books that provide textual information about the thoughts and consciousness of past civilisations, and they therefore have historical significance. They are key sources for historical studies because they provide information spanning several historical periods. Many studies have analysed historical documents using deep learning; however, studies that exploit changes in information over time are lacking. In this study, we propose a deep-learning approach that uses improved dynamic word embedding to determine the characteristics of 27 kings mentioned in the Annals of the Joseon Dynasty, which contains 500 years of records. The characteristics of words for each king were quantified using dynamic word embedding, and this information was then applied to named entity recognition and neural machine translation. In experiments, the proposed method outperformed other methods: the F1-score was 0.68 in the named entity recognition task, and the BLEU4 score was 0.34 in the neural machine translation task. We demonstrated that this approach can be used to extract information about diplomatic relationships with neighbouring countries and the economic conditions of the Joseon Dynasty.

Highlights

  • Historical documents—besides being old texts—carry considerable information, including observations of ideology and phenomena; this information can be used for reconstructing the past

  • We found that the application of parameters obtained from the named entity recognition (NER) model integrated with the improved dynamic word embedding (DWE) enhanced the effectiveness of historical document translations

  • DW2V is the dynamic word embedding method of Yao et al. (2018) [32], and DBE is the dynamic Bernoulli embedding method of Rudolph et al. (2017) [33]


Introduction

Historical documents, besides being old texts, carry considerable information, including observations of ideology and phenomena; this information can be used for reconstructing the past. Historical documents generally span long periods; for example, the Journal of the Royal Secretariat contains approximately 300 years of records from 1623 to 1910, and the Ming Shilu covers nearly 300 years from 1368 to 1644. Such documents have been analysed to extract information about specific periods or, in longitudinal studies, about long spans of time. Knowledge about the changing meaning of words is an important factor for deciphering historical documents written over long
