Abstract

This work focuses on the layout analysis of historical handwritten registers in which local religious ceremonies were recorded. The aim is to delimit each record using little training data. To this end, two approaches are proposed. Firstly, three state-of-the-art object detection networks are explored and compared. Further experiments are then conducted on Mask R-CNN, as it yields the best performance. Secondly, we introduce and investigate Deep&Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (sixteenth–eighteenth centuries), as well as on the Esposalles public database, which contains 253 Spanish records (seventeenth century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on more challenging documents, especially when it is trained on a small, non-representative subset. By contrast, Deep&Syntax relies on steady patterns and is therefore able to process a wider range of documents with less training data. When both systems are trained on 120 documents, Deep&Syntax produces 15% more match configurations and reduces the ZoneMap surface error metric by 30%. It also outperforms Mask R-CNN even when trained on a database three times smaller. As Deep&Syntax generalizes better, we believe it can be used for massive parish register processing, since collecting and annotating a sufficiently large and representative set of training data is not always achievable.
