Abstract
Reliability issues emerge when investigating initials in transcriptions of Early New High German (ENHG) manuscripts. Our goal is to demonstrate how a methodological framework of reliability classes affects and facilitates the evaluation of annotated corpus data based on existing text editions. We hypothesize that deviant measurements in the capitalization of sentence-internal word tokens occur when reliability classes and the representativeness of allographs are accounted for. Our data, derived from SIGS project annotations, show that the capitalization mapped per part of speech can be represented by the size of a set of allographs, which points to the importance of the letter as a discrete factor for each text of the corpus.