Abstract
Logistically, the data associated with biological collections can be divided into three main categories for digitisation: i) Label Data: the data appearing on the specimen on a label or annotation; ii) Curatorial Data: the data appearing on containers, boxes, cabinets and folders which hold the collections; iii) Supplementary Data: the data held separately from the collections in indices, archives and literature. Each of these categories has fundamentally different properties within the digitisation framework, and these properties have implications for the data capture process. We assessed them in relation to alternative data entry workflows and methodologies in order to create a more efficient and accurate system of data capture. We see a clear benefit in prioritising curatorial data in the data capture process. These data are often available only at the cabinets, they are in a format that allows rapid data entry, and they result in an accurate cataloguing of the collections. Finally, the capture of a high-resolution digital image enables additional data entry to be separated into multiple sweeps, and optical character recognition (OCR) software can be used to help sort images for fuller data entry, giving potential for more automated data entry in the future.
More From: International Journal of Humanities and Arts Computing