Abstract
The number of known cuneiform tablets is estimated to be in the hundreds of thousands. The Hilprecht Archive Online contains 1977 high-resolution 3D scans of tablets. The online cuneiform database CDLI catalogs metadata for more than 100,000 tablets. Although both are publicly accessible, large-scale machine learning and pattern recognition on cuneiform tablets remain elusive: the data is only accessible by searching web pages, the tablet identifiers are inconsistent between collections, and the 3D data is unprepared and challenging for automated processing. We pave the way for large-scale analyses of cuneiform tablets by assembling a cross-referenced benchmark dataset of processed cuneiform tablets: (i) frontally aligned 3D tablets with pre-computed high-dimensional surface features, (ii) six-view raster images for off-the-shelf image processing, and (iii) metadata, transcriptions, and transliterations, for a subset of 707 tablets, for learning the alignment between 3D data, images, and linguistic expression. This is the first dataset of its kind and of its size in cuneiform research. The benchmark dataset is prepared for ease of use and is immediately available to computational researchers at https://doi.org/10.11588/data/IE8CCN, lowering the barrier to experimenting with and applying standard methods of analysis.