Abstract

Artificial Intelligence (AI) is becoming more prevalent in data science as well as in areas of computational science. Classification methods commonly used in AI can also be applied to unorganized databases, provided a suitable model is trained. Most classification work is done on image data for purposes such as object detection and face recognition. If an object is reliably detected in an image, classification can then be used to organize image data. In this work, we try to identify images from an Integrated Digitized Biocollections (iDigBio) dataset and to classify these images in order to generate metadata, so that the collection can be used as an AI-ready dataset in the future. The main problems of museum image datasets are the lack of metadata on images, wrong categorization, and poor image quality. With AI, it may be possible to overcome these problems: automatic tools can help find, eliminate, or fix them. For our example, we trained a model for 10 classes (e.g., complete fish, photograph, notes/labels, X-ray, CT (computerized tomography) scan, partial fish, fossil, skeleton) using a manually tagged iDigBio image dataset. After training a model for each class, we reclassified the dataset using these trained models. Some of the results are given in Table 1. As can be seen in the table, even manually classified images can be assigned to different classes, and some classes are visually very similar to each other, such as CT scans and X-rays, or fossils and skeletons. These kinds of similarities are confusing for the human eye as well as for the AI models.
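
The reclassification step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each per-class model returns a score for every image and that the highest-scoring class is taken as the prediction. The function names, image IDs, and scores are all hypothetical and serve only to show how disagreements with the manual tags could be flagged for review.

```python
from collections import Counter


def reclassify(scores_by_image, manual_tags):
    """Return (image_id, manual_tag, predicted_class) for every disagreement.

    scores_by_image: {image_id: {class_name: score}}, one score per class model.
    manual_tags:     {image_id: class_name}, the manually assigned tags.
    Assumption: the predicted class is the one with the highest score.
    """
    flagged = []
    for image_id, scores in scores_by_image.items():
        predicted = max(scores, key=scores.get)
        manual = manual_tags.get(image_id)
        if predicted != manual:
            flagged.append((image_id, manual, predicted))
    return flagged


def confusion_pairs(flagged):
    """Count how often each (manual tag, predicted class) pair disagrees."""
    return Counter((manual, predicted) for _, manual, predicted in flagged)


# Hypothetical scores and tags, made up for illustration (not results from the paper).
scores_by_image = {
    "img_001": {"CT scan": 0.55, "X-ray": 0.40, "fossil": 0.05},
    "img_002": {"CT scan": 0.30, "X-ray": 0.65, "fossil": 0.05},
    "img_003": {"CT scan": 0.10, "X-ray": 0.05, "fossil": 0.85},
}
manual_tags = {"img_001": "X-ray", "img_002": "X-ray", "img_003": "fossil"}

flagged = reclassify(scores_by_image, manual_tags)
pairs = confusion_pairs(flagged)
print(flagged)
print(pairs)
```

Counting disagreements by (manual tag, predicted class) pair is one simple way to surface visually similar classes, such as CT scans and X-rays, that would deserve a second look by a curator.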
