Abstract The mindat.org website (Mindat) has operated since October 2000 as a free, crowd-sourced, and expert-curated database focused on mineral species and their occurrences worldwide. The project has grown from a hobbyist site into a resource used in various scientific research projects and educational programs. Together with other open data resources, Mindat has helped accelerate scientific discoveries in many fields, such as mineral evolution, mineral ecology, and the co-evolution of the geosphere and biosphere. Recently, through open data efforts, machine interfaces and software packages have been established to enable flexible data discovery and download from Mindat. We expect data access and usage to scale up further in the coming years. Although Mindat is curated by a team of geoscience and database experts across the world, its crowd-sourced records still carry some bias. In this paper we first present an overview of the primary data subjects in Mindat, and then describe in detail the characteristics and biases of the three most popular data subjects: locality, mineral species, and mineral occurrence. In the discussion we also give an outlook on appropriate data usage and the future extension of data records. We hope this paper gives users a more comprehensive view of the Mindat database and thus helps them better plan their data use. We also hope more people will be inspired to contribute to the data curation work to make Mindat a sustained data ecosystem for geoscience research.