Abstract
Scientists and policymakers increasingly draw on complex, multi-scale data from diverse, worldwide sources to understand the causes and consequences of economic development, social stratification, climate change, cultural diversity, and violent conflict. This work frequently requires integrating data across datasets by complex, dynamic categories (e.g., ethnicities, languages, religions, subdistricts). However, different datasets encode corresponding categories in disparate formats and at different resolutions (e.g., Guatemala Indigenous vs. Maya vs. K’iche’). These encodings must be translated across datasets before the data can be brought together for analysis. At global scales, across thousands of categories, the combinatorial complexity creates thorny challenges for manual reconciliation and for transparent documentation and sharing of researcher decisions. There is a need for direct, uncomplicated ways to search and explore the semantics of complex and diverse datasets. We design and deploy such a tool, CatMapper, to support semantic discovery through exploration and manipulation of large, complex, and diverse datasets. CatMapper supports exploring contextual information about specific categories, translating new sets of categories from existing datasets and published studies, identifying and integrating novel combinations of datasets for researchers’ custom needs (including automatically generated syntax to merge datasets of interest), and publishing and sharing merging templates for public re-use and open science. CatMapper does not store observational data. Rather, it is a dynamic, interactive dictionary of keys that helps users integrate observational data from diverse external datasets in disparate formats, thereby complementing and leveraging a fast-growing ecology of datasets that store observational data. We conducted a heuristic evaluation of CatMapper’s usability. The results shed light on ways to enrich semantic data discovery.
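To make the "dictionary of keys" idea concrete, the following is a minimal sketch of how dataset-specific category labels might be translated to shared identifiers and then harmonized at a chosen resolution. It is illustrative only and does not reflect CatMapper's actual data model or interface: the dataset names, the shared IDs (CAT001 to CAT003), the containment hierarchy, and the functions `ancestors`, `translate`, and `harmonize` are all assumptions introduced for this example.

```python
# Hypothetical illustration: all IDs, dataset names, and helpers below are
# invented for this sketch and are not part of CatMapper itself.

# Dictionary of keys: (dataset, dataset-specific label) -> shared category ID.
KEY_DICTIONARY = {
    ("survey_a", "Guatemala Indigenous"): "CAT001",
    ("survey_b", "Maya"): "CAT002",
    ("survey_c", "K'iche'"): "CAT003",
}

# Assumed containment hierarchy between shared IDs (finer -> coarser).
PARENT = {"CAT003": "CAT002", "CAT002": "CAT001"}


def ancestors(cat_id):
    """Return cat_id followed by each progressively coarser parent ID."""
    chain = [cat_id]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def translate(dataset, label, target_level):
    """Return target_level if the label maps to it or to a finer category
    contained in it; return None if the label is unknown or too coarse."""
    cat_id = KEY_DICTIONARY.get((dataset, label))
    if cat_id is None:
        return None
    return target_level if target_level in ancestors(cat_id) else None


def harmonize(rows, target_level):
    """Group (dataset, value) pairs by shared ID at the target resolution,
    skipping rows that cannot be translated to that resolution."""
    merged = {}
    for dataset, label, value in rows:
        shared = translate(dataset, label, target_level)
        if shared is not None:
            merged.setdefault(shared, []).append((dataset, value))
    return merged


# Observational data stays in the external datasets; only keys are mapped.
rows = [
    ("survey_a", "Guatemala Indigenous", 10),
    ("survey_b", "Maya", 20),
    ("survey_c", "K'iche'", 30),
]
coarse = harmonize(rows, "CAT001")
finer = harmonize(rows, "CAT002")
```

In this sketch, harmonizing at the coarsest level keeps all three rows, while harmonizing at the intermediate level drops the row from `survey_a`, whose category is too coarse to be assigned to the finer one. This is one simple way to handle the resolution mismatch described above; other policies are possible.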