Abstract

The NIEHS-supported Human Health and Exposure Analysis Resource (HHEAR) Data Center maintains a public-use data repository to promote reuse of environmental health data generated by the HHEAR program. Creating and maintaining this repository requires integrating information from a wide variety of epidemiologic studies. We have developed the Human-Aware Data Acquisition Framework to enable this complex integration; it supports harmonization across multiple studies and enables meaningful search of, and access to, the data deposited in the HHEAR Data Repository. To integrate data from a new study, investigators engage in an initial, time-consuming effort to link study data to the HHEAR ontology, a controlled vocabulary of environmental and public health terms. This is accomplished by generating a semantic data dictionary (SDD) from the data dictionaries and codebooks provided by HHEAR study investigators. Originally, this was done manually by an expert in both epidemiological terminology and ontological modeling. To make these tools more accessible to environmental health scientists who lack formal ontological training, we have developed an SDD-Editor that simplifies the ontology modeling process. The SDD-Editor reuses elements common to epidemiologic data dictionaries and spreadsheet software, while integrating the features needed to form semantic links between public health concepts and existing ontologies. Using natural language processing, the SDD-Editor measures the semantic similarity between data dictionary descriptions and ontology class descriptions, and suggests potential concept matches for the study variables in the SDD. If no suitable suggestion exists, investigators can look up ontology terms using a search engine powered by BioPortal. Once the SDD is complete, a validator checks that it has the correct format and that all classes are valid. By automating parts of the ontology modeling process, the SDD-Editor greatly facilitates the dynamic integration of HHEAR environmental health studies into a single repository, benefiting the scientific community.
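
The concept-matching step described above can be illustrated with a minimal sketch. The abstract does not specify which natural language processing method the SDD-Editor uses, so this example substitutes a simple bag-of-words cosine similarity as a stand-in. The function names (`tokenize`, `cosine`, `suggest_classes`) and the three ontology classes with their descriptions are hypothetical and are not taken from the HHEAR ontology.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_classes(variable_description, ontology_classes, top_k=3):
    """Rank ontology classes by similarity to a data dictionary description.

    Assumption: ontology_classes maps a class label to its text description.
    A real system would likely use a richer similarity model than word overlap.
    """
    query = Counter(tokenize(variable_description))
    scored = [
        (cosine(query, Counter(tokenize(description))), label)
        for label, description in ontology_classes.items()
    ]
    scored.sort(reverse=True)
    return [(label, score) for score, label in scored[:top_k] if score > 0]


# Hypothetical ontology classes, for illustration only.
ontology_classes = {
    "Birth weight": "weight of an infant measured at birth",
    "Maternal age": "age of the mother at delivery",
    "Urinary arsenic": "concentration of arsenic measured in urine",
}

suggestions = suggest_classes("infant weight at birth in grams", ontology_classes)
for label, score in suggestions:
    print(f"{label}: {score:.3f}")
```

In this sketch the best-scoring class is offered first, and an investigator would still confirm or reject each suggestion, which mirrors the fallback to manual term search described in the abstract.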
