Abstract

The valid assimilation of data from heterogeneous databases is a critical informatics issue with implications for both clinical and population research. Heterogeneous databases have representational and semantic differences whose resolution requires mapping the intended meanings of local data structures to standard reference models. We argue that the representation and use of contextual information enhance this mapping process, and may indeed be necessary to preserve meaning. A database integration project to assimilate presenting complaint data from heterogeneous emergency department databases afforded the opportunity to characterize multiple levels of context that were important in identifying and resolving representational and semantic differences. Considering and representing context at multiple levels can preserve the granularity and intended meaning of local data at the aggregate level, and increase the quality and utility of the data for secondary analyses. We propose five aspects of context (data instance context, context of database schema, context of data collection process, context of data collection quality, and domain context) that are necessary for understanding the merging of heterogeneous data. Ideal representations for each aspect of context vary by purpose and domain.

