Identification of Relationships Between Patients Through Elements in a Data Warehouse Using the Familial, Associational, and Incidental Relationship (FAIR) Initiative: A Pilot Study.

Thomas M English, Rajani S Sadasivam, Rebecca L Kinney, Ariana Kamberi, Thomas K Houston, Wayne Chan, Michael J Davis

doi:10.2196/medinform.3738

Open Access

https://doi.org/10.2196/medinform.3738

Abstract

Background: Over the last several years there has been widespread development of medical data warehouses. Current data warehouses focus on individual cases, but lack the ability to identify family members that could be used for dyadic or familial research. Currently, the patient’s family history in the medical record is the only documentation we have to understand the health status and social habits of their family members. Identifying familial linkages in a phenotypic data warehouse can be valuable in cohort identification and in beginning to understand the interactions of diseases among families.

Objective: The goal of the Familial, Associational, & Incidental Relationships (FAIR) initiative is to identify an index set of patients’ relationships through elements in a data warehouse.

Methods: Using a test set of 500 children, we measured the sensitivity and specificity of available linkage algorithm identifiers (eg, insurance identification numbers and phone numbers) and validated this tool/algorithm through a manual chart audit.

Results: Of all the children, 52.4% (262/500) were male, and the mean age of the cohort was 8 years old (SD 5). Of the children, 51.6% (258/500) were identified as white in race. The identifiers used for FAIR were available for the majority of patients: insurance number (483/500, 96.6%), phone number (500/500, 100%), and address (497/500, 99.4%). When utilizing the FAIR tool and various combinations of identifiers, sensitivity ranged from 15.5% (62/401) to 83.8% (336/401), and specificity from 72% (71/99) to 100% (99/99). The preferred method was matching patients using insurance or phone number, which had a sensitivity of 72.1% (289/401) and a specificity of 94% (93/99). Using the Informatics for Integrating Biology and the Bedside (i2b2) warehouse infrastructure, we have now developed a Web app that facilitates FAIR for any index population.

Conclusions: FAIR is a valuable research and clinical resource that extends the capabilities of existing data warehouses and lays the groundwork for family-based research. FAIR will expedite studies that would otherwise require registry or manual chart abstraction data sources.
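As an illustration of the preferred method reported in the abstract (matching on insurance number or phone number), the following is a minimal sketch and not the authors' implementation. The `Patient` record, its field names, and the function names are hypothetical, and exact string equality stands in for whatever normalization the real tool applies. The last two functions simply recompute sensitivity and specificity from the counts quoted in the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout; the abstract does not describe the schema.
    patient_id: str
    insurance_id: Optional[str]
    phone: Optional[str]


def link_by_insurance_or_phone(
    index: Iterable[Patient], warehouse: Iterable[Patient]
) -> Dict[str, Set[str]]:
    """Map each index patient to warehouse patients sharing either identifier."""
    by_insurance: Dict[str, Set[str]] = defaultdict(set)
    by_phone: Dict[str, Set[str]] = defaultdict(set)
    for p in warehouse:
        if p.insurance_id:
            by_insurance[p.insurance_id].add(p.patient_id)
        if p.phone:
            by_phone[p.phone].add(p.patient_id)

    links: Dict[str, Set[str]] = {}
    for p in index:
        matches: Set[str] = set()
        if p.insurance_id:
            matches |= by_insurance.get(p.insurance_id, set())
        if p.phone:
            matches |= by_phone.get(p.phone, set())
        matches.discard(p.patient_id)  # a patient is not their own relative
        links[p.patient_id] = matches
    return links


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of audited positives that the linkage found."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of audited negatives that the linkage correctly left unlinked."""
    return true_neg / (true_neg + false_pos)


# Counts quoted in the abstract for the insurance-or-phone method.
print(f"sensitivity: {sensitivity(289, 401 - 289):.1%}")
print(f"specificity: {specificity(93, 99 - 93):.1%}")
```

Indexing the warehouse once by each identifier keeps the lookup for every index patient close to constant time, which matters if the index population is large.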

Summary

Introduction

Overview: Over the last several years there has been widespread development of medical data warehouses. The i2b2 scalable informatics framework enables researchers to use existing clinical data for discovery research that may be combined with genomic data. This framework can be extended for new and unanticipated data types, as well as for functionality [1-3]. Identifying familial linkages in a phenotypic data warehouse can be valuable in cohort identification and in beginning to understand the interactions of diseases among families. Methods: Using a test set of 500 children, we measured the sensitivity and specificity of available linkage algorithm identifiers (eg, insurance identification numbers and phone numbers) and validated this tool/algorithm through a manual chart audit. FAIR will expedite studies that would otherwise require registry or manual chart abstraction data sources.
