Abstract

Background: With the advent of next-generation sequencing, digital pathology, and other high-dimensional data sources in modern research, improved data management is required to keep data organized and linked to the corresponding patients and their outcomes. The complexity of these data often limits interpretation by clinicians who are not well versed in their analysis. Conversely, data scientists may be unable to interpret the data in the correct clinical context. We describe a relational database and custom tools that coordinate clinical, organoid, and genomic data in the study of esophageal adenocarcinoma.

Methods: The database architecture was designed around three pillars: clinical data and outcomes, organoid culture data, and next-generation sequencing data. Data dictionaries were composed for all tables in the architecture. Oracle Database was used and populated with datasets formatted to adhere to the data dictionaries. Application Express was used to develop web applications for users to enter, query, and analyze data. Automatic nightly backups, robust login security, and tiers of access protect confidential data against system failures, human error, and data theft. To facilitate clinician use, custom tools with graphical user interfaces were developed for survival curve generation and for organoid drug-screening results.

Results: The database now encompasses over 21 data tables comprising clinical, surgical, recurrence, treatment response, tissue sample storage, organoid research, and genomic data. Data scientists can link omic data directly to patient clinical data, and clinician researchers can rapidly obtain patient and organoid survival data for covariates of their choice through a graphical user interface.

Conclusion: Proper data management requires a substantial initial investment in designing the database architecture, deploying software and servers, formatting data, and creating front-end user applications. The investment is well worth the cost, however, because a centralized database allows rapid querying of data that would otherwise be scattered or require individualized analysis for each project. It also facilitates interaction between data scientists and clinician researchers.
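The three-pillar design can be illustrated with a minimal sketch. It is not the authors' schema: it uses Python's built-in `sqlite3` module in place of Oracle Database, and every table name, column name, gene label, and data row below is a hypothetical placeholder chosen for illustration. It shows only the general idea of keying organoid and sequencing records to a patient record so that omic data can be joined to clinical outcomes.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one toy table per pillar.

    All names and rows are hypothetical; SQLite stands in for Oracle Database.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE patient (               -- clinical data and outcome
            patient_id      INTEGER PRIMARY KEY,
            survival_months REAL    NOT NULL,
            deceased        INTEGER NOT NULL CHECK (deceased IN (0, 1))
        );
        CREATE TABLE organoid (              -- organoid culture data
            organoid_id INTEGER PRIMARY KEY,
            patient_id  INTEGER NOT NULL REFERENCES patient(patient_id),
            established INTEGER NOT NULL CHECK (established IN (0, 1))
        );
        CREATE TABLE sequencing_result (     -- next-generation sequencing data
            result_id  INTEGER PRIMARY KEY,
            patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
            gene       TEXT    NOT NULL,
            mutated    INTEGER NOT NULL CHECK (mutated IN (0, 1))
        );
        """
    )
    # Made-up rows, present only so the query below returns something.
    conn.executemany("INSERT INTO patient VALUES (?, ?, ?)",
                     [(1, 14.0, 1), (2, 36.5, 0)])
    conn.executemany("INSERT INTO organoid VALUES (?, ?, ?)",
                     [(10, 1, 1), (11, 2, 0)])
    conn.executemany("INSERT INTO sequencing_result VALUES (?, ?, ?, ?)",
                     [(100, 1, "GENE_A", 1), (101, 2, "GENE_A", 0)])
    conn.commit()
    return conn


def survival_by_mutation(conn, gene):
    """Join sequencing results to clinical outcome for one gene."""
    query = """
        SELECT p.patient_id, p.survival_months, p.deceased, s.mutated
        FROM patient AS p
        JOIN sequencing_result AS s ON s.patient_id = p.patient_id
        WHERE s.gene = ?
        ORDER BY p.patient_id
    """
    return conn.execute(query, (gene,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in survival_by_mutation(demo, "GENE_A"):
        print(row)
```

Because each organoid and sequencing row carries a `patient_id` foreign key, a single join returns outcome and mutation status together, which is the kind of linkage the abstract attributes to the centralized database.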
