Abstract

Studies of complex disorders benefit from integrative analyses of multiple omics data. Yet, sample mix-ups frequently occur in multi-omics studies, weakening statistical power and risking false findings. Accurately aligning sample information, genotype, and corresponding omics data is critical for integrative analyses. We developed DRAMS (https://github.com/Yi-Jiang/DRAMS) to Detect and Re-Align Mixed-up Samples to address the sample mix-up problem. It uses a logistic regression model followed by a modified topological sorting algorithm to identify the potential true IDs based on data relationships of multi-omics. According to tests using simulated data, the more types of omics data used or the smaller the proportion of mix-ups, the better that DRAMS performs. Applying DRAMS to real data from the PsychENCODE BrainGVEX project, we detected and corrected 201 (12.5% of total data generated) mix-ups. Of the 21 mix-ups involving errors of racial identity, DRAMS re-assigned all data to the correct racial group in the 1000 Genomes project. In doing so, quantitative trait loci (QTL) (FDR<0.01) increased by an average of 1.62-fold. The use of DRAMS in multi-omics studies will strengthen statistical power of the study and improve quality of the results. Even though very limited studies have multi-omics data in place, we expect such data will increase quickly with the needs of DRAMS.

Highlights

  • Investigation of complex traits and disorders can use multiple omics data to systematically explore regulatory networks and causal relationships

  • Sample mix-up happens inevitably during sample collection, processing, and data management. It leads to reduced statistical power and sometimes false findings

  • The goal of DRAMS is to detect and re-align mix-ups based on the grounds that all omics data originating from the same individual should match genotypes

Read more

Summary

Introduction

Investigation of complex traits and disorders can use multiple omics data to systematically explore regulatory networks and causal relationships. Is the detection and realignment of errors in data identifications (IDs) critical to ensuring accurate findings in integrative studies, such corrections can increase statistical power the number of positive findings [1]. For multi-omics data, the sample re-alignment procedure can be generally divided into two steps: first, to estimate genetic relatedness among the data of different omics and group together all the data of the same individual; to assign potential IDs for each data group. It is well-known that genetic information from the same individual should be identical regardless of the omics from which it originated. Using genotype data as a mediator, data originated from the same individual can be grouped together

Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.