Abstract
Technical differences between gene expression sequencing experiments can cause variations in the data in the form of batch effect biases. These do not represent true biological variations between samples and can lead to false conclusions or hinder the ability to integrate multiple datasets. Since there is a growing need for the joint analysis of single-cell sequencing datasets from different sources, there is also a need to correct the resulting batch effects while maintaining the true biological variations in the data. We developed a semi-supervised deep learning architecture called Autoencoder-based Batch Correction (ABC) for integrating single-cell sequencing datasets. Our method removes batch effects through a guided process of data compression using supervised cell type classifier branches for biological signal retention. It aligns the different batches using an adversarial training approach. We comprehensively evaluate the performance of our method using four single-cell sequencing datasets and multiple measures for batch effect removal and biological variation conservation. ABC outperforms 10 state-of-the-art methods for this task including Seurat, scGen, ComBat, scanorama, scVI, scANVI, AutoClass, Harmony, scDREAMER, and CLEAR, correcting various types of batch effects while preserving intricate biological variations.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.