Abstract
Deep learning in bioinformatics is often limited to problems where large amounts of labeled data are available for supervised classification. By exploiting unlabeled data, self-supervised learning techniques can improve the performance of machine learning models when labeled data are limited. Although many self-supervised learning methods have been proposed, they do not exploit the unique characteristics of genomic data. We therefore introduce Self-GenomeNet, a self-supervised learning technique tailored to genomic data. Self-GenomeNet leverages reverse-complement sequences and effectively learns short- and long-term dependencies by predicting targets of different lengths. Self-GenomeNet performs better than other self-supervised methods in data-scarce genomic tasks and outperforms standard supervised training with roughly ten times less labeled training data. Furthermore, the learned representations generalize well to new datasets and tasks. These findings suggest that Self-GenomeNet is well suited for large-scale, unlabeled genomic datasets and could substantially improve the performance of genomic models.
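To make the two ingredients named in the abstract concrete, the following Python sketch shows what a reverse-complement sequence is and how one sequence can yield prediction targets of several lengths. It is an illustration under stated assumptions, not the Self-GenomeNet implementation: the function names `reverse_complement` and `context_target_pairs`, the choice to take the target from the end of the sequence, and the choice to present it as its reverse complement are all hypothetical and are not described in the abstract.

```python
# Illustrative sketch only; NOT the authors' Self-GenomeNet code.
# All names and the pairing scheme below are assumptions made for this example.

# Translation table mapping each base to its complement (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Symbols outside A/C/G/T (for example N) are left unchanged.
    """
    return seq.translate(COMPLEMENT)[::-1]


def context_target_pairs(seq: str, target_lengths: list[int]) -> list[tuple[str, str]]:
    """Build one (context, target) pair per requested target length.

    Assumption for this sketch: the target is the last `n` bases of `seq`,
    given as their reverse complement (the same region read from the
    opposite strand), and the context is everything before it.
    """
    pairs = []
    for n in target_lengths:
        if not 0 < n < len(seq):
            raise ValueError(f"target length {n} must be between 1 and {len(seq) - 1}")
        context, tail = seq[:-n], seq[-n:]
        pairs.append((context, reverse_complement(tail)))
    return pairs


if __name__ == "__main__":
    # Short targets probe short-range structure, longer ones longer-range structure.
    for context, target in context_target_pairs("ACGTTGCA", [2, 4]):
        print(context, "->", target)
```

In a self-supervised setup of this general kind, a model would be trained to relate each context to its target without using any labels; how Self-GenomeNet actually constructs and scores its targets is described in the full paper, not in this sketch.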
Communications Biology