Abstract

DNA methylation is an epigenetic mechanism for regulating gene expression, and it plays an important role in many biological processes. While methylation sites can be identified using laboratory techniques, much work is being done on developing computational approaches using machine learning. Here, we present a deep-learning algorithm for determining the 5-methylcytosine status of a DNA sequence. We propose an ensemble framework that treats the self-attention score as an explicit feature that is added to the encoder layer generated by fine-tuned language models. We evaluate the performance of the model under different data distribution scenarios.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call