Abstract

Signal finding (pattern discovery) in biological sequences is a fundamental problem in both computer science and molecular biology. Many approaches have been proposed for extracting interesting patterns (or motifs) from DNA/RNA and protein sequences. Some approaches are based on simple and multiple alignment techniques; some use biological knowledge and others do not. In this paper, we propose a de novo framework that performs motif identification and exploits a constrained co-clustering technique to simultaneously find associations between groups of protein sequences and groups of motifs. We show that the presented approach is able to group together protein sequences belonging to the same families and, at the same time, to provide a set of characterizing motifs.
