Abstract

Nucleic acid and protein sequences contain different types of information (genes, RNA structure, active sites, regulatory structures, ...). This information can yield useful biological knowledge, such as the function of a given protein sequence or the classification of proteins into families. In this paper we focus on the motifs present in nucleic acid sequences. Before going further, it is useful to review the concepts and terminology associated with this study. A motif is a short structural element found in all members of a protein family. It contains conserved residues essential for function; these residues are not necessarily consecutive in the sequence, but they lie close together in the 3D structure because they are involved in the same function (active site, binding site, ...). A pattern or profile, by contrast, is a degenerate sequence and/or a combination of several motifs that may be separated by variable regions. Our objective is to develop a new algorithm based on tree-structure mining in order to highlight segments of DNA, RNA, or amino acid sequences that are likely to have a biological role.
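To make the notion of a degenerate pattern concrete, the following minimal sketch searches a DNA sequence for a motif written with standard IUPAC ambiguity codes, where positions such as N or W stand for the variable (non-conserved) residues. It is an illustration only and is not the tree-mining algorithm proposed in this paper; the function names `motif_to_regex` and `find_motif` and the example sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate a degenerate IUPAC motif into a regular expression string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_motif(motif, sequence):
    """Return (start, matched_segment) for every occurrence, overlaps included.

    Hypothetical helper: a lookahead is used so that overlapping
    occurrences are reported as well.
    """
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Illustrative sequence and motif, not data from the paper.
    for start, segment in find_motif("TATAWAW", "GGCTATAAATGCTATATATCC"):
        print(start, segment)
```

A regular expression covers only fixed-width degenerate patterns; motifs separated by variable-length regions, as described above, would need gap-aware matching, which this sketch does not attempt.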
