Abstract

The rapid expansion of biological sequence databases, driven by high-throughput genomic and proteomic sequencing, has left a considerable number of identified protein sequences with unclear or incomplete functional annotations. Domains of unknown function (DUFs) are protein domains that lack functional annotation yet occur in numerous proteins. To address the challenge of annotating DUFs, we have developed a computational method that efficiently identifies and annotates these domains using the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST) and data mining techniques. Our pipeline assigns putative functions to DUFs, thereby narrowing the gap between known sequences and known functions. The tool also accepts user-supplied sequences for annotation. We ran the pipeline on 5111 unique DUF sequences obtained from Pfam and obtained putative annotations for 2007 of them. These annotations were incorporated into a comprehensive database that is served through a web server named "AnnoDUF". AnnoDUF is freely accessible to both academic and industrial users at http://bts.ibab.ac.in/annoduf.php. All scripts used in this study are available in the GitHub repository at https://github.com/BioToolSuite/AnnoDUF.
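
As a rough illustration of the kind of annotation transfer described above, the sketch below parses PSI-BLAST tabular output and assigns each DUF the most common informative description among its significant hits. The column layout (as produced by `-outfmt "6 qseqid sseqid evalue stitle"`), the E-value cutoff, the list of uninformative terms, and the majority-vote rule are all assumptions made for illustration; they are not taken from the AnnoDUF scripts.

```python
from collections import Counter, defaultdict

# Hypothetical filter terms: hits described only in these words carry no
# functional signal, so they are not allowed to vote.
UNINFORMATIVE = ("hypothetical", "uncharacterized", "unknown function", "duf")


def annotate_from_hits(tabular_text, max_evalue=1e-5):
    """Return {query_id: (description, supporting_hit_count)}.

    Expects tab-separated rows with the columns
    qseqid, sseqid, evalue, stitle (an assumed layout).
    Queries with no informative significant hit are omitted.
    """
    votes = defaultdict(Counter)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        query, evalue, title = fields[0], float(fields[2]), fields[3]
        if evalue > max_evalue:
            continue  # hit is not significant under the assumed cutoff
        # Drop a trailing "[organism]" tag so identical functions group together.
        description = title.split(" [")[0].strip()
        if any(term in description.lower() for term in UNINFORMATIVE):
            continue
        votes[query][description] += 1
    # Majority vote: keep the most frequent description for each query.
    return {query: counts.most_common(1)[0] for query, counts in votes.items()}
```

Calling `annotate_from_hits` on the text of a PSI-BLAST report in this layout yields one candidate description per DUF together with the number of hits supporting it; a real pipeline would likely weigh hits by score and coverage rather than count them equally.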
