Zinc fingers (ZnFs) are an important class of nucleic acid and protein recognition domains in which a zinc ion serves as the inorganic cofactor and is tetrahedrally coordinated by cysteine and/or histidine residues. ZnF domains adopt diverse architectures built from different ZnF motifs and carry out a wide range of biological functions. Nonetheless, predicting ZnF motif(s) from sequence alone is quite challenging. To this end, 74 unique ZnF sequence patterns are collected from the literature and classified into 32 different classes. Because the short length of ZnF sequence patterns leads to inaccurate predictions when they are used alone, Pfam HMM profiles of ZnF domains, defined under 6 ZnF Pfam clans (215 HMM profiles) and a few undefined Pfam clans (74 HMM profiles), are also used for the prediction. A web server, ZnF-Prot (https://project.iith.ac.in/znprot/), is developed that can predict the presence of 31 ZnF domains in a protein or proteome sequence of any organism. The combined use of ZnF sequence patterns and Pfam HMM profiles resulted in accurate prediction of the 610 test cases (taken randomly from 249 organisms) considered here. Additionally, the application of ZnF-Prot is demonstrated on the Arabidopsis thaliana, Homo sapiens, Saccharomyces cerevisiae, Caenorhabditis elegans and Ciona intestinalis proteomes as test cases, wherein 87–96% of the predicted ZnF motifs are cross-validated.
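To make the idea of sequence-pattern scanning concrete, the following is a minimal sketch and not the ZnF-Prot implementation. The pattern `C2H2_LIKE` is an assumed, simplified C2H2-style consensus (Cys, 2 to 4 residues, Cys, 12 residues, His, 3 to 5 residues, His); it is not one of the 74 patterns collected in this work, and the function name `find_motifs` is hypothetical.

```python
import re

# Assumed, simplified C2H2-style consensus: C-x(2,4)-C-x(12)-H-x(3,5)-H.
# Illustrative only; not taken from the 74 patterns described above.
C2H2_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_motifs(sequence, pattern=C2H2_LIKE):
    """Return (start, end, matched_text) for each non-overlapping match.

    Positions are 0-based, with `end` exclusive, as in Python slicing.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Synthetic toy sequence built to contain exactly one match.
    toy = "MK" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"
    for start, end, text in find_motifs(toy):
        print(start, end, text)
```

A pattern this short can easily match by chance in unrelated proteins, which illustrates why pattern hits alone are considered insufficient and are complemented with profile-based evidence.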