Abstract

The RNA-recognition motif (RRM) is the most abundant RNA-binding domain and is involved in many post-transcriptional processes. Because RRM-containing proteins perform different functions despite having similar domain architectures, it is challenging to build an automated annotation tool for these proteins in proteomic analysis. In this study, we implemented a proteome analysis pipeline that identifies proteins with multiple RRMs and predicts their domain boundaries using specific position-specific scoring matrices (PSSMs), domain architectures, and proteins with the same entity name. After sequences are clustered on the basis of their evolutionary distances, a reference group is selected by comparing domain architectures. Candidate proteins are then collected from a proteome using specific PSSMs built from seed alignments in PFAM. Finally, target proteins are identified using multiple alignments and phylogenetic trees of candidate and reference proteins. Using this pipeline, we identified 33 proteins close to 12 types of RRM-containing proteins, together with their domain boundaries, among 508 candidates from 33,610 sequences in a human proteome.
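
The candidate-collection step can be illustrated with a minimal sketch of PSSM scanning. This is not the authors' implementation: the function names `build_pssm` and `scan`, the uniform background frequency, the pseudocount, the score threshold, and the toy seed alignment are all assumptions made for illustration, whereas the pipeline described above derives its PSSMs from PFAM seed alignments. The sketch builds a log-odds matrix from a gap-free alignment and slides it along a sequence to report windows that score above a threshold, which gives a rough start and end for a putative domain.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(seed_alignment, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length, gap-free aligned sequences.

    Assumption: a uniform background frequency (1/20) for every residue.
    """
    length = len(seed_alignment[0])
    if any(len(seq) != length for seq in seed_alignment):
        raise ValueError("all seed sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(seed_alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in seed_alignment)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, end, score) for each window scoring >= threshold.

    Coordinates are 0-based and the end is exclusive. Residues that are
    not in AMINO_ACIDS contribute a score of 0.
    """
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i].get(aa, 0.0) for i, aa in enumerate(window))
        if score >= threshold:
            hits.append((start, start + width, score))
    return hits


# Toy seed alignment (hypothetical, not taken from PFAM).
toy_seed = ["KGFGF", "KGFGF", "RGFGF"]
toy_pssm = build_pssm(toy_seed)
toy_hits = scan("AAAKGFGFAAA", toy_pssm, threshold=5.0)
```

In this toy case the only window that passes the threshold is the one covering `KGFGF`. A real run would use full-length domain profiles and a calibrated cutoff, and overlapping hits would need to be merged before boundaries are reported.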
