Abstract
The study of type III RNases constitutes an important area of molecular biology. The pac1+ gene is known to encode a particular RNase III that shares low amino acid similarity with those encoded by other genes despite having double-stranded ribonuclease activity. Bioinformatics methods based on sequence alignment may fail when the amino acid identity between the query sequence and others with similar functions is low (remote homologues), or when no similar sequence is recorded in the database. Quantitative Structure-Activity Relationships (QSAR) applied to protein sequences may allow an alignment-independent prediction of protein function. These sequence QSAR-like methods often use 1D sequence numerical parameters as the input to seek sequence-function relationships. However, first representing the sequence in 2D may uncover useful higher-order information. In the work described here we calculated for the first time the Spectral Moments of a Markov Matrix (MMM) associated with a 2D-HP-map of a protein sequence. We used MMM values to characterize numerically 81 sequences of type III RNases and 133 proteins of a control group. We subsequently developed one MMM-QSAR and one classic Hidden Markov Model (HMM) based on the same data. The MMM-QSAR discriminated RNases from other proteins with 97.35% accuracy without using alignment, a result as good as that of the known HMM techniques. We also report for the first time the isolation of a new Pac1 protein (DQ647826) from Schizosaccharomyces pombe, strain 428-4-1. The MMM-QSAR model predicts the new RNase III with the same accuracy as other classical alignment methods. Experimental assay of this protein confirms the predicted activity. The present results suggest that MMM-QSAR models may be used for protein function annotation without sequence alignment and with the same accuracy as classic HMM models.
Highlights
RNase III is a double-strand-specific ribonuclease that usually makes staggered cuts in both strands of a double-helical RNA; in some cases it cleaves once in a single-stranded bulge in the helix 1, 2
The present approach involves the calculation of different sequence parameters based on a Markov Model (MM), which can be applied to different kinds of molecular graphs 131 including DNA, RNA and proteins 93, 143
The methodology uses the Moments of a Markov Matrix (MMM) associated with a 2D sequence representation as the input for a Linear Discriminant Analysis (LDA) classifier
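The idea of taking moments of a Markov matrix built on a 2D sequence map can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four-class residue grouping, the lattice step assigned to each class, the self-loop added to every node, and the function names `hp_map_nodes`, `markov_matrix` and `spectral_moments` are all assumptions made for illustration. The sketch walks the sequence over a 2D lattice, merges residues that land on the same coordinate into one node, row-normalises the resulting connectivity into a stochastic matrix, and takes the traces of its successive powers as the moments.

```python
import numpy as np

# Assumed residue classes and lattice steps (illustrative, not from the paper).
STEP = {}
for aa in "AVLIMFWPGC":
    STEP[aa] = (1, 0)    # non-polar
for aa in "STYNQ":
    STEP[aa] = (0, 1)    # polar, uncharged
for aa in "DE":
    STEP[aa] = (-1, 0)   # acidic
for aa in "KRH":
    STEP[aa] = (0, -1)   # basic


def hp_map_nodes(seq):
    """Walk the lattice; return the node visited by each residue and the node count."""
    x = y = 0
    index = {}           # lattice coordinate -> node id
    path = []
    for aa in seq:
        dx, dy = STEP[aa]
        x, y = x + dx, y + dy
        path.append(index.setdefault((x, y), len(index)))
    return path, len(index)


def markov_matrix(seq):
    """Row-stochastic matrix over map nodes, built from consecutive-residue links."""
    path, n = hp_map_nodes(seq)
    counts = np.zeros((n, n))
    for a, b in zip(path, path[1:]):
        counts[a, b] += 1
        counts[b, a] += 1
    counts += np.eye(n)  # assumed self-loop, keeps every row sum non-zero
    return counts / counts.sum(axis=1, keepdims=True)


def spectral_moments(seq, k_max=5):
    """Moments pi_k = trace(P^k) for k = 1..k_max."""
    P = markov_matrix(seq)
    Pk = np.eye(len(P))
    moments = []
    for _ in range(k_max):
        Pk = Pk @ P
        moments.append(np.trace(Pk))
    return np.array(moments)
```

Under these assumptions, each sequence yields a fixed-length vector such as `spectral_moments(seq, k_max=5)`, and a table of such vectors for the RNase III and control sequences would be the kind of input passed to an LDA classifier.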
Summary
RNase III is a double-strand-specific ribonuclease (dsRNase) that usually makes staggered cuts in both strands of a double-helical RNA; in some cases it cleaves once in a single-stranded bulge in the helix 1, 2. The primary biological function of this system is the specific processing of rRNA and mRNA precursors 3-5, but it has been implicated in other diverse phenomena such as mRNA turnover 6, conjugative DNA transfer 7, antisense RNA-mediated regulation and others 8, 9. Dicer and Drosha are type III RNases responsible for the generation of short interfering RNAs (siRNAs) from long double-stranded RNAs during RNA interference (RNAi). Each step has a crucial influence on the efficiency of RNAi.[10,11,12,13] Both RNase proteins are involved in several important biological processes; for instance, Dicer acts on the vascular system, regulating embryonic angiogenesis, probably by processing miRNAs, which regulate the expression levels of some critical angiogenic regulators in the cell.[14]