Abstract
Background: Meiotic double-strand breaks occur at relatively high frequencies in some genomic regions (hotspots) and relatively low frequencies in others (coldspots). Hotspots and coldspots are receiving increasing attention in research into the mechanism of meiotic recombination. However, predicting hotspots and coldspots from DNA sequence information is still a challenging task.
Results: We present a novel method for classifying hot and cold ORFs, located in hotspots and coldspots respectively, in Saccharomyces cerevisiae, using a support vector machine (SVM) that relies on codon composition differences. The method achieved a high classification accuracy of 85.0%. Since codon composition is a fusion of codon usage bias and amino acid composition signals, the ability of these two kinds of sequence attributes to discriminate hot ORFs from cold ORFs was also investigated separately. Our results indicate that neither codon usage bias nor amino acid composition, taken separately, performed as well as codon composition. Moreover, we applied our SVM-based method to the full genome: we predicted hot and cold ORFs from the yeast genome using cutoffs of recombination rate, and found that the method predicts cold ORFs less well than hot ORFs. We also observed a considerable correlation between meiotic recombination rate and the amino acid composition of certain residues, which probably reflects the structural and functional dissimilarity between the hot and cold groups.
Conclusion: We have introduced a novel SVM-based method to discriminate hot ORFs from cold ones. Using codon composition as the sequence attribute, we achieved a high classification accuracy, which suggests that codon composition has strong potential as a sequence attribute for predicting hot and cold ORFs.
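The three kinds of sequence attributes compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the codon usage bias measure used here (the frequency of each codon among its synonymous codons) is an assumption, since the abstract does not define the exact measure. The sketch only shows how a 64-dimensional codon composition vector combines amino acid composition with synonymous codon choice.

```python
from collections import Counter
from itertools import product

# The 64 codons in TCAG order, paired with the standard genetic code.
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = dict(zip(CODONS, AMINO))


def count_codons(orf):
    """Count in-frame codons; codons with non-ACGT characters are skipped."""
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TO_AA)


def codon_composition(orf):
    """64-dimensional vector: fraction of the ORF made up by each codon."""
    counts = count_codons(orf)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def amino_acid_composition(orf):
    """Fraction of each of the 20 amino acids (stop codons excluded)."""
    aa_counts = Counter()
    for codon, n in count_codons(orf).items():
        aa = CODON_TO_AA[codon]
        if aa != "*":
            aa_counts[aa] += n
    total = sum(aa_counts.values())
    residues = sorted(set(AMINO) - {"*"})
    return {aa: (aa_counts[aa] / total if total else 0.0) for aa in residues}


def codon_usage_bias(orf):
    """Assumed bias measure: frequency of each codon among its synonyms."""
    counts = count_codons(orf)
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TO_AA[codon]] += n
    return [
        counts[c] / aa_totals[CODON_TO_AA[c]] if aa_totals[CODON_TO_AA[c]] else 0.0
        for c in CODONS
    ]
```

Each ORF's `codon_composition` vector could then serve as one input row for an SVM classifier from any standard library; the kernel and parameters used in the study are not stated in this abstract.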
Highlights
Meiotic double-strand breaks occur at relatively high frequencies in some genomic regions and relatively low frequencies in others
We present a novel method for predicting hot and cold ORFs, located in hotspots and coldspots respectively, in S. cerevisiae using a support vector machine (SVM) based on codon composition differences
Our SVM-based method was applied to the full genome: we predicted the hot and cold ORFs among all selected ORFs in the yeast genome using cutoffs of recombination rate, and found that the method predicts cold ORFs less well than hot ORFs
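Labelling ORFs by recombination-rate cutoffs can be sketched as follows. This is an illustration only: the cutoff values, ORF names, and rates are hypothetical placeholders, since the actual cutoffs are not given here, and the function name `label_orfs` is an assumption.

```python
def label_orfs(rates, hot_cutoff, cold_cutoff):
    """Label each ORF 'hot', 'cold', or 'unlabeled' from its recombination rate.

    hot_cutoff and cold_cutoff are hypothetical parameters: ORFs at or above
    hot_cutoff are called hot, ORFs at or below cold_cutoff are called cold,
    and anything in between is left unlabeled.
    """
    if cold_cutoff >= hot_cutoff:
        raise ValueError("cold_cutoff must be lower than hot_cutoff")
    labels = {}
    for name, rate in rates.items():
        if rate >= hot_cutoff:
            labels[name] = "hot"
        elif rate <= cold_cutoff:
            labels[name] = "cold"
        else:
            labels[name] = "unlabeled"
    return labels


# Placeholder ORF names and rates, for illustration only.
example_rates = {"orf_a": 3.2, "orf_b": 0.4, "orf_c": 1.1}
example_labels = label_orfs(example_rates, hot_cutoff=2.0, cold_cutoff=0.5)
```

Leaving intermediate ORFs unlabeled keeps the two classes well separated, which is one plausible reason to use two cutoffs rather than a single threshold.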
Summary
In Saccharomyces cerevisiae, meiotic recombination is initiated by double-strand DNA breaks (DSBs) [1,2]. Genomic regions in which meiotic DSBs occur at relatively high frequencies are called hotspots; conversely, regions with relatively low frequencies are called coldspots. Although observations concerning individual hotspots and coldspots have given clues as to the mechanism of recombination initiation, the ability to predict hotspots and coldspots from DNA sequence information remains very limited [2]. Several global mapping studies have mapped DSB sites on yeast chromosomes to determine whether they share common DNA sequences and/or structural elements [2,3,4,5]. In yeast, there is a significant correlation between codon usage bias and recombination rate. A similar phenomenon has been observed in other organisms, such as Drosophila melanogaster, mouse and human, and may be explained by biased gene conversion during meiosis and/or Hill-Robertson interference [6,7,8,9,10,11].