Abstract

Helitrons, a class of eukaryotic transposable elements (TEs), were discovered in various genomes 18 years ago. In the Caenorhabditis elegans (C. elegans) genome, helitron sequences vary widely in size, from 11 to 8965 base pairs (bp). These TEs are not uniformly dispersed, and they can mobilize within a genome by a rolling-circle mechanism. This ability to move and replicate within genomes enables these elements to play a major role in genomic evolution. To follow this evolution, we predicted helitron families (10 classes) in the C. elegans genome by combining features extracted from signals that correspond to the DNA sequences with a Support Vector Machine (SVM) classifier. In our classification system, these signal-derived features proved effective for automatically predicting helitron sequences. The Gaussian radial kernel, evaluated with 100-fold cross-validation, gave the best accuracy rates, ranging from 68% to 97% with an overall mean score of 83.7%, and we identified the Helitron Y1A class with an accuracy rate of 100% for specific values of c and gamma. In addition, other notable helitrons (NDNAX2, NDNAX3, and Helitron_Y2) were predicted with promising accuracy rates.
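The building blocks named in the abstract (a signal representation of DNA, a feature vector, the Gaussian radial kernel, and cross-validation folds) can be illustrated with a minimal Python sketch. The abstract does not say which signal coding or which features were used, so the binary indicator coding, the composition features, and every function name below are assumptions chosen only for illustration. SVM training itself, where the parameter c enters, is not shown.

```python
import math


def indicator_signals(seq):
    """Map a DNA string to four binary indicator signals, one per nucleotide.

    Assumption: the abstract does not specify the signal coding; binary
    indicator sequences are used here purely as an example.
    """
    seq = seq.upper()
    return {base: [1 if ch == base else 0 for ch in seq] for base in "ACGT"}


def composition_features(seq):
    """Hypothetical feature vector: the mean of each indicator signal,
    i.e. the fraction of A, C, G and T in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    signals = indicator_signals(seq)
    return [sum(signals[base]) / len(seq) for base in "ACGT"]


def rbf_kernel(x, y, gamma):
    """Gaussian radial kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k interleaved folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


if __name__ == "__main__":
    # Toy sequences, not real helitron data.
    a = composition_features("AACGTTAA")
    b = composition_features("GGCCGTGC")
    for gamma in (0.1, 1.0, 10.0):
        print(gamma, rbf_kernel(a, b, gamma))
```

A larger gamma makes the kernel value fall off faster with distance between feature vectors, which is why gamma is usually tuned together with c over the cross-validation folds.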

