Abstract

Background: It is now well established that eukaryotic coding genes can produce more than one type of transcript through the mechanisms of alternative splicing and alternative transcription. Because gold-standard real data on alternative splicing are lacking, simulated data are a good option for evaluating the accuracy and efficiency of methods developed for splice-aware sequence analysis. However, existing sequence evolution simulation methods do not model alternative splicing, so they cannot be used to test spliced sequence analysis methods.

Results: We propose a new method called SimSpliceEvol for simulating the evolution of sets of alternative transcripts along the branches of an input gene tree. In addition to traditional sequence evolution events, the simulation includes gene exon-intron structure evolution events and alternative splicing events that modify the sets of transcripts produced from genes. SimSpliceEvol was implemented in Python. The source code is freely available at https://github.com/UdeS-CoBIUS/SimSpliceEvol.

Conclusions: Data generated using SimSpliceEvol are useful for testing spliced RNA sequence analysis methods, such as methods for spliced alignment of cDNA and genomic sequences, multiple cDNA alignment, orthologous exon identification, splicing orthology inference, and transcript phylogeny inference, all of which require knowing the true evolutionary relationships between the sequences.

Highlights

  • It is well established that eukaryotic coding genes have the ability to produce more than one type of transcript thanks to the mechanisms of alternative splicing and alternative transcription

  • SimSpliceEvol simulates events acting on the evolution of the exon-intron structure of genes and alternative splicing events acting on the sets of transcripts produced from genes, in addition to traditional sequence substitution and indel events

  • SimSpliceEvol takes as input a guide gene tree whose branch lengths represent the number of substitutions per site, and generates a set of gene sequences representing a gene family, together with the exon-intron structures of the genes and the sets of complementary DNA (cDNA) sequences associated with their alternative transcripts
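
To make the role of the guide tree concrete, the following is a minimal sketch of how a sequence can be evolved along a tree whose branch lengths are expected substitutions per site. It is not SimSpliceEvol's code or model: the tree layout, the names `TREE`, `jc69_change_prob`, `evolve`, and `simulate`, and the choice of the Jukes-Cantor substitution model are all assumptions made for illustration. It covers substitutions only and leaves out indels, exon-intron structure evolution, and alternative splicing events.

```python
import math
import random

# Hypothetical toy gene tree: (name, branch_length, children).
# Branch lengths are expected substitutions per site.
TREE = ("root", 0.0, [
    ("anc", 0.1, [
        ("geneA", 0.05, []),
        ("geneB", 0.2, []),
    ]),
    ("geneC", 0.3, []),
])


def jc69_change_prob(t):
    """Probability that a site differs after a branch of length t
    under the Jukes-Cantor model (an assumed model for this sketch)."""
    return 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))


def evolve(seq, t, rng):
    """Return a copy of seq with substitutions applied along one branch."""
    p = jc69_change_prob(t)
    out = []
    for base in seq:
        if rng.random() < p:
            # Replace the base with one of the three other nucleotides.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(node, parent_seq, rng, leaves):
    """Evolve parent_seq down the tree and collect leaf sequences by name."""
    name, length, children = node
    seq = evolve(parent_seq, length, rng)
    if not children:
        leaves[name] = seq
    for child in children:
        simulate(child, seq, rng, leaves)
    return leaves


rng = random.Random(42)  # fixed seed so the run is reproducible
root_seq = "".join(rng.choice("ACGT") for _ in range(60))
leaves = simulate(TREE, root_seq, rng, {})
for name, seq in leaves.items():
    print(name, seq)
```

Each leaf sequence stands for one gene of the simulated family. A simulator of the kind described above would additionally track exon and intron boundaries and the set of transcripts at every node of the tree.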


Summary

Results

We consider additional criteria related to the simulation of exon-intron structure and alternative splicing. ISGv2, INDELible, Phylosim, and SimSpliceEvol allow the generation of the exon-intron structure of genes, but only the last two allow the generation of splice sites at the extremities of introns. Among these four methods, only SimSpliceEvol allows this structure, and the resulting set of alternative transcripts, to evolve. However, unlike Phylosim, SimSpliceEvol does not include an evolution model for splice sites. For the current first version of the method, we chose to focus on developing models for the evolution of the exon-intron structure and the set of alternative transcripts.

