FRAGS: estimation of coding sequence substitution rates from fragmentary data

Estienne C Swart, Winston A Hide, Cathal Seoighe

doi:10.1186/1471-2105-5-8

Abstract

Background: Rates of substitution in protein-coding sequences can provide important insights into evolutionary processes that are of biomedical and theoretical interest. Increased availability of coding sequence data has enabled researchers to estimate more accurately the coding sequence divergence of pairs of organisms. However, the use of different data sources, alignment protocols and methods to estimate substitution rates leads to widely varying estimates of key parameters that define the coding sequence divergence of orthologous genes. Although complete genome sequence data are not available for all organisms, fragmentary sequence data can provide accurate estimates of substitution rates provided that an appropriate and consistent methodology is used and that differences in the estimates obtainable from different data sources are taken into account.

Results: We have developed FRAGS, an application framework that uses existing, freely available software components to construct in-frame alignments and estimate coding substitution rates from fragmentary sequence data. Coding sequence substitution estimates for human and chimpanzee sequences, generated by FRAGS, reveal that methodological differences can give rise to significantly different estimates of important substitution parameters. The estimated substitution rates were also used to infer upper bounds on the amount of sequencing error in the datasets that we have analysed.

Conclusion: We have developed a system that performs robust estimation of substitution rates for orthologous sequences from a pair of organisms. Our system can be used when fragmentary genomic or transcript data are available from one of the organisms and the other is a completely sequenced genome within the Ensembl database. As well as estimating substitution statistics, our system enables the user to manage and query alignment and substitution data.

Highlights

Rates of substitution in protein-coding sequences can provide important insights into evolutionary processes that are of biomedical and theoretical interest
A significant excess in the rate of non-synonymous substitution (Ka) compared to the rate of nearly neutral synonymous substitution (Ks) is widely used as evidence that a sequence has evolved under positive selective pressure [1]
In the case of genes not evolving under positive selection the relative rates of non-synonymous to synonymous substitutions
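
To make the distinction between synonymous and non-synonymous changes concrete, the following is a minimal sketch, not the FRAGS implementation (which, per the abstract, relies on existing software components). It assumes two coding sequences that are already aligned in frame and uses the standard genetic code. The function name `count_codon_differences` is hypothetical. It returns raw counts of single-base codon differences only; proper Ka and Ks estimates additionally normalise by the number of synonymous and non-synonymous sites and correct for multiple substitutions.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, non_synonymous) counts of single-base codon differences.

    Hypothetical helper for illustration: both inputs must be in-frame aligned
    coding sequences of equal length. Codons containing gaps or ambiguous bases
    are skipped, as are codon pairs that differ at more than one position.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned in frame and of equal length")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base in this codon
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff != 1:
            continue  # identical codons, or multiple changes with an ambiguous path
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


if __name__ == "__main__":
    # Toy aligned fragments (invented for illustration, not real data).
    print(count_codon_differences("ATGGCTAAA---TTT", "ATGGCCAGA---TTA"))
```

In the toy input, GCT/GCC both encode alanine (synonymous), while AAA/AGA and TTT/TTA change the encoded amino acid (non-synonymous); the gapped codon is ignored.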

Journal: BMC Bioinformatics
Publication Date: Jan 1, 2004
Citations: 36
License type: cc-by
