Abstract

Background: Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. In order to effectively transfer this information to species lacking sequence information, comparative genomic tools need to be developed. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. To identify potential anchor marker sequences more efficiently, we have established an automated bioinformatic pipeline that combines multi-species Expressed Sequence Tags (EST) and genome sequence data.

Results: Taking advantage of sequence data from related species, the pipeline identifies evolutionarily conserved sequences that are likely to define unique orthologous loci in most species of the same phylogenetic clade. The key features are the identification of evolutionarily conserved sequences followed by automated design of intron-flanking Polymerase Chain Reaction (PCR) primer pairs. Polymorphisms can subsequently be identified by size- or sequence variation of PCR products, amplified from mapping parents or populations. We illustrate our procedure in legumes and grasses and exemplify its application in legumes, where model plant studies and the genome- and EST-sequence data available have a potential impact on the breeding of crop species and on our understanding of the evolution of this large and diverse family.

Conclusion: We provide a database of 459 candidate anchor loci which have the potential to serve as map anchors in more than 18,000 legume species, a number of which are of agricultural importance. For grasses, the database contains 1335 candidate anchor loci. Based on this database, we have evaluated 76 candidate anchor loci with respect to marker development in legume species with no sequence information available, demonstrating the validity of this approach.

Highlights

  • Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants

  • The algorithm employed by the CATS proposing pipeline is best illustrated as a series of consecutive comparative selection filters followed by automated primer-design using PriFi (Figure 1)

  • Parameters that we considered include the amount of Expressed Sequence Tags (EST) information and their phylogenetic relationship
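
The idea of "a series of consecutive comparative selection filters" can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` record, its fields, the three filter functions, and their thresholds are hypothetical and are not taken from the actual pipeline, whose filter criteria are not given in this summary. The primer-design step performed by PriFi is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one conserved sequence cluster."""
    name: str
    species_with_est: int  # species contributing at least one EST (assumed field)
    genome_hits: int       # matches in the reference genome (assumed field)
    has_intron: bool       # genomic match spans an intron (assumed field)


Filter = Callable[[Candidate], bool]


def run_filters(candidates: Iterable[Candidate], filters: List[Filter]) -> List[Candidate]:
    """Apply each filter in turn; only survivors reach the next filter."""
    survivors = list(candidates)
    for keep in filters:
        survivors = [c for c in survivors if keep(c)]
    return survivors


# Hypothetical filters, ordered as consecutive selection steps.
def is_conserved(c: Candidate) -> bool:
    # Assumed threshold: EST evidence from at least two species.
    return c.species_with_est >= 2


def is_single_copy(c: Candidate) -> bool:
    # Assumed proxy for a "unique locus": exactly one genomic match.
    return c.genome_hits == 1


def spans_intron(c: Candidate) -> bool:
    # Needed so that intron-flanking primers could be designed later.
    return c.has_intron


FILTERS: List[Filter] = [is_conserved, is_single_copy, spans_intron]


def propose_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Return the candidates that pass every filter, in input order."""
    return run_filters(candidates, FILTERS)


if __name__ == "__main__":
    demo = [
        Candidate("locus_a", species_with_est=3, genome_hits=1, has_intron=True),
        Candidate("locus_b", species_with_est=1, genome_hits=1, has_intron=True),
        Candidate("locus_c", species_with_est=3, genome_hits=2, has_intron=True),
    ]
    print([c.name for c in propose_candidates(demo)])
```

Keeping each criterion as a separate function makes the order of the filters explicit and lets a single criterion be swapped or tightened without touching the others.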

Summary

Introduction

Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. Comparing genomic sequences of genetic models such as Arabidopsis [4] and rice [5,6] with large collections of ESTs from related plants enables the identification of shared loci instrumental in projecting the large and repetitive genomes of many crop species onto the genomes of the model species. Members of a duplicated gene pair are retained or deleted at random in the two duplicated regions, obscuring their common past. This process results in diminished congruency between two genomes that are separated by a polyploidization-diploidization cycle. In order to avoid the pitfalls of comparative genome mapping, the species to be compared should be carefully chosen.

