Abstract

Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to the number needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the time, money, infrastructure, and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data from a Katahdin sheep to its own species' reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site. From these reads, 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding repeat regions and sex chromosomes, nearly 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1, with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping, could be used for comparative genomics, disease association studies, and ultimately to understand mammalian gene function.
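The percentages quoted in the abstract can be turned into approximate site counts. The following is a minimal sketch of that arithmetic, not part of the authors' analysis. It uses only the rounded figures reported above and assumes that the 80% corroboration rate applies to the 41% of heterozygous sites that mapped to Oar3.1. The function and constant names are hypothetical.

```python
# Rounded figures as quoted in the abstract.
TOTAL_VARIANTS = 83_000_000          # variants called against bovine UMD3.1
HOMOZYGOUS_VARIANTS = 78_000_000     # likely interspecies differences
HET_SITES_VS_UMD31 = 3_700_000       # "nearly 3.7 million" heterozygous sites
FRACTION_MAPPED_TO_OAR31 = 0.41      # mapped to orthologous ovine positions
FRACTION_CORROBORATED = 0.80         # assumed to be a share of the mapped sites


def derived_counts():
    """Return approximate counts implied by the abstract's percentages."""
    homozygous_share = HOMOZYGOUS_VARIANTS / TOTAL_VARIANTS
    mapped_sites = round(HET_SITES_VS_UMD31 * FRACTION_MAPPED_TO_OAR31)
    corroborated_sites = round(mapped_sites * FRACTION_CORROBORATED)
    return {
        "homozygous_share": homozygous_share,
        "het_sites_mapped_to_oar31": mapped_sites,
        "het_sites_corroborated": corroborated_sites,
    }


if __name__ == "__main__":
    for name, value in derived_counts().items():
        print(f"{name}: {value}")
```

Under these assumptions, roughly 1.5 million heterozygous sites map to Oar3.1 and about 1.2 million of them are corroborated. Because the inputs are rounded, the outputs are estimates only.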

Highlights

  • As the price per base for next generation sequencing continues to fall, sequencing projects that are broad in scope become possible for research groups with modest budgets

  • The average read length and insert size were 100 and 500 bp, respectively, with 95.4% of the reads meeting the Q20 quality score. These reads were mapped to the sheep and cattle reference genomes, Oar3.1 and UMD3.1, respectively

  • This work demonstrates that whole genome shotgun (WGS) sequence data from one ruminant species can be mapped to a mature reference genome from another ruminant species that diverged 15 to 30 million years ago, for the purpose of identifying both inter- and intraspecies variation in highly conserved genomic regions
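The Q20 threshold in the highlights refers to the Phred quality scale, on which a score Q corresponds to an error probability of 10^(-Q/10), so Q20 means a 1 in 100 chance that a base call is wrong. The sketch below illustrates that conversion. The source does not say how a read was judged to "meet" Q20, so the mean-quality rule and the Phred+33 FASTQ encoding used here are assumptions, and the function names are hypothetical.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def mean_phred(quality_string: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (assumes Phred+33 encoding)."""
    scores = [ord(char) - offset for char in quality_string]
    return sum(scores) / len(scores)


def read_meets_q20(quality_string: str) -> bool:
    """Assumed rule: a read passes if its mean Phred score is at least 20."""
    return mean_phred(quality_string) >= 20


if __name__ == "__main__":
    print(phred_to_error_probability(20))  # error probability at Q20
    print(read_meets_q20("IIII"))          # 'I' encodes Q40 under Phred+33
    print(read_meets_q20("++++"))          # '+' encodes Q10 under Phred+33
```

Other definitions, such as requiring every base in a read to reach Q20, would give a different pass rate for the same data.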


Summary

Introduction

As the price per base for next generation sequencing continues to fall, sequencing projects that are broad in scope become possible for research groups with modest budgets. Research tools and approaches that once required large consortia[1,2,3] may now be used by small groups of collaborators or even independent labs. Although high throughput technology has been democratized, formidable impediments remain that prevent researchers whose work is not in humans, human model species, or agriculturally important species from realizing its benefits. Sequence data, once produced, is mapped to a reference genome for the species under investigation. Among the even-toed ungulates, only cattle, swine, sheep, goat, water buffalo, and bison have annotated reference genomes. For the other species, a reference genome has not been built and will likely not be built for another decade or more.
