Abstract

Background: Relatively little information is available for sequence variation in the pig. We previously used a combination of short read (25 base pair) high-throughput sequencing and reduced genomic representation to discover > 60,000 single nucleotide polymorphisms (SNP) in cattle, but the current lack of complete genome sequence limits this approach in swine. Longer-read pyrosequencing-based technologies have the potential to overcome this limitation by providing sufficient flanking sequence information for assay design. Swine SNP were discovered in the present study using a reduced representation of 450 base pair (bp) porcine genomic fragments (approximately 4% of the swine genome) prepared from a pool of 26 animals relevant to current pork production, and a GS-FLX instrument producing 240 bp reads.

Results: Approximately 5 million sequence reads were collected and assembled into contigs having an overall observed depth of 7.65-fold coverage. The approximate minor allele frequency was estimated from the number of observations of the alternate alleles. The average coverage at the SNPs was 12.6-fold. This approach identified 115,572 SNPs in 47,830 contigs. Comparison to partial swine genome draft sequence indicated 49,879 SNP (43%) and 22,045 contigs (46%) mapped to a position on a sequenced pig chromosome and the distribution was essentially random. A sample of 176 putative SNPs was examined and 168 (95.5%) were confirmed to have segregating alleles; the correlation of the observed minor allele frequency (MAF) to that predicted from the sequence data was 0.58.

Conclusion: The process was an efficient means to identify a large number of porcine SNP having high validation rate to be used in an ongoing international collaboration to produce a highly parallel genotyping assay for swine.
By using a conservative approach, a robust group of SNPs was detected with greater confidence and relatively high MAF that should be suitable for genotyping in a wide variety of commercial populations.
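
The abstract states that the approximate minor allele frequency was estimated from the number of observations of the alternate alleles, and it reports several proportions. The sketch below illustrates that arithmetic under stated assumptions: the function names `estimate_maf` and `percent` are hypothetical, the read counts in the usage lines are made up for illustration, and taking the less frequent allele's share of reads as the MAF is an assumed simplification, not necessarily the authors' exact procedure.

```python
def estimate_maf(ref_reads: int, alt_reads: int) -> float:
    """Approximate minor allele frequency at one SNP from pooled read counts.

    Assumption: each read is treated as one independent observation of an
    allele, so the rarer allele's share of reads approximates the MAF.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads observed at this position")
    return min(ref_reads, alt_reads) / total


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Hypothetical SNP covered by 12 reads, 3 of which carry the alternate allele.
example_maf = estimate_maf(ref_reads=9, alt_reads=3)

# Proportions recomputed from the counts reported in the abstract.
validation_rate = percent(168, 176)        # confirmed SNPs out of those tested
mapped_snps = percent(49_879, 115_572)     # SNPs mapped to a pig chromosome
mapped_contigs = percent(22_045, 47_830)   # contigs mapped to a pig chromosome
```

With about 12.6-fold average coverage at the SNPs, an estimate of this kind rests on only a dozen or so observations per site, which is consistent with the moderate correlation (0.58) between predicted and observed MAF reported above.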

Highlights

  • Little information is available for sequence variation in the pig

  • While many single nucleotide polymorphisms (SNP) will surely be discovered by genome sequencing, because of low sequence coverage the conversion rate of putative SNP and their minor allele frequency won't be known until tested across populations

  • Reduced Representation Library Selection: Six enzymes were screened for suitability for reduced representation library (RRL) construction, with the goal of minimizing repetitive content in the target size range of 450 bp


Summary

Introduction

We previously used a combination of short read (25 base pair) high-throughput sequencing and reduced genomic representation to discover > 60,000 single nucleotide polymorphisms (SNP) in cattle, but the current lack of complete genome sequence limits this approach in swine. The identification of genes and mutations that lead to genetic variation in complex, economically important traits in livestock has been hindered by the lack of genomic sequence, adequate map density and effective platforms for high-density genotyping. The availability of livestock genome sequence, a high density of markers, and cost-effective SNP genotyping will allow genome-wide association studies in swine. In order to identify large numbers of randomly distributed SNPs for swine, we chose to construct a reduced representation library (RRL) to reduce the complexity of the genome and to use massively parallel second-generation sequencing to identify large numbers of high-confidence SNPs for high-density genotyping on a cost-effective platform.

