Abstract
The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not present data in a form conducive to protein variant analysis, and tend to underrepresent the breadth of genetic diversity in U.S. beef cattle. Thus, our first aim was to use 96 beef sires, sharing minimal pedigree relationships, to create a searchable and publicly viewable set of mapped genomes relevant for 19 popular breeds of U.S. cattle. Our second aim was to identify protein variants encoded by the bovine endothelial PAS domain-containing protein 1 gene (EPAS1), a gene associated with high-altitude pulmonary hypertension in Angus cattle. The identity and quality of genomic sequences were verified by comparing WGS genotypes to those derived from other methods. The average read depth, genotype scoring rate, and genotype accuracy exceeded 14, 99%, and 99%, respectively. The 96 genomes were used to discover four amino acid variants encoded by EPAS1 (E270Q, P362L, A671G, and L701F) and confirm two variants previously associated with disease (A606T and G610S). The six EPAS1 missense mutations were verified with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry assays, and their frequencies were estimated in a separate collection of 1154 U.S. cattle representing 46 breeds. A rooted phylogenetic tree of eight polypeptide sequences provided a framework for evaluating the likely order of mutations and potential impact of EPAS1 alleles on the adaptive response to chronic hypoxia in U.S. cattle. This public, whole genome resource facilitates in silico identification of protein variants in diverse types of U.S. beef cattle, and provides a means of translating WGS data into a practical biological and evolutionary context for generating and testing hypotheses.
Highlights
The number of disease-causing mutations discovered in bovine genes has increased with the advancement of next-generation sequencing, reference genomes, and high-density single nucleotide polymorphism (SNP) arrays
Archival DNA was used from extracts of samples that were either purchased from commercial sources that collected them for artificial insemination of cattle and not for research, purchased from individuals who collected them privately for their own purposes, or donated to the U.S. Meat Animal Research Center (USMARC) by private individuals who collected them privately for their own purposes
Despite the modest power within breed, sequencing the entire panel significantly increased the chances of detecting relatively rare variants segregating in U.S. beef cattle
Summary
The number of disease-causing mutations discovered in bovine genes has increased with the advancement of next-generation sequencing, reference genomes, and high-density single nucleotide polymorphism (SNP) arrays. There are currently 114 Mendelian traits with known causal mutations in 117 cattle genes[1]. The majority of these mutations cause disease by changing the gene's protein sequence and thereby altering its normal function. In 2006, when a K211 mutation was discovered in the prion gene of a U.S. cow that developed atypical BSE, there was a pressing need to search for this potentially pathogenic allele in other U.S. cattle[2,3]. In principle, these searches can be accomplished in silico with access to population-scale gene sequence data. I speak from experience in designing diagnostic probes for the IDB cattle chip used in Ireland.
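As a rough illustration of what an in silico search for a missense allele involves, the sketch below parses a variant label of the form used in the abstract (for example A606T: reference residue, 1-based position, variant residue) and reports which animals carry the variant residue. It is a minimal sketch under stated assumptions, not the procedure used for the 96 genomes: the function names and the toy sequences are hypothetical, and each animal is reduced to a single translated protein sequence, whereas real cattle are diploid and would need both haplotypes checked.

```python
import re
from typing import Dict, List, NamedTuple


class MissenseVariant(NamedTuple):
    ref: str  # reference amino acid, one-letter code
    pos: int  # 1-based residue position in the protein
    alt: str  # variant amino acid, one-letter code


def parse_variant(label: str) -> MissenseVariant:
    """Parse a label such as 'A606T' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a missense label: {label!r}")
    return MissenseVariant(match.group(1), int(match.group(2)), match.group(3))


def classify_residue(protein: str, variant: MissenseVariant) -> str:
    """Return 'ref', 'alt', or 'other' for the residue at the variant position."""
    if not 1 <= variant.pos <= len(protein):
        raise IndexError(f"position {variant.pos} is outside the protein")
    residue = protein[variant.pos - 1]  # convert 1-based position to 0-based index
    if residue == variant.ref:
        return "ref"
    if residue == variant.alt:
        return "alt"
    return "other"


def find_carriers(proteins: Dict[str, str], label: str) -> List[str]:
    """List the animals whose protein has the variant residue, sorted by name."""
    variant = parse_variant(label)
    return sorted(
        name
        for name, sequence in proteins.items()
        if classify_residue(sequence, variant) == "alt"
    )


if __name__ == "__main__":
    # Made-up five-residue peptides, purely illustrative (not EPAS1 or prion data).
    toy_proteins = {"animal_1": "MKTAE", "animal_2": "MKTGE"}
    print(find_carriers(toy_proteins, "A4G"))
```

With population-scale data, the same check would run over every animal's translated sequence for the gene of interest, which is why the availability of mapped genomes matters more than the search logic itself.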