Abstract
Background: There is considerable interest in developing high-throughput genotyping with single nucleotide polymorphisms (SNPs) for the identification of genes affecting important ecological or economical traits. SNPs are evenly distributed throughout the genome and are likely to be functionally relevant. In rainbow trout, in silico screening of EST databases represents an attractive approach for de novo SNP identification. Nevertheless, EST sequencing errors and the assembly of paralogous EST sequences can lead to the identification of false positive SNPs, which makes the reliability of EST-derived SNPs relatively low. Further validation of EST-derived SNPs is therefore required. The objective of this work was to assess the quality of, and to validate, a large number of rainbow trout EST-derived SNPs.
Results: A panel of 1,152 EST-derived SNPs was selected from the INRA Sigenae SNP database and was genotyped in standard and double haploid individuals from several populations using the Illumina GoldenGate BeadXpress assay. High-quality genotyping data were obtained for 958 SNPs, a genotyping success rate of 83.2%. Of these, 350 SNPs (36.5%) were polymorphic in at least one population and were designated as true SNPs. They also proved to be a potential tool for investigating the genetic diversity of the species, as the set of SNPs successfully sorted individuals into three main groups using STRUCTURE software. Functional annotation revealed 28 non-synonymous SNPs, four of which were predicted to affect protein function. A subset of 223 true SNPs was polymorphic in the two INRA mapping reference families and was integrated into the INRA microsatellite-based linkage map.
Conclusions: Our results represent the first study validating EST-derived SNPs in rainbow trout, a species whose genome sequence is not yet available. We designed several specific filters in order to improve the genotyping yield. Nevertheless, our selection criteria should be further improved in order to reduce the high observed rate of false positive SNPs, which results from the occurrence of whole genome duplications.
Highlights
There is considerable interest in developing high-throughput genotyping with single nucleotide polymorphisms (SNPs) for the identification of genes affecting important ecological or economical traits.
Several selection filters (Figure 1) were applied in order to select a panel of 1,152 expressed sequence tag (EST)-derived SNPs for validation: (1) to meet the probe design constraints of the Illumina genotyping platform, all SNPs with fewer than 60 nucleotides between two neighbouring SNPs and with flanking sequences shorter than 100 nucleotides were removed; (2) to overcome problems due to exon-intron junctions, the SNP flanking sequences were aligned against rainbow trout BAC-end sequences [27] using megablast and against zebrafish, medaka, and stickleback genomic sequences using blastn.
Sigenae SNP database characterization and selection of a subset for validation: The Sigenae rainbow trout EST-derived SNP database contains 31,121 putative SNPs identified in 13,374 EST contigs (Table 1).
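The probe-design filter (1) described in the highlights above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Snp` record, the function names, and the way flank length is derived from the SNP position within its contig are all assumptions made for the example. It also assumes that a SNP is removed if it fails either constraint, and that the 100-nucleotide flank is required on each side; the source states only the two thresholds.

```python
from dataclasses import dataclass

MIN_SPACING = 60   # nucleotides required between neighbouring SNPs (from the text)
MIN_FLANK = 100    # minimum flanking length; "on each side" is an assumption


@dataclass(frozen=True)
class Snp:
    """Hypothetical record for one putative SNP in an EST contig."""
    contig: str          # contig identifier (illustrative naming)
    position: int        # 1-based position of the SNP in the contig
    contig_length: int   # total length of the contig consensus


def passes_design_filter(snp, contig_snps):
    """Return True if the SNP satisfies both probe-design constraints."""
    left_flank = snp.position - 1
    right_flank = snp.contig_length - snp.position
    if left_flank < MIN_FLANK or right_flank < MIN_FLANK:
        return False
    for other in contig_snps:
        if other is snp:
            continue
        # Number of nucleotides lying strictly between the two SNPs.
        gap = abs(other.position - snp.position) - 1
        if gap < MIN_SPACING:
            return False
    return True


def select_candidates(snps):
    """Keep only SNPs that pass the filter, preserving input order."""
    by_contig = {}
    for snp in snps:
        by_contig.setdefault(snp.contig, []).append(snp)
    return [s for s in snps if passes_design_filter(s, by_contig[s.contig])]


if __name__ == "__main__":
    example = [
        Snp("contig_A", 150, 400),  # too close to the SNP at 180
        Snp("contig_A", 180, 400),  # too close to the SNP at 150
        Snp("contig_A", 290, 400),  # passes both constraints
        Snp("contig_B", 50, 300),   # left flank shorter than 100 nt
        Snp("contig_B", 150, 300),  # passes both constraints
    ]
    for snp in select_candidates(example):
        print(snp.contig, snp.position)
```

Note that in this sketch two SNPs that are too close together are both discarded, rather than one of the pair being kept; whether the original selection did the same is not stated in the text.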
Summary
There is considerable interest in developing high-throughput genotyping with single nucleotide polymorphisms (SNPs) for the identification of genes affecting important ecological or economical traits. The objective of this work was to assess the quality of, and to validate, a large number of rainbow trout EST-derived SNPs. International genome initiatives have resulted in draft genome sequences for several farm animals (cattle, pig, chicken, and horse) and for model fish species (zebrafish (Danio rerio), medaka (Oryzias latipes), stickleback (Gasterosteus aculeatus), takifugu (Takifugu rubripes), and tetraodon (Tetraodon nigroviridis)). SNPs in gene coding sequences can be either synonymous (silent polymorphism) or non-synonymous (replacement polymorphism). They are of particular interest for studying the genetics of expressed genes and for mapping functional traits. Non-synonymous SNPs can potentially have deleterious functional effects because they lead to changes in amino acid sequences and possibly affect protein structure and function [2,3].