Abstract

With the rapid decrease in sequencing cost, large volumes of genotype data have been generated for many organisms by high-throughput sequencing and used in various fields of biological research in the post-genome era. The raw sequencing data are usually deposited in the NCBI SRA database. Constructing a database to store and analyze the processed genotype data is an essential step toward making these data usable by the community. To date, a comprehensive genotype database has been lacking for maize, one of the world's important crops. We report the construction of the MaizeSNPDB database using genotype data of 1210 maize lines across 35,370,939 SNP sites, refined from a large set of genomic variations reported by the maize HapMap 3 project. We further implemented several genetic analysis programs as graphical interfaces in MaizeSNPDB, so that SNPs in user-specified genomic regions can be easily extracted and analyzed. The whole dataset and code of MaizeSNPDB are available at https://github.com/venyao/MaizeSNPDB. MaizeSNPDB is deployed at http://150.109.59.144:3838/MaizeSNPDB/ for online use. The MaizeSNPDB database is of great value to future maize functional genomics studies and can also facilitate marker-assisted breeding in maize.

Highlights

  • Genomic variation is an important force in evolution

  • The maize genome was split into non-overlapping 10-Mb genomic regions, and the genotype data at SNP sites in each region were stored as an R data file that can be loaded efficiently into memory using the R programming language [13]

  • The dataset stored in the MaizeSNPDB database is the most comprehensive genomic variation dataset for maize to date
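
The 10-Mb partitioning described in the highlights can be illustrated with a minimal sketch. It is written in Python purely for illustration (the database itself stores R data files), and it assumes 1-based inclusive coordinates. The function names and the file-naming scheme are hypothetical and are not taken from the MaizeSNPDB code.

```python
BIN_SIZE = 10_000_000  # 10 Mb, the region size stated in the highlights


def bins_for_region(chrom, start, end, bin_size=BIN_SIZE):
    """Return the non-overlapping bins that overlap a query region.

    Coordinates are assumed to be 1-based and inclusive.
    Each bin is returned as (chrom, bin_start, bin_end).
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    first = (start - 1) // bin_size  # index of the bin containing `start`
    last = (end - 1) // bin_size     # index of the bin containing `end`
    return [
        (chrom, i * bin_size + 1, (i + 1) * bin_size)
        for i in range(first, last + 1)
    ]


def bin_file_name(chrom, bin_start, bin_end):
    """Build a file name for one bin (hypothetical naming scheme)."""
    return f"{chrom}_{bin_start}_{bin_end}.RData"


if __name__ == "__main__":
    # A query spanning the boundary between the first and second bins.
    for b in bins_for_region("chr1", 9_500_000, 10_500_000):
        print(bin_file_name(*b))
```

Under this scheme, a query for a user-specified region only needs to load the few files whose bins overlap it, rather than the genotype data for the whole genome.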


Summary

Introduction

Genomic variation is an important force in evolution. Single-nucleotide polymorphisms (SNPs) are the predominant type of genomic variation and are used in various types of biological studies, including genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping, genetic prediction, population genomics studies, and marker-assisted breeding. SNP data have been reported for a growing number of organisms as sequencing costs fall with the rapid development of next-generation sequencing technology [2,3,4,5]. Efficient storage of SNPs in databases equipped with analysis tools has proved helpful to functional genomics studies of many organisms. Numerous databases have been constructed to store and analyze SNP data of various organisms [6,7,8].

