Abstract

Background: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, and many accessions have genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to non-computer-savvy rice researchers.

Findings: The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer include tools for single-nucleotide polymorphism assay design, genome-wide association studies, population diversity analysis, rice-bacterial pathogen diagnostics, and a suite of published genomic prediction methods. A prototype Rice Galaxy compliant with Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented.

Conclusions: Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.

Highlights

  • Rice molecular genetics, breeding, genetic diversity, and allied research have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery

  • Analysis of such datasets is a challenge to rice researchers owing to (i) the fairly large data matrices and compute-intensive algorithms, which require specialized computing infrastructure, and (ii) the relative difficulty of using open source/free software tools for analysis, which are commonly provided without a graphical user interface and require proper installation in a Linux operating system environment

  • There are other freely available web-based bioinformatics and breeding informatics software tools, optimized for plant species other than rice, including Araport [13] for Arabidopsis, Cassavabase [14] for cassava, and The Triticeae Toolbox (T3 [15]) for wheat and barley. While these tools are very useful, they are species/crop-specific and custom-built for the specialized requirements of their respective communities, making adoption in rice challenging for ≥2 reasons: (i) the need to produce curated rice datasets that work seamlessly with the software system, and (ii) the need for a dedicated software development team to customize the application for rice-specific data and analyses


Summary

Background

With the decreasing cost of genome sequencing, rice molecular geneticists, breeders, and diversity researchers are increasingly adopting genotyping technologies as routine components in their workflows, generating large datasets of genotyping and genome sequence information. There are other freely available web-based bioinformatics and breeding informatics software tools, optimized for plant species other than rice, including Araport [13] for Arabidopsis, Cassavabase [14] for cassava, and The Triticeae Toolbox (T3 [15]) for wheat and barley. While these tools are very useful, they are species/crop-specific and custom-built for the specialized requirements of their respective communities (such as project datasets), making adoption in rice challenging for ≥2 reasons: (i) the need to produce curated rice datasets that work seamlessly with the software system (e.g., genome-browser-ready data, curated genes, published quantitative trait loci from bi-parental crosses and GWAS, and markers associated with traits), and (ii) the need for a dedicated software development team to customize the application for rice-specific data and analyses. Rice Galaxy code and built-in data are provided for local/institutional deployment [20].

Discussion
Conclusion
Availability of source code and requirements
Availability of supporting data and materials