PyRice: a Python package for querying Oryza sativa databases.

Quan Do, Pierre Larmande, Ho Bich Hai

doi:10.1093/bioinformatics/btaa694

Abstract

Currently, gene information for Oryza sativa is spread across various heterogeneous online data sources. Methods of access are also diverse: mostly web-based interfaces and sometimes query APIs, which might not always be straightforward for domain experts. The challenge is to collect information quickly from these applications and combine it logically to facilitate scientific research. We developed a Python package named PyRice, a unified programming API to access all supported databases at the same time with consistent output. PyRice's design is modular and implements a smart query system, which adapts to the available computing resources to optimize query speed. As a result, PyRice is easy to use and produces intuitive results. PyRice is available at https://github.com/SouthGreenPlatform/PyRice. Supplementary data are available at Bioinformatics online.

Highlights

Rice, a model crop plant, is a major cereal grain widely consumed by a large part of the world’s human population, especially in Asia.
Information on Oryza sativa genes is published in several open-access databases that use different gene annotation models, e.g. RAPDB (Hiroaki et al, 2013) and MSU7, or dedicated IDs (i.e. SNP-SEEK (Mansueto et al, 2016) and IC4R (IC4R Project Consortium et al, 2016)).
PyRice manages a dictionary of ID mappings across databases, since each database uses one of the two systems, RAPDB or MSU7 (e.g. LOC_Os01g01010 = Os01g0100100, where the first ID is from MSU7 and the second is from RAPDB).
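
To make the ID correspondence above concrete, here is a minimal sketch of a two-way lookup between the MSU7 and RAPDB systems. It is not PyRice's own code: the function names, the regular expressions (inferred only from the single example pair in the text) and the one-entry table are assumptions for illustration, and real mappings may not be strictly one-to-one.

```python
import re

# Illustrative table holding only the pair quoted in the text above.
# A real mapping would be loaded from a file and is far larger.
MSU7_TO_RAPDB = {"LOC_Os01g01010": "Os01g0100100"}
RAPDB_TO_MSU7 = {rap: msu for msu, rap in MSU7_TO_RAPDB.items()}

# Assumed ID shapes, inferred from the example pair only.
MSU7_PATTERN = re.compile(r"^LOC_Os\d{2}g\d{5}$")
RAPDB_PATTERN = re.compile(r"^Os\d{2}g\d{7}$")


def id_system(gene_id):
    """Return "MSU7", "RAPDB", or None if the ID matches neither shape."""
    if MSU7_PATTERN.match(gene_id):
        return "MSU7"
    if RAPDB_PATTERN.match(gene_id):
        return "RAPDB"
    return None


def translate(gene_id):
    """Return the equivalent ID in the other system, or None if unknown."""
    system = id_system(gene_id)
    if system == "MSU7":
        return MSU7_TO_RAPDB.get(gene_id)
    if system == "RAPDB":
        return RAPDB_TO_MSU7.get(gene_id)
    return None
```

With such a table, a query written with either kind of ID can be routed to a database that expects the other kind by calling the hypothetical `translate` helper first.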

Summary

Introduction

Rice, a model crop plant, is a major cereal grain widely consumed by a large part of the world’s human population, especially in Asia. Many digital resources have been developed for rice genomics. Compared to human genomics, however, fewer centralized resources and analysis tools are available. Most of the information currently available is scattered and patchy. For scientists, the challenge lies in integrating data and finding useful information. In this project, we aim to build an API that solves the problem of collecting and managing information on genes and gene products from different sources. The PyRice package has been developed to run remote queries over ten databases and web applications so far. PyRice uses parallel processing to improve query speed. It indexes results for fast search and supports exporting results into different formats.
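
The query-in-parallel, merge and export pattern described above can be sketched as follows. This is a generic illustration, not PyRice's API: the source names `db_a` and `db_b`, the fetch functions and their placeholder return values are all hypothetical stand-ins for remote database clients, and a thread pool is used here as one possible way to parallelize I/O-bound requests.

```python
import json
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote database clients. Real code would
# send an HTTP request here; the returned values are placeholders.
def fetch_db_a(gene_id):
    return {"symbol": "placeholder-a"}


def fetch_db_b(gene_id):
    return {"description": "placeholder-b"}


SOURCES = {"db_a": fetch_db_a, "db_b": fetch_db_b}


def query_all(gene_id, max_workers=4):
    """Query every source concurrently and merge results per source name."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fetch, gene_id)
                   for name, fetch in SOURCES.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing source should not discard the others.
                results[name] = {"error": str(exc)}
    return {gene_id: results}


def export_json(record, path):
    """Write the merged record to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

Keeping each source's answer under its own key gives a consistent output shape regardless of how many sources respond, and the same merged record could be written to other formats by adding further export helpers.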

Journal: Bioinformatics	Publication Date: Jul 31, 2020
Citations: 1	License type: cc-by
