Abstract

pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database, and data from next-generation sequencing into an easy-to-use, memory-efficient, and fast framework, allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the full expressivity of Python, a general-purpose programming language. Its range of application therefore encompasses both short scripts and large-scale genome-wide studies.

Highlights

  • High-throughput systems biology and precision medicine applications require the integration of data from many different sources

  • A significant part of precision medicine research revolves around the identification of relevant single nucleotide polymorphisms (SNPs) and insertions/deletions (INDELs), and the study of their context[1]

  • Recent studies in proteogenomics show that replacing traditional reference databases such as UniProt[2] with customized databases that integrate the subject’s genomic polymorphisms can significantly improve the identification of peptides or proteins using mass spectrometry[3,4,5,6]

Summary

Introduction

High-throughput systems biology and precision medicine applications require the integration of data from many different sources. Recent studies in proteogenomics show that replacing traditional reference databases such as UniProt[2] with customized databases that integrate the subject’s genomic polymorphisms can significantly improve the identification of peptides or proteins using mass spectrometry[3,4,5,6]. These applications usually require the integration of reference sequences, reference genome annotations, and subject-specific SNPs and INDELs, along with an external SNP database such as dbSNP[7] for validation. More advanced users can rely on object-oriented inheritance to extend pyGeno and implement support for polymorphisms from other sources. pyGeno has been used with human and mouse genomes and should readily work with any diploid organism whose annotations are made available by Ensembl.
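
To make the integration described above concrete, the following is a minimal, self-contained sketch of the general idea: subject-specific SNPs are applied to a reference sequence only when they are confirmed by an external catalogue, and the resulting sequence is translated into a personalized protein. It does not use pyGeno and does not reproduce its API. Every name in it (SNP, PolymorphismSource, ListSource, personalize, translate), the toy sequence, and the in-memory set standing in for dbSNP are assumptions made for illustration.

```python
from dataclasses import dataclass

# Partial codon table: only the codons that occur in this toy example.
CODON_TABLE = {"ATG": "M", "GCC": "A", "GAA": "E", "GTA": "V", "AAA": "K", "TGA": "*"}


@dataclass(frozen=True)
class SNP:
    """Hypothetical record for a single-nucleotide substitution."""
    position: int  # 0-based index into the reference sequence
    ref: str       # allele expected in the reference
    alt: str       # allele observed in the subject


class PolymorphismSource:
    """Hypothetical base class; subclass it to support another data source."""

    def snps(self):
        raise NotImplementedError


class ListSource(PolymorphismSource):
    """Hypothetical subclass that serves SNPs from an in-memory list."""

    def __init__(self, records):
        self._records = list(records)

    def snps(self):
        return iter(self._records)


def personalize(reference, source, known):
    """Apply the subject's SNPs that also appear in `known` (a stand-in for dbSNP)."""
    seq = list(reference)
    for snp in source.snps():
        if snp not in known:
            continue  # unvalidated polymorphism: keep the reference allele
        if seq[snp.position] != snp.ref:
            raise ValueError(f"reference mismatch at position {snp.position}")
        seq[snp.position] = snp.alt
    return "".join(seq)


def translate(dna):
    """Translate a coding sequence codon by codon using the partial table."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Toy data: a 12-nucleotide coding sequence and two SNPs called in the subject.
reference = "ATGGCCGAATGA"
subject = ListSource([SNP(7, "A", "T"), SNP(6, "G", "A")])
known = {SNP(7, "A", "T")}  # only the first SNP is confirmed by the catalogue

personalized = personalize(reference, subject, known)
print(translate(reference), translate(personalized))
```

In this sketch, supporting a new polymorphism format amounts to writing another PolymorphismSource subclass, which mirrors in spirit the inheritance-based extension mentioned above. INDELs, diploid genotypes, and genome annotations are deliberately left out.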

