Abstract

Protein schematics are valuable for research, teaching and knowledge communication. However, automating their production with existing tools is challenging. The purpose of the drawProteins package is to enable the automated generation of protein schematics in a way that integrates with the Bioconductor/R suite of tools for bioinformatics and statistical analysis. Given UniProt accession numbers, the package uses the UniProt API to retrieve the features of each protein from the UniProt database. The features are assembled into a data frame and visualized using adaptations of the ggplot2 package. Visualizations can be customized in many ways, including adding protein feature information from other data frames, altering colors and protein names, and adding extra layers using other ggplot2 functions. All of this can be done within a script, which makes the workflow reproducible and shareable.

Introduction

Protein schematics are abundant in research papers, reviews, textbooks and on the internet[1]. For visualization on the internet, there is the BioJS solution, which can be used for proteins[3]. Both of these tools are useful but not integrated into the Bioconductor workflow. The focus on genomic data in existing tools limits their usefulness for drawing protein schematics, particularly those illustrating multiple proteins and protein families. For these reasons, a protein visualization package was produced using R to allow compatibility with the Bioconductor suite of bioinformatics packages. Colors can be altered and protein names (labels) can be changed. All of this can be done in a scripted manner that facilitates code sharing, visualization reproducibility and good practice in scientific computing[8].
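
The data-assembly step of the workflow summarized in the abstract (features retrieved per accession number, then flattened into one table with a row per feature) can be sketched as follows. This is an illustration only: drawProteins itself is an R package, whereas the sketch is written in Python and runs on hand-written sample data rather than a real UniProt response. The field names, the placeholder accessions and the helper names `features_to_rows` and `select_type` are assumptions made for the example, not the package's API or the UniProt schema.

```python
from typing import Any, Dict, List

# Hand-written stand-in for a parsed feature response (NOT real UniProt data).
# The field names and the accessions "EXAMPLE1"/"EXAMPLE2" are assumptions.
SAMPLE_RESPONSE: List[Dict[str, Any]] = [
    {
        "accession": "EXAMPLE1",
        "features": [
            {"type": "CHAIN", "description": "Example protein 1", "begin": 1, "end": 300},
            {"type": "DOMAIN", "description": "Example domain", "begin": 20, "end": 120},
        ],
    },
    {
        "accession": "EXAMPLE2",
        "features": [
            {"type": "CHAIN", "description": "Example protein 2", "begin": 1, "end": 150},
        ],
    },
]


def features_to_rows(response: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten per-protein feature lists into one row per feature."""
    rows = []
    # "order" numbers the proteins so each can later be drawn on its own line.
    for order, protein in enumerate(response, start=1):
        for feature in protein["features"]:
            begin = int(feature["begin"])
            end = int(feature["end"])
            rows.append(
                {
                    "accession": protein["accession"],
                    "order": order,
                    "type": feature["type"],
                    "description": feature["description"],
                    "begin": begin,
                    "end": end,
                    "length": end - begin + 1,  # inclusive residue count
                }
            )
    return rows


def select_type(rows: List[Dict[str, Any]], feature_type: str) -> List[Dict[str, Any]]:
    """Return only the rows of one feature type, e.g. the data for one plot layer."""
    return [row for row in rows if row["type"] == feature_type]


if __name__ == "__main__":
    table = features_to_rows(SAMPLE_RESPONSE)
    for row in select_type(table, "CHAIN"):
        print(row["order"], row["accession"], row["begin"], row["end"])
```

A table of this shape, with one row per feature and a per-protein `order` value, lends itself to layered plotting: each feature type can be filtered out and drawn as its own layer on top of the protein chains.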
