Abstract

Summary

UniProtKB is a publicly accessible database of annotated protein features for numerous organisms; however, globally extracting protein entry information for data visualization and categorization can be challenging. While the UniProtKB entry syntax maintains database consistency, it simultaneously obscures key terms within long character strings. To increase accessibility, UniProtExtractR, available as both an app and an R package, extracts desired information across nine UniProtKB categories: DNA binding, Pathway, Transmembrane, Signal peptide, Protein families, Domain [FT], Motif, Involvement in disease, and Subcellular location [CC]. The app features interactive frequency tables that globally summarize both the original UniProtKB input query and the extracted/changed entry values. Moreover, UniProtExtractR includes a tractable mapping algorithm to define custom organelle-level resolution. UniProtExtractR is available as a freely accessible Shiny app that requires no coding experience and as an R package whose code is entirely open source.

Availability and implementation

UniProtExtractR source code and user manual, including example files and troubleshooting, are available at https://github.com/alex-bio/UniProtExtractR. The Shiny app is hosted at https://harperlab.connect.hms.harvard.edu/uniprotextractR.
