Abstract

ProfileGrids allow easy visualization of very large protein multiple sequence alignments (MSAs). Conserved and, especially, variable regions are represented as a matrix that is color‐coded according to the frequency of each residue at each column position. While databases of protein families exist (such as Pfam), there are few curated repositories of user‐generated MSAs, possibly because of the lack of paradigms for visualizing large MSAs. Here we present progress toward building a database of protein family ProfileGrids, which we call the Nanoanatomy Museum. Our initial dataset was the pre‐calculated MSAs of the largest protein families from the Pfam database (ranging up to 160,000+ homologs). We describe our high‐throughput method for calculating ProfileGrids with the new JProfileGrid v2.0 software, which allows rapid and automated generation of ProfileGrids through algorithm optimization, a command‐line interface, and a new PNG image file output format. The final database will be a proof of principle for how established databases can incorporate ProfileGrids into the standard description of protein families, thus replacing other visualizations such as sequence logos.
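
The matrix underlying a ProfileGrid can be illustrated with a minimal Python sketch. This is not the JProfileGrid implementation; the function name `profile_grid`, the residue alphabet, and the toy alignment are assumptions made purely for illustration. The sketch computes, for every column of an aligned set of sequences, the fraction of sequences carrying each residue, which is the quantity a ProfileGrid color‐codes.

```python
from collections import Counter

# Assumed alphabet: the 20 standard amino acids plus the gap character.
RESIDUES = "ACDEFGHIKLMNPQRSTVWY-"


def profile_grid(sequences):
    """Return {residue: [frequency per column]} for an aligned sequence list.

    Hypothetical helper for illustration only; all sequences must already
    be aligned to the same length.
    """
    if not sequences:
        raise ValueError("alignment is empty")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    total = len(sequences)
    # One row per residue, one entry per alignment column.
    grid = {res: [0.0] * length for res in RESIDUES}
    for col in range(length):
        counts = Counter(seq[col].upper() for seq in sequences)
        for res, count in counts.items():
            if res in grid:  # symbols outside the assumed alphabet are skipped
                grid[res][col] = count / total
    return grid


# Toy alignment of four sequences and three columns (made-up data).
toy_alignment = ["MKV", "MRV", "MKI", "M-V"]
grid = profile_grid(toy_alignment)
print(grid["M"])  # column 1 is fully conserved
print(grid["K"])  # column 2 is variable
```

In a rendered grid, each cell would then be shaded according to its frequency value, so a fully conserved column shows one strongly colored cell while a variable column spreads lighter shading across several rows.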
