Abstract
Many studies have used position-specific scoring matrix (PSSM) profiles to characterize residues in protein structures and to predict a broad range of protein features. Moreover, PSSM profiles of Protein Data Bank (PDB) entries have been recalculated in many works for different purposes. Although the computational cost of calculating a single PSSM profile is affordable, many statistical studies and machine learning-based methods use thousands of profiles to achieve their goals, which substantially increases the computational cost. In this work we present a new database compiling PSSM profiles for the proteins of the PDB. Currently, the database contains 333,532 protein chain profiles involving 123,135 different PDB entries.
Highlights
Position-specific scoring matrices (PSSMs) have been used in many works to compute and predict a broad range of protein features
In this work we present 3DCONS-DB, a database of PSSM profiles computed over protein sequences collected from the Protein Data Bank (PDB) [19]
Our analysis shows that non-domain regions seem functionally relevant and that the amount of information encoded in their PSSM profiles is around 80% of the information encoded in domain regions
Summary
Position-specific scoring matrices (PSSMs) have been used in many works to compute and predict a broad range of protein features. PSSM profiles have been used to predict residue solvent accessibility [1], protein secondary structure [2], residue-residue contact maps [3], protein disordered regions [4], protein binding sites [5], protein-DNA interactions [6] and protein-protein interface hotspots [7]. Although these works used different prediction algorithms and methodologies, they share a common procedure that can be found in many other publications: a machine learning algorithm fed with PSSM profiles is trained to predict the selected feature over protein sequences or structures.
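The common procedure described above usually starts by turning a profile into per-residue feature vectors. The following is a minimal sketch of one widespread way to do that, a sliding window over the PSSM rows; it is not taken from this work. The function name `window_features`, the `half_window` parameter, the zero padding at the chain ends and the random toy profile are all assumptions made for illustration.

```python
import numpy as np

def window_features(pssm, half_window=2):
    """Turn an (L, 20) PSSM into one feature vector per residue.

    Hypothetical helper: each residue is described by its own PSSM row
    plus the rows of `half_window` neighbours on each side. Positions
    beyond the chain ends are zero-padded (an assumption; other
    padding schemes are possible).
    """
    pssm = np.asarray(pssm, dtype=float)
    length = pssm.shape[0]
    # Pad only along the sequence axis, not along the amino-acid columns.
    padded = np.pad(pssm, ((half_window, half_window), (0, 0)))
    width = 2 * half_window + 1
    # Row i of the result is the flattened window centred on residue i.
    return np.stack([padded[i:i + width].ravel() for i in range(length)])

# Toy stand-in for a real profile: 6 residues x 20 amino-acid columns.
rng = np.random.default_rng(0)
toy_pssm = rng.integers(-5, 6, size=(6, 20))

features = window_features(toy_pssm, half_window=2)
# features has shape (6, 100): 5 window positions x 20 columns per residue.
```

Each row of `features` could then be paired with a per-residue label (for example, a secondary structure class) and passed to any supervised learner. The window size is a free choice that trades local context against feature dimensionality.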