Abstract

Many studies have used position-specific scoring matrix (PSSM) profiles to characterize residues in protein structures and to predict a broad range of protein features. Moreover, PSSM profiles of Protein Data Bank (PDB) entries have been recalculated in many works for different purposes. Although the computational cost of calculating a single PSSM profile is affordable, many statistical studies and machine learning-based methods use thousands of profiles, which substantially increases the total computational cost. In this work we present a new database compiling PSSM profiles for the proteins of the PDB. Currently, the database contains 333,532 protein chain profiles involving 123,135 different PDB entries.

Highlights

  • Position-specific scoring matrices (PSSMs) have been used in many works to compute and predict a broad range of protein features

  • In this work we present 3DCONS-DB, a database of PSSM profiles computed over protein sequences collected from the Protein Data Bank (PDB) [19]

  • Our analysis shows that non-domain regions seem functionally relevant and that the amount of information encoded in their PSSM profiles is around 80% of the information encoded in domain regions

Introduction

Position-specific scoring matrices (PSSMs) have been used in many works to compute and predict a broad range of protein features. PSSM profiles have been used to predict residue solvent accessibility [1], protein secondary structure [2], residue-residue contact maps [3], protein disordered regions [4], protein binding sites [5], protein-DNA interactions [6] and protein-protein interface hotspots [7]. Although these works used different prediction algorithms and methodologies, they share a common procedure that can be found in many other publications: a machine learning algorithm fed with PSSM profiles is trained to predict the selected feature over protein sequences or structures.
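
One common way to feed a PSSM profile to a learning algorithm is to describe each residue by the profile rows in a window centred on it. The sketch below illustrates that idea only; it is not the method of any of the cited works. The function name `window_features`, the window half-width, the zero padding at the sequence ends and the toy 3-column profile are all assumptions made for illustration (a real PSSM has one column per amino acid type, typically 20).

```python
from typing import List


def window_features(
    pssm: List[List[float]], half_width: int = 1, pad: float = 0.0
) -> List[List[float]]:
    """Return one flattened feature vector per residue.

    Each vector concatenates the PSSM rows inside a window of
    2 * half_width + 1 residues centred on that residue. Window
    positions that fall outside the sequence are filled with `pad`.
    """
    n_rows = len(pssm)
    n_cols = len(pssm[0]) if pssm else 0
    features = []
    for i in range(n_rows):
        vec: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            if 0 <= j < n_rows:
                vec.extend(pssm[j])          # row of a real residue
            else:
                vec.extend([pad] * n_cols)   # padding beyond the termini
        features.append(vec)
    return features


# Hypothetical toy profile: 4 residues x 3 columns (illustrative values only).
toy_pssm = [[1, 0, -1], [2, -2, 0], [0, 3, -3], [-1, 1, 1]]
X = window_features(toy_pssm, half_width=1)
```

Each row of `X` could then be paired with a per-residue label (for example, buried versus exposed) and passed to whatever classifier a given study uses. Because every such study needs the profiles before this step, precomputed profiles remove the most expensive part of the pipeline.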
