Sequence-Based Prediction of Metamorphic Behavior in Proteins

Nanhao Chen, Madhurima Das, Andy LiWang, Lee-Ping Wang

doi:10.1016/j.bpj.2020.07.034

Abstract

An increasing number of proteins have been demonstrated in recent years to adopt multiple three-dimensional folds with different functions. These metamorphic proteins are characterized by having two or more folds with significant differences in their secondary structure, in which each fold is stabilized by a distinct local environment. So far, ∼90 metamorphic proteins have been identified in the Protein Data Bank, but we and others hypothesize that a far greater number of metamorphic proteins remain undiscovered. In this work, we introduce a computational model to predict metamorphic behavior in proteins using only knowledge of the sequence. In this model, secondary structure prediction programs are used to calculate diversity indices, which are measures of uncertainty in predicted secondary structure at each position in the sequence; these are then used to assign protein sequences as likely to be metamorphic versus monomorphic (i.e., having just one fold). We constructed a reference data set to train our classification method, which includes a novel compilation of 136 likely monomorphic proteins and a set of 201 metamorphic protein structures taken from the literature. Our model is able to classify proteins as metamorphic versus monomorphic with a Matthews correlation coefficient of ∼0.36 and true positive/true negative rates of ∼65%/80%, suggesting that it is possible to predict metamorphic behavior in proteins using only sequence information.
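To make the idea of a per-position diversity index concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a secondary structure predictor returns, for each residue, probabilities for three states (helix, strand, coil), and it uses the inverse Simpson index as one possible diversity measure. The function names, the three-state profile format, and the threshold value are all hypothetical choices made for illustration; the exact index and decision rule used in the paper are not stated in this section.

```python
def diversity_index(probs):
    """Inverse Simpson index 1 / sum(p_i^2) for one residue.

    ASSUMPTION: this particular index is an illustrative choice, not
    necessarily the definition used in the paper. It equals 1.0 when one
    state is certain and equals the number of states when all states are
    equally likely.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    normalized = [p / total for p in probs]
    return 1.0 / sum(p * p for p in normalized)


def per_residue_diversity(profile):
    """Diversity index at each sequence position.

    `profile` is a list of (p_helix, p_strand, p_coil) tuples, one per
    residue (hypothetical input format).
    """
    return [diversity_index(row) for row in profile]


def looks_metamorphic(profile, threshold=1.8):
    """Toy decision rule: flag the sequence if its mean diversity is high.

    ASSUMPTION: both the use of the mean and the threshold of 1.8 are
    placeholders for illustration only.
    """
    values = per_residue_diversity(profile)
    return sum(values) / len(values) > threshold


# Hypothetical three-residue profile: confident helix, ambiguous
# helix/strand, confident coil.
example = [(0.9, 0.05, 0.05), (0.5, 0.5, 0.0), (0.05, 0.05, 0.9)]
print(per_residue_diversity(example))
```

In this sketch a residue whose predicted state is unambiguous scores close to 1, while a residue split evenly between helix and strand scores 2, so high values mark positions where the predictor is uncertain about the secondary structure.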

Highlights

Biophysical Journal 119, 1380–1390, October 6, 2020
Christian Anfinsen was awarded a Nobel Prize in Chemistry in 1972 for his work on the apparent one-to-one relationship between the amino acid sequence of a protein and its three-dimensional fold [1,2], giving rise to the classic paradigm: "one sequence, one fold." However, serendipitous discoveries in the past few decades have led to the identification of "metamorphic proteins" [3,4] that have the ability to jump reversibly between two distinctly different folds under native conditions.
We found robust performance of the diversity index-based classifier, with a Matthews correlation coefficient (MCC) of 0.355 that is largely insensitive to changes in the parameterization and training data set.
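For readers unfamiliar with the metric, the sketch below shows how the MCC and the true positive and true negative rates follow from a binary confusion matrix, using the standard textbook definitions. The counts in the usage example are invented for illustration and are not results from this work; the function names are likewise hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than
    chance) to +1 (perfect classification).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def true_rates(tp, tn, fp, fn):
    """Return (true positive rate, true negative rate)."""
    return tp / (tp + fn), tn / (tn + fp)


# Made-up counts, with "positive" meaning "metamorphic".
tp, tn, fp, fn = 6, 8, 2, 4
print(matthews_corrcoef(tp, tn, fp, fn))
print(true_rates(tp, tn, fp, fn))
```

Because the MCC uses all four cells of the confusion matrix, it remains informative when the two classes differ in size, which is one reason it is often reported alongside the separate true positive and true negative rates.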

Summary

Introduction

These proteins are fundamentally different from intrinsically disordered proteins [5], morpheeins [6], and moonlighting proteins [7,8], which have been studied for a long time. Typical conformational changes in proteins often involve "shearing" or "hinge" behavior in which entire protein subunits or secondary structure elements undergo relative motions without significantly altering the fold of the protein [9,10]. The different folds/structures of a metamorphic protein are dissimilar on a more fundamental level, often involving changes such as the transformation of a whole α-helix into β-strands (Fig. 1). We use significant change in secondary structure as the key defining characteristic of metamorphic proteins.

Methods

Results

Conclusion