Abstract

The protein folding problem continues to attract intense research interest and remains one of the unsolved problems to be carried over into the next millennium. Recent advances in the prediction of protein structures,1 in conjunction with the burgeoning structural genomics effort,2 will in all likelihood bring a degree of closure to the protein structure prediction problem. However, there continues to be a significant division of opinion on the thermodynamic and kinetic aspects of folding, i.e., do proteins indeed fold to a “free energy minimum,” and if so, how do they find their way to unique predetermined folds? In addition to these traditional questions, increased attention is being focused on understanding protein folding in membranes and the linkage of protein misfolding to disease states. Studies on the mechanism of membrane protein folding and efforts in membrane protein structure prediction are likely to receive a boost as the number of available structures grows3 and improved biochemical, thermodynamic, and genetic methods become available for characterizing protein interactions inside biological membranes. Active research on in vivo protein folding mechanisms promises to shed light on the fate of nascent polypeptide chains in a cellular environment.

Transition states in protein folding. Daniel S. Rokhsar (University of California, Berkeley, CA).
Local units of fold cooperativity and structure. A. Joshua Wand (University of Pennsylvania, Philadelphia, PA), S. Walter Englander (University of Pennsylvania, Philadelphia, PA), and Janet Thornton (University College London, UK).
Free energy landscapes of folding proteins. Victor Muñoz (National Institutes of Health, Bethesda, MD) and Charles L. Brooks III (The Scripps Research Institute, La Jolla, CA).
Electrostatic effects in determining the fold and stability of native and nonnative proteins. Bertrand Garcia-Moreno E (Johns Hopkins University, Baltimore, MD).
Estimating free energies of binding. L. Mario Amzel (Johns Hopkins University School of Medicine, Baltimore, MD).
The “Osmophobic” effect. D. Wayne Bolen (University of Texas Medical Branch, Galveston, TX).
Protein fold taxonomy and evolution. Identifying structural neighbors. Eugene Koonin (NIH/National Center for Biotechnology Information, Bethesda, MD) and Stephen Bryant (NIH/National Center for Biotechnology Information, Bethesda, MD).
Membrane protein folding: mechanisms, energetics and structure prediction. Stephen H. White (University of California, Irvine, CA), Karen G. Fleming (Yale University, New Haven, CT), James U. Bowie (University of California, Los Angeles, CA), Lukas K. Tamm (University of Virginia, Charlottesville, VA), and Thomas Woolf (Johns Hopkins University School of Medicine, Baltimore, MD).
Protein misfolding in vivo and amyloidogenesis. Linda L. Randall (Washington State University, Pullman, WA), Anthony Fink (University of California, Santa Cruz, CA), Philip J. Thomas (University of Texas Southwestern Medical Center, Dallas, TX), and Jeffery W. Kelly (The Scripps Research Institute, La Jolla, CA).
Anfinsen Memorial Lecture. John A. Schellman (University of Oregon, Eugene, OR).

Daniel Rokhsar described the nature of transition states and transient intermediates observed in folding and unfolding simulations of simple model polymers on a lattice4 and in coarse-grained dynamics simulations of a β-hairpin. These simulations, carried out in collaboration with Vijay Pande, were used to select candidate configurations for the transition state ensemble. A parameter Pfold, the folding probability, is defined as the probability that a new simulation starting from a particular configuration will reach the native state before it unfolds. This is a useful way to characterize the transition state and transient as well as equilibrium intermediates in lattice model simulations because the native state is well defined.
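To make the definition of Pfold concrete, the following Python sketch estimates it for a toy one-dimensional random walk rather than for the lattice polymers discussed in the talk. The function name estimate_pfold, the chain of n_sites positions, and the absorbing end points (0 as "unfolded", n_sites as "native") are illustrative assumptions and are not taken from the original work.

```python
import random


def estimate_pfold(start, n_sites=10, n_trials=2000, seed=0):
    """Estimate the folding probability from a starting position.

    Toy model (an assumption, not the lattice polymer of the talk):
    an unbiased walk on 0..n_sites, where 0 is the unfolded state
    and n_sites is the native state. Both end points are absorbing.
    """
    rng = random.Random(seed)
    folded = 0
    for _ in range(n_trials):
        x = start
        # Run one short trajectory until it commits to either basin.
        while 0 < x < n_sites:
            x += rng.choice((-1, 1))
        if x == n_sites:
            folded += 1
    # Fraction of trajectories that reached the native state first.
    return folded / n_trials


if __name__ == "__main__":
    for start in (2, 5, 8):
        print(start, estimate_pfold(start))
```

For an unbiased walk the estimate should lie close to start / n_sites, so the midpoint of this toy chain plays the role of a configuration with a folding probability near one half.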
The parameter Pfold ≈ ½ implies that the likelihood of proceeding to folding from the chosen set of configurations is approximately ½. This set of configurations contributes to the transition state ensemble. Results from three types of lattice simulations based on the Go model, a Miyazawa-Jernigan contact potential, and an HPN heteropolymer were reported. The nature of the transition state varies between these three models, but in general possesses fairly well defined partial structure. Similar results were obtained from off-lattice dynamics simulations on a β-hairpin that has been experimentally studied by Eaton and coworkers.5 The major suggestion was that the transition state represents an “entropic bottleneck,” which is obtained by a folding protein to allow for a subsequent downhill slide into a free energy minimum that is either the folded state or a metastable state that may be a well-defined intermediate. This scenario for the transition state was referred to as “deferred gratification.” Furthermore, lattice model simulations indicate that folding pathways involve locally stable partial structures and cooperative transitions between these states. Dominique Bicout from National Institutes of Health presented a poster to address the issue of transition states. Using a simple model of diffusion in a sphere with the radius of the sphere as a reaction coordinate, he argued that barriers to folding are largely entropic, and nonexponential relaxation to the folded state would be feasible on predominantly downhill energy landscapes that negate entropic barriers. A. Joshua Wand discussed hydrogen exchange (HX) methods to probe global and subglobal structural cooperativity in proteins under the influence of different structural perturbations including chemical denaturation and hydrostatic pressure. HX measurements are a useful probe of native protein conformations under denaturing conditions. 
Results using the “structural molten globule” of apocytochrome b562 indicate three cooperative structural units, two of which correspond to the central helices of the four-helix bundle of this protein. Remarkably, the structural cooperativity signals are similar irrespective of the structural perturbation method used.

Walter Englander discussed the use of native state HX experiments to explore conformationally excited states of cytochrome c. Native state hydrogen exchange kinetics can be related to the local, subglobal, or global modes of fluctuation of the protein because the native state itself does not contribute to native state HX. In the EX2 limit the rate of exchange nominally yields information at the individual residue level regarding the making and breaking of hydrogen bonds.6 Such measurements have helped identify a small number of subglobal structural units that open and close cooperatively. A particularly striking structural unit is the 60's helix of cytochrome c, for which HX studies show local fluctuations. Englander suggested that built-in structural units in a protein determine a small number of free energy wells on the landscape of a folding protein. Such structural units are likely to fold and unfold cooperatively absent any intermediates, and this could help explain two-state fast folding of some globular proteins. The “structural units of cooperativity” studied by Englander are in agreement with the observations of Joshua Wand and correspond to recognizable secondary structure.

Superfolds7 are protein folds in the structural database that are highly populated. Why do some folds occur more often than others? Janet Thornton proposed three possible reasons for the occurrence of superfolds: (1) these folds have an evolutionary advantage; (2) they are thermodynamically favored because they have greater stability; or (3) superfolds have dominant local signals that make them fast folders.
The last hypothesis was investigated by an analysis of the frequency of occurrence of supersecondary structure motifs in common superfolds and an analysis of singlet folds. The three supersecondary structures used were β-hairpins, α-hairpins, and βαβ units. For superfolds that are not OB folds, doubly wound folds, jelly roll folds, or TIM barrels, ∼70% of the protein folds have better than 50% of their supersecondary structure in the above three categories. Superfolds that are not well described by the three supersecondary structural elements have “more complex” interactions, such as “insertions” in TIM barrels, which create a divergence from local interactions. However, with the inclusion of two types of Greek key motifs, i.e., β4 or βαββ motifs, 74.8% of superfolds and 66.3% of nonsuperfolds have common supersecondary structure. A plausible scenario would be that supersecondary structures behave as folding units. This might explain their recurrence in superfolds. Taken together, the evidence from HX measurements and the information culled from an analysis of recurring units of supersecondary structure in superfolds may have implications for hierarchical models for protein folding.8

Can simple models be used to explain the order of the folding transition, i.e., two-state versus three-state, based on free energy profiles constructed from an extensive enumeration of conformational states available to a protein? A simple approach to construct a coarse-grained partition function based on a two-state representation for conformations of individual amino acid residues was originally used in helix-coil transition theories.9 Victor Muñoz presented a two-state approximation to the conformations adopted by individual peptide bonds to reduce all possible three-dimensional conformations for a protein to conformational strings. For an N-residue chain the native conformation is the (N − 1)-letter conformational string nnn...nnn, and the conformer with no native peptide bonds is the corresponding (N − 1)-letter conformational string ccc...ccc. This type of model has been used in the past by Muñoz, Eaton, and coworkers to develop a statistical mechanical formalism for β-hairpin kinetics,10 in which thermodynamics and dynamics of a folding reaction were studied by calculating free energy changes that accompany transitions between individual conformers for the protein. Baker and coworkers11 recently demonstrated a correlation between contact order and the folding rates of single-domain proteins. Enthalpic contributions in Muñoz's formalism are, therefore, derived from a native contact map where two residues are said to be in contact if the interaction centers are within 4 Å of each other. Each contact has a favorable interaction energy ϵ, and the enthalpy for a given conformation is scaled to reproduce the free energy of unfolding. The number of nonnative conformers and the type of conformational transitions, i.e., α → β versus α → loop, determine the conformational entropy lost on folding. The highlights of this approach are as follows: (1) free energy profiles for the folding transition can be constructed rather simply; (2) the partition function is accessible in the simplified two-state representation for available conformations to a folding protein; (3) the order of local and nonlocal contacts determines the enthalpic contribution to the free energy of folding and the folding rates; (4) the simple model, applied to four single-domain proteins (λ-repressor,12 muscle acyl phosphatase,13 chymotrypsin inhibitor 2,13 and cheY14), shows good agreement with experimentally observed folding rates and reproduces the order of the observed folding transitions; (5) in proteins where two-state fast folding is observed, the free energy profiles indicate the presence of a predominant pathway and the absence of well-populated intermediates.
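The bookkeeping behind such a model can be sketched in a few lines of Python. The sketch below builds a native contact list with the 4 Å cutoff quoted above and scores a conformational string of n and c letters. The function names, the min_sep parameter, the unit contact energy, and in particular the rule that a contact counts only when every peptide bond between the two residues is native are assumptions made for illustration, not details reported from the talk.

```python
import math

CUTOFF = 4.0  # contact distance in angstroms, as quoted in the text


def native_contacts(coords, cutoff=CUTOFF, min_sep=2):
    """List residue pairs (i, j) whose interaction centers are in contact.

    coords holds one (x, y, z) interaction center per residue.
    min_sep skips sequence neighbours (an illustrative choice).
    """
    pairs = []
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs


def enthalpy(state, contacts, eps=-1.0):
    """Enthalpy of a conformational string of 'n'/'c' letters.

    state[k] describes the peptide bond between residues k and k + 1.
    Assumption: contact (i, j) is formed only if bonds i..j-1 are all
    native. eps is a hypothetical energy per contact.
    """
    formed = sum(
        1 for i, j in contacts if all(s == "n" for s in state[i:j])
    )
    return eps * formed


if __name__ == "__main__":
    # Four residues on the corners of a 3.8 angstrom square.
    square = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0), (0, 3.8, 0)]
    contacts = native_contacts(square)
    for state in ("nnn", "ncn", "ccc"):
        print(state, enthalpy(state, contacts))
```

With a contact list in hand, a free energy profile along the number of native peptide bonds could be built by enumerating strings and adding an entropy cost per native bond; that step is omitted here because its parameters are not given in the text.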
The transition state region is a collection of conformations that are typified by a loss of conformational entropy and a fluctuation in the number of native contacts. Fast folders are also characterized by early nucleation of secondary structure. In slow folding proteins the free energy profiles indicate the absence of a predominant pathway, i.e., all transitions to the native conformer are characterized by high free energy barriers. Certain intermediates could be characterized as being off-pathway or as traps. Results from this model suggest that it may be difficult to reconcile in favor of one of the two seemingly conflicting views on how proteins fold: the “old view”8, 15 or the “new view.”16, 17 Instead, protein folding pathways may vary from protein to protein depending on the balance of interactions. Free energy surfaces for folding proteins can also be derived from detailed all atom molecular dynamics simulations in explicit solvent. Charles Brooks III summarized simulation results for α-helical proteins including the three helix bundle protein A18 and an α/β protein the GB1 segment of streptococcal protein G.19 The two-step strategy involves generating a set of conformations by sampling at high temperatures to create a starting structural database. This database is used in subsequent umbrella or importance sampling for the free energy analysis part of the simulation. A crucial part in generating a free energy surface is the choice of an appropriate reaction coordinate to monitor folding. Brooks discussed the use of three plausible reaction coordinates: (1) the fraction of native contacts where the weight of each contact is determined from the amount of time a contact is present in the simulation used to ascertain the stability of the native state; (2) the radius of gyration Rg of the protein; and (3) the fraction of internal waters in the core of a folding protein. 
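Two of these reaction coordinates are simple to compute from a set of coordinates, as the following Python sketch suggests. It treats every atom or interaction center as having equal mass, and the 6.5 Å contact cutoff, the function names, and the way the per-contact weights are supplied are all hypothetical choices made for illustration; the text does not specify them.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid.

    Assumes equal masses for all points (a simplification).
    """
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(c, center) ** 2 for c in coords) / n)


def fraction_native_contacts(coords, native_pairs, weights=None, cutoff=6.5):
    """Weighted fraction of native contacts present in coords.

    native_pairs lists (i, j) index pairs that are in contact in the
    native state. weights would come from how often each contact is
    present in a native-state simulation; uniform if omitted.
    cutoff (angstroms) is a hypothetical value.
    """
    if weights is None:
        weights = [1.0] * len(native_pairs)
    formed = sum(
        w
        for (i, j), w in zip(native_pairs, weights)
        if math.dist(coords[i], coords[j]) <= cutoff
    )
    return formed / sum(weights)


if __name__ == "__main__":
    frame = [(0, 0, 0), (2, 0, 0), (10, 0, 0)]
    print(radius_of_gyration(frame))
    print(fraction_native_contacts(frame, [(0, 1), (0, 2)], [0.75, 0.25]))
```

Binning simulation frames by one or two of these coordinates is what turns sampled conformations into a free energy surface, so the outcome depends on how well the chosen coordinate separates folded from unfolded states.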
Details of the free energy analysis show different results for α-helical proteins compared to α/β proteins. Smaller helical proteins fold via a drastic reduction in size (Rg), stabilization of local segments of structure, and a gain in tertiary interactions. In the case of α/β proteins the initial phase is dominated by a collapse to form a hydrophobic core in the absence of any significant interactions. β-sheet formation is the rate-limiting step for the folding of α/β proteins, making the simulated folding times for such systems longer than the folding times for smaller helical proteins. The folding of the GB1 fragment of protein G studied by Brooks and coworkers also shows the presence of significant internal water after initial collapse. The simulations indicate that these waters are essential to interstrand hydrogen bond formation and act as “lubricants” that facilitate it. The final stages of folding are characterized by squeezing out residual internal water. Brooks indicated that results from such detailed simulations may help rationalize the “funnel” view of the free energy surface for a folding protein.16, 17

A question that has not been resolved in protein electrostatics is the magnitude of the dielectric constant in the hydrophobic interiors of proteins. Continuum electrostatic models use values between ϵ = 2 and ϵ = 4 for the interior dielectric of proteins.20 These low values are based on the assumption of largely dehydrated protein cores. To understand the contribution of electrostatic effects to protein fold determination and stability, it is important to quantify the details of electrostatic interactions within protein interiors, a problem that is difficult to address experimentally and one that requires too many assumptions in theoretical work.
Using staphylococcal nuclease as a test system, Bertrand Garcia-Moreno explained how various indirect experiments can be used to address the question of the interior dielectric21 and the mechanism of acid denaturation. The former question is studied by analyzing pKa shifts as a function of mutations to lysine within the interior of a protein which is then used to back calculate the effect on the interior dielectric. The use of a simple Tanford-Kirkwood cavity or Born approximation to rationalize pKa shifts leads to values for ϵ between 12 and 14, which is an order of magnitude larger than commonly accepted values. These values can be explained due to the presence of water molecules within the protein core. Also pKa values of surface Asp and Glu residues are considerably closer to model compound values than predicted by continuum models with a range of ϵ values. Furthermore, a simple modified Tanford-Kirkwood theory with values of ϵ between 12 and 14 gives better predictions for these pKa values compared with finite difference Poisson-Boltzmann calculations. This could imply that the effect of electrostatics on protein energetics is considerably weaker than originally imagined. By monitoring the number of bound protons as a function of increasing pH, the electrostatic effects on acid denaturation can be studied. General conclusions from Garcia-Moreno's talk were that the denaturation of staphylococcal nuclease cannot be due to a generalized charge effect, and there do not appear to be any significant electrostatic interactions in the denatured states of staphylococcal nuclease. Given a priori knowledge of the structure of a biomolecular complex, it would be useful to be able to predict free energies of binding and folding. The free energy of binding can be obtained directly from the binding affinity K according to: ΔG° = −RT ln K°. 
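As a small worked illustration of the relation ΔG° = −RT ln K°, the Python sketch below converts a binding constant into a standard free energy. The function name, the default temperature of 298.15 K, and the treatment of K° as an association constant made dimensionless by a 1 M standard state are assumptions chosen for the example.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def binding_free_energy(k_assoc, temperature=298.15):
    """Standard free energy of binding in kJ/mol.

    k_assoc is the association constant relative to a 1 M standard
    state (assumed), so it is treated as dimensionless here.
    """
    return -R_KJ * temperature * math.log(k_assoc)


if __name__ == "__main__":
    # A nanomolar binder corresponds to K of about 1e9.
    print(round(binding_free_energy(1e9), 2))
```

Because the dependence is logarithmic, each tenfold change in affinity shifts ΔG° by RT ln 10, roughly 5.7 kJ/mol at room temperature.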
The superscript ° refers to standard state free energy differences.22 Mario Amzel discussed simple strategies to estimate the enthalpic and entropic contributions to the free energy of binding. A rigorous calculation of binding free energies would require the inclusion of explicit solvent. However, as Amzel argued, if the only contribution to binding is from bulk solvent, then calculations can be vastly simplified via empirical parameterizations to calculate the enthalpy and semiexplicit counting of states to estimate entropic contributions. The approach used for estimating contributions to the enthalpic term ΔH and changes in specific heat ΔCP is derived from a parameterization of available experimental and structural data for solvent accessible surface areas of protein-ligand complexes. Difficulties arise in calculating three entropic terms: the entropy lost with translational degrees of freedom, the configurational entropy lost on freezing main-chain and side-chain degrees of freedom of amino acids, and the entropy of hydration. Using specific examples, Amzel illustrated that the translational contribution to the entropy of binding is suitably determined using the “cratic” entropy proposed by Kauzmann rather than the Sackur-Tetrode equation used for ideal gases. The entropy of hydration and configurational entropies are derived with use of a combination of molecular mechanics calculations and Monte Carlo sampling. The latter is used to calculate the change in the entropy of water due to local motions in restricted volumes.23, 24 Predicted values for the entropic terms were validated by comparisons with the experimental data of Tamura and Privalov.25

D. Wayne Bolen spoke about the thermodynamics of biological adaptation and introduced the concept of an “osmophobic effect” to explain the stability of proteins in osmolytes.
This talk focused on the adaptation of organisms to environmental stresses by the production and concentration of osmolytes such as polyols including sucrose, trehalose, and sorbitol, amino acids such as glycine and proline, or particular methylamines such as trimethylamine N-oxide and sarcosine, which are all small organic molecules. These osmolytes confer increased stability on intracellular proteins, which are then able to withstand destabilizing influences of temperature, dehydration, or denaturants. A set of elegant experiments performed by Bolen and coworkers suggests the need to add the osmophobic effect to the list of fundamental forces or effects involved in protein folding.26 Whereas the hydrophobic effect arises because of unfavorable interactions between apolar side chains and water, the osmophobic effect arises because of unfavorable interactions between the peptide backbone and organic osmolytes. This was demonstrated in experiments that measure the Stokes radius of reduced carboxymethylated ribonuclease A (RCAM RNaseA) determined from the transfer of the random coil protein from water to 1 M concentrations of four naturally occurring osmolytes: trimethylamine N-oxide, sarcosine, sucrose, and proline.26 Experimental measurements for the transfer free energy demonstrate that the unfavorable interaction of polypeptide backbones with osmolytes is the dominant contribution to the stability of RCAM RNaseA in osmolytes. This is to be contrasted with the favorable interaction of the backbone with urea in the transfer of RCAM RNaseA from water to urea. The interactions of side chains with osmolytes are marginal. The overwhelmingly unfavorable interaction of the backbone, compared with the indifferent interactions of side chains with osmolytes, is referred to as “preferential exclusion” or simply as the osmophobic effect.
Eugene Koonin described the use of sequence profile analysis methods to study the distribution of sequence homology-based fold predictions across fully sequenced genomes and phylogenetically conserved protein families.27 Improvements in protein fold prediction based on optimizations in the choice of starting structures for iterative profile searches were also presented. The distributions of folds in bacteria and archaea are similar, whereas the distribution of folds in eukaryotes is quite different.27 Even within bacteria there are differences in the distribution of folds between parasitic species and free-living species. Analysis of the distribution of the number of domains with different folds in a given protein is consistent with a “geometric model,” which posits that multidomain proteins may have evolved by a random combination of domains. Koonin suggested that predictions, coupled with improved statistical analyses, can provide answers to questions about convergent versus divergent evolution of protein folds.

Steve Bryant gave a summary tour of the NCBI 3D structure database28 accessible at http://www.ncbi.nlm.nih.gov/Entrez/Structure. The database of structure neighbors, VAST, is used to identify homologs among lists of structural neighbors. This is done by defining a Homologous Core Structure (HCS) as the substructure known to be conserved in previously identified structural neighbors. The speaker argued in favor of using an HCS as the structural equivalent of a sequence motif, a strategy that may be useful in structure prediction.

A session on membrane protein folding was introduced for the first time at the Coolfont meeting. Stephen White presented several important results on membrane protein folding.
These are: (1) The observation of time-averaged transbilayer structures of lipid functional groups, obtained using X-ray and neutron diffraction, which mandates the revision of an erroneous picture of membrane bilayers. Interfacial regions are ∼15Å wide on each side of the hydrophobic core and are chemically heterogeneous as opposed to being small featureless regions such as the hydrophobic lipid area. (2) Membrane interfaces promote the formation of secondary structure in polypeptides. This explains the folding and assembly of β-barrels such as α-hemolysin and amphipathic α-helical proteins such as melittin.29 (3) The preference of tryptophan for membrane interfaces has been measured with use of 1H magic angle spinning chemical shifts, 2D-NOESY 1H NMR and solid state 2H NMR to quantify the interactions of tryptophan analogues with phosphatidylcholine membranes. These studies indicate that the preference for tryptophan is most likely due to the balance of the hydrophobic effect, favorable electrostatic interactions in the headgroup region, and steric repulsion that excludes it from the hydrocarbon core.30 (4) Experimental determination of two different whole residue hydrophobicity scales to explain the energetic basis for the formation of secondary structure, insertion of elements of secondary structure into membranes, and the tertiary assembly of these elements. One scale, an interfacial scale, is useful for quantifying energetics of folding in membrane interfaces, and a second octanol scale describes the insertion of transmembrane helices into membranes. The peptide bond is as costly to partition into the interfacial region as charged residues, and the formation of hydrogen bonds reduces the cost of partitioning. 
Only helical bundles or β-barrels can pass through the membrane, indicating that spontaneous folding has to occur on the membrane interface.31

Karen Fleming reviewed the postulates of the two-stage model for the folding of helical membrane proteins proposed by Popot and Engelman.32 The focus of her talk was the second part of the two-stage model, i.e., the energetics of transmembrane helix-helix interactions. A model system, the transmembrane segment of human glycophorin A, which forms a symmetric homodimer, is used to study sequence specificity and energetics of transmembrane helix packing in a detergent micelle environment. Analytical ultracentrifugation is used to probe the energetics of transmembrane helix packing by comparing the sedimentation equilibrium of an associating system to the sedimentation equilibrium of a monodisperse system.33 Preliminary results on quantifying the mutational sensitivity of glycophorin A dimerization were reported.

Lukas Tamm discussed intermediates in membrane protein folding in the context of β-barrel proteins of the outer membrane of Gram-negative bacteria, which can be unfolded in urea and GdnHCl and refolded in detergent micellar or lipid bilayer environments. Studies using OmpA, an outer membrane protein of Escherichia coli, were used to demonstrate the formation of secondary structures at membrane surfaces and the concomitant penetration of OmpA into the bilayer with all four β-hairpins intact.34

Two approaches to membrane protein structure prediction were presented at the meeting. James Bowie proposed a three-part approach based on the following: (1) using the available database of transmembrane helices to obtain packing classes in terms of helix packing angles and distances of closest approach35; (2) reducing the number of possible folds for sets of transmembrane helices using constraints imposed by membrane bilayers; and (3) the possible development of a scoring function to pick out viable topologies for the packing of transmembrane helices.
Thomas Woolf proposed the use of a slightly different three-stage method: (1) a coarse-grained conformational search for possible orientations of transmembrane helices; (2) a second filter to screen for low-energy topologies using a simplified CHARMM-like van der Waals potential and a united atom representation; and (3) a detailed all-atom calculation either in an explicit membrane environment or in a mean field lattice dipole representation for the membrane.36 Preliminary results using two of the three proposed steps applied to the prediction of the transmembrane helix homodimer of glycophorin A indicate that the “potential functions” and methods for conformational search may in fact be reasonable for predicting transmembrane helix packing. The multistep conformational search methods of James Bowie and Thomas Woolf, together with global search37 and global optimization methods38 proposed in the literature, may be sufficient to address the problem of accurately predicting the packing of transmembrane helical proteins.

A new session on protein folding diseases due to the formation of amyloid fibrils, the mechanism of misfolding, and chaperone interference with the folding of proteins in vivo was also a part of the meeting. Linda Randall discussed the role of SecB, a cytosolic chaperone, in recognizing nonnative protein conformations to facilitate localization of exported proteins.39 Binding must be rapid to compete with the folding of proteins within the cytosolic component.

Anthony Fink40 spoke about studies to compare two different sequences of immunoglobulin light chain variable (IgG-VL) domains, one of which is amyloidogenic (SEM) and one of which is not (LEN). The basic question addressed was: why do some proteins aggregate? Some of the factors include the concentration of protein, cosolutes, and concentrations of chaperones. Protein aggregates could be ordered or amorphous.
Fibrils are characterized by their high β-sheet content, similar dimensions, typically braided structures, and insolubility, though they can be dissolved by denaturants. Denaturation studies show that the amyloidogenic form is much less stable than the nonamyloidogenic form. Far and near ultraviolet circular dichroism measurements, along with FTIR and intrinsic tryptophan fluorescence measurements, indicate the presence of nearly native-like intermediates for the amyloidogenic form under mildly destabilizing conditions. The general conclusion was that amyloid fibril formation is due to the self-assembly of partially folded intermediates.

Jeffery Kelly reported progress on the discovery of small molecules, specifically phenols, biphenyl ethers, and biphenyl amines, that show high binding affinity to transthyretin and prevent the conformational changes required for amyloid fibril formation.41 The strategy of looking for small molecules that may intervene in human amyloid diseases has led to the discovery of flufenamic acid, a nonsteroidal biphenylamine that inhibits amyloid fibril formation.

Philip Thomas discussed membrane protein misfolding caused by cystic fibrosis mutations.42 Polypeptide models used to study the effect of mutations do not reveal an adverse effect on stability; instead, the mutations influence the kinetics of populating alternate nonnative structures.

This year's Anfinsen memorial lecture, named after Christian Anfinsen, who was a professor at Johns Hopkins University, was delivered by John Schellman. Anfinsen's pioneering work on protein folding has been the driving force for most of the work in this field over the past 25 years. In his introductory remarks Robert Baldwin reminded the community of the truly outstanding contributions made by John Schellman and the profound impact his work has had on our understanding of the thermodynamics of protein folding and stability.
Schellman spoke about an “Errant History of Thermodynamics.” The goal of the lecture was to convey to the participants that they were not alone in their misconceptions about thermodynamic principles; rather, the history of thermodynamics is replete with “detours, misunderstandings and polemics,” primarily due to human foibles. Throughout his engaging talk Schellman offered the kind of deep insights into statistical thermodynamics concepts that have been the hallmark of his enriching contributions to the field.

The organizers for next year's meeting are L. Mario Amzel and Ernesto Freire.

I thank Karen Fleming and Bertrand Garcia-Moreno for sharing slides of their presentations and Ed Lattman, Teresa Przytycka, George Rose, and David Shortle for careful reading and constructive criticism of this review.
