Graph-representation of oxidative folding pathways
© Ágoston et al; licensee BioMed Central Ltd. 2005
Received: 20 October 2004
Accepted: 27 January 2005
Published: 27 January 2005
The process of oxidative folding combines the formation of native disulfide bonds with conformational folding, resulting in the native three-dimensional fold. Oxidative folding pathways can be described in terms of disulfide intermediate species (DIS), which can also be isolated and characterized. Each DIS corresponds to a family of folding states (conformations) that the given DIS can adopt in three dimensions.
The oxidative folding space can be represented as a network of DIS states interconnected by disulfide interchange reactions that can either create/abolish or rearrange disulfide bridges. We propose a simple 3D representation wherein the states having the same number of disulfide bridges are placed on separate planes. In this representation, the shuffling transitions are within the planes, and the redox edges connect adjacent planes. In a number of experimentally studied cases (bovine pancreatic trypsin inhibitor, insulin-like growth factor and epidermal growth factor), the observed intermediates appear as part of contiguous oxidative folding pathways.
Such networks can be used to visualize folding pathways in terms of the experimentally observed intermediates. A simple visualization template written for the Tulip package http://www.tulip-software.org/ can be obtained from V.A.
The process of protein folding, whereby a linear polypeptide chain reaches its native structure, has been one of the most intensely studied biomolecular problems over the past 50 years (for current reviews see [1–3]). Folding of a protein is usually pictured as a search for the native conformation within the conformational space of all possible conformational states, each characterized by a set of parameters. Even though most of the conformational states are not accessible to experiment, graphic representations of the potential energy surface have played pivotal roles in explaining how the conformational space is gradually restricted during the folding process. Key concepts such as folding pathways are also best explained by graphic representations.
The underlying chemical mechanism is disulfide interchange (Figure 1B). In this scheme there are two kinds of reactions: i) In a redox reaction a protein disulfide bond is created (or abolished), i.e. the oxidative state of the polypeptide is changed. This is the case when one of the participants of the reaction (say RSH) is not part of the protein. ii) In a shuffling reaction both participants of the reaction are protein-bound, so the oxidative state of the polypeptide does not change. In view of these possibilities it becomes obvious that there are a great many ways in which disulfide bridges can form and rearrange during the folding process. Today it is generally accepted that non-covalent interactions guide the process of folding, while the formation of disulfide bridges locks the protein into the right conformation. The advantage of oxidative folding as opposed to general protein folding is that disulfide intermediates can be chemically isolated and studied using such techniques as acid trapping of the intermediates and analysis of the disulfide bridges using a combination of enzymatic cleavage and mass spectrometry. There is a body of literature describing the pathways of oxidative folding in terms of disulfide intermediates [6–8], and our goal is to show how graph theory can be used to visualize them.
First, protein structure itself can be considered as a graph consisting of various interactions (such as covalent bonds, hydrogen bonds, spatial vicinities, contacts etc.) as edges, the nodes being atoms or residues of the protein. For instance, one of the classical definitions of protein secondary structures is based on main-chain H-bond contacts between residues. Structural networks have been used in folding research as well. It was found, among others, that the so-called contact order, i.e. the average sequence distance between residues in atomic contact, seems to be a key determinant of folding speed. Another line of research concentrates on characteristic networks of inter-atomic contacts that may form stabilization centres in protein structures and may account for the differential stability of various proteins [13, 14]. It was found that populated conformations seen in molecular dynamics simulations contain characteristic networks of residues [15, 16].
ii) In the network descriptions of the folding space, on the other hand, the folding states are the nodes, and transitions are the edges between them. This approach was fostered by the finding that the robustness and stability of networks may be the result of simple topological properties that are invariant throughout various technical as well as biological systems. In the following years the network topology of a large number of systems has been described, and it was found that some topology classes, such as those characterized by a scale-free distribution of the number of links at each node, or the so-called "small world models" that are characterized by densely connected subnetworks loosely linked to each other, are indeed found in various systems within and without biology (for reviews see ). The various network types were described in terms of simple measures borrowed from graph theory, such as the clustering coefficient, the diameter of the graph etc. Scala and associates described the folding states of short peptides using Monte Carlo simulation on lattice models. They found that the geometric properties of this network are similar to those of small-world networks, i.e. the diameter of the conformation space increases for large networks as the logarithm of the number of conformations, while locally the network appears to have low dimensionality. Shakhnovich and co-workers analysed the folding states of proteins during molecular dynamics simulations. It was found that the folding space is reminiscent of a scale-free network, characterized by a majority of less populated states as well as some highly populated states reminiscent of "hubs" seen in other systems.
Our purpose is to describe the folding space of the oxidative folding process using graph representations. This is an intriguing task since, in contrast to "ordinary" protein folding, the number of states defined in terms of disulfide links is not exceedingly high; moreover, the actual disulfide intermediates can be isolated and studied. We will approach the problem in two steps: i) We will use graph theory to describe the disulfide intermediates, and to enumerate the states of the folding space. ii) Then we will represent the folding space as a network (graph) of all possible intermediates. We show with a few examples that experimentally observed intermediates mapped onto this network appear as contiguous folding pathways.
Results and discussion
Graph representation of oxidative folding intermediates
The number of fully oxidised (disulfide bonded) isomers in a protein chain with n disulfide bonds (2n cysteines) can be deduced from simple combinatorial considerations as $(2n)!/(n!\,2^{n})$. According to this formula proteins with two disulfide bridges have 3 fully oxidized isomers, 3-disulfide proteins have 15 and 4-disulfide proteins have 105. In other words, the number of intermediates increases very fast as a function of the number of constituent cysteines.
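The counts quoted above can be checked with a short script. The following is a minimal sketch, not part of the original work: the function names `count_fully_oxidised` and `enumerate_pairings` are hypothetical, and the brute-force enumeration is included only as a cross-check of the closed formula.

```python
from math import factorial


def count_fully_oxidised(n_bonds: int) -> int:
    """Closed formula (2n)! / (n! * 2**n) for n disulfide bonds."""
    return factorial(2 * n_bonds) // (factorial(n_bonds) * 2 ** n_bonds)


def enumerate_pairings(cysteines):
    """Yield every way of pairing all cysteines (brute-force cross-check)."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # The first cysteine must pair with one of the others; recurse on the rest.
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


for n in (2, 3, 4):
    brute = sum(1 for _ in enumerate_pairings(list(range(1, 2 * n + 1))))
    print(n, count_fully_oxidised(n), brute)
```

For n = 2, 3 and 4 both columns give 3, 15 and 105, in agreement with the values stated in the text.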
Description of the oxidative folding space as graphs
The transitions between folding intermediates can be conveniently described by comparing the adjacency matrices of the two states. For the enumeration of the transition reactions we introduce a few simple variables. NB is the number of disulfide bonds, calculated as the sum of the elements of the adjacency matrix (each bond being entered once),
$$NB = \sum_{i}\sum_{j} a_{ij}.$$
The sum of elements in the i-th column plus the i-th row,
$$S_i = \sum_{j} a_{ji} + \sum_{j} a_{ij},$$
is 1 if the i-th cysteine is part of a disulfide bridge and zero otherwise. The sum of the differences calculated between the $S_i$ measures of two adjacency matrices,
$$SD = \sum_{i} \left| S_i^{(1)} - S_i^{(2)} \right|,$$
shows how many cysteines gain or lose a bond as the molecule passes from one state to the other. Here we are interested only in the two kinds of elementary reactions depicted in Figure 1B. In shuffling reactions, the number of disulfide bridges NB remains the same by definition, and it is easy to show that SD is exactly 2, since one cysteine loses a bond and another gains one. In redox steps in which one disulfide bridge is established or lost, NB increases or decreases by one and SD is again 2, since two cysteines gain or lose a bond.
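The three measures can be illustrated on small matrices. The sketch below rests on two assumptions that are not spelled out in the surviving text: each bond is entered once in an upper-triangular 0/1 matrix, and SD is taken as the sum of absolute differences of the S_i values, which is the reading consistent with the |SD| = 2 criterion used for drawing edges. The helper names are hypothetical.

```python
def adjacency(n_cys, bonds):
    """Upper-triangular 0/1 matrix; cysteines numbered from 1 (assumed layout)."""
    a = [[0] * n_cys for _ in range(n_cys)]
    for i, j in bonds:
        lo, hi = sorted((i, j))
        a[lo - 1][hi - 1] = 1  # each bond is entered exactly once
    return a


def nb(a):
    """Number of disulfide bonds: sum of all matrix elements."""
    return sum(map(sum, a))


def s_values(a):
    """S_i = row sum + column sum: 1 if cysteine i is bonded, else 0."""
    n = len(a)
    return [sum(a[i]) + sum(a[k][i] for k in range(n)) for i in range(n)]


def sd(a1, a2):
    """Number of cysteines whose bonding status differs between two states."""
    return sum(abs(x - y) for x, y in zip(s_values(a1), s_values(a2)))


one_bond = adjacency(4, [(1, 2)])
shuffled = adjacency(4, [(1, 3)])
two_bonds = adjacency(4, [(1, 2), (3, 4)])

print(nb(one_bond), nb(shuffled), sd(one_bond, shuffled))    # shuffling pair
print(nb(one_bond), nb(two_bonds), sd(one_bond, two_bonds))  # redox pair
```

For the shuffling pair NB is 1 in both states and SD is 2; for the redox pair NB goes from 1 to 2 and SD is again 2.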
On the above basis one can easily draw a network of all possible oxidative folding pathways. For a protein of n cysteines, we first generate the graphs (adjacency matrices) of all possible intermediates, i.e. those with 0,1...(i ≤ n/2) disulfide bridges. Then we compare all pairs of intermediates in terms of the above parameters. A shuffling edge will be drawn between two intermediate states if |SD| = 2 and ΔNB = 0; redox edges will be drawn if |SD| = 2 and ΔNB = 1. If the values of |SD| and ΔNB are different from these two combinations, no edge will be drawn.
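The procedure can be sketched in a few lines. This is an illustrative reading of the rule as stated, not the authors' visualization template: states are stored as sets of cysteine pairs rather than matrices, SD is computed as the number of cysteines whose bonding status differs, and all function names are hypothetical.

```python
from itertools import combinations


def intermediates(n_cys):
    """All disulfide intermediates with 0 .. n_cys // 2 bonds, each exactly once."""
    def rec(free):
        if not free:
            yield frozenset()
            return
        first, rest = free[0], free[1:]
        yield from rec(rest)  # the first cysteine stays free
        for k, partner in enumerate(rest):  # or it pairs with a later one
            for tail in rec(rest[:k] + rest[k + 1:]):
                yield tail | {(first, partner)}
    return list(rec(tuple(range(1, n_cys + 1))))


def bonded(state):
    """Set of cysteines that take part in a disulfide bridge."""
    return {c for bond in state for c in bond}


def edge_type(s1, s2):
    """Apply the stated rule: |SD| = 2 with delta NB = 0 (shuffling) or 1 (redox)."""
    sd = len(bonded(s1) ^ bonded(s2))
    d_nb = abs(len(s1) - len(s2))
    if sd == 2 and d_nb == 0:
        return "shuffling"
    if sd == 2 and d_nb == 1:
        return "redox"
    return None


def network(n_cys):
    """Return the node list and a list of (state, state, edge type) triples."""
    states = intermediates(n_cys)
    edges = []
    for s1, s2 in combinations(states, 2):
        kind = edge_type(s1, s2)
        if kind is not None:
            edges.append((s1, s2, kind))
    return states, edges


states, edges = network(4)
print(len(states), "states,", len(edges), "edges")
```

Note that the rule as written compares only which cysteines are bonded, not their partners, so a pair such as 1–2 versus 1–3, 2–4 also satisfies it. A stricter variant would additionally require that the two states share all other bonds; whether such a filter is intended is not stated in the text.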
[Table: Parameters of oxidative folding networks. Columns: N of cysteines; N of intermediates (nodes); total no. of transitions (edges); clustering coefficient C; average path length.]
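The two graph measures named in the table header can be computed directly from such a network. The sketch below is an assumption-laden illustration: it uses the standard definitions (mean local clustering coefficient, with nodes of degree below 2 contributing zero, and mean shortest-path length over all connected node pairs), which may differ in detail from the definitions used for the published values, and it builds the network by applying the edge rule of the text literally. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def intermediates(n_cys):
    """All disulfide intermediates as frozensets of (i, j) cysteine pairs."""
    def rec(free):
        if not free:
            yield frozenset()
            return
        first, rest = free[0], free[1:]
        yield from rec(rest)
        for k, partner in enumerate(rest):
            for tail in rec(rest[:k] + rest[k + 1:]):
                yield tail | {(first, partner)}
    return list(rec(tuple(range(1, n_cys + 1))))


def build_graph(n_cys):
    """Adjacency sets using the rule SD = 2 and delta NB of 0 or 1."""
    states = intermediates(n_cys)
    bonded = {s: {c for b in s for c in b} for s in states}
    adj = {s: set() for s in states}
    for s1, s2 in combinations(states, 2):
        if len(bonded[s1] ^ bonded[s2]) == 2 and abs(len(s1) - len(s2)) <= 1:
            adj[s1].add(s2)
            adj[s2].add(s1)
    return adj


def clustering_coefficient(adj):
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # such nodes contribute zero
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over ordered pairs, by breadth-first search."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


graph = build_graph(3)
print(clustering_coefficient(graph), average_path_length(graph))
```

For three cysteines the four states are all mutually connected, so both measures equal 1.0; larger networks can be examined by changing the argument of `build_graph`.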
Panel B shows a peptide with an odd number of cysteines, such as granulocyte-colony stimulating-factor [23, 24], in which the native state contains one free cysteine residue. In this case there are shuffling edges even in the lowest plane in the figure, so the native state (one of the states in the lowest plane) can be subject to shuffling transitions. On the contrary, if the number of cysteines is even (i.e. in the majority of known proteins), the fully oxidized DISs cannot interconvert into each other in one step. In some cases, however, an additional cysteine is in fact used to facilitate the process of oxidative folding: the propeptide of BPTI contains an additional free cysteine that seems to significantly speed up the in vitro folding of the molecule. In vivo, the propeptide is subsequently cleaved, and in this way the structure is locked into the native disulfide configuration.
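The difference between odd and even numbers of cysteines can be checked by counting shuffling edges among the maximally oxidized states only. This is a minimal sketch under the assumption that a shuffling edge is any pair of states with equal bond count in which exactly two cysteines change bonding status; the function names are hypothetical.

```python
from itertools import combinations


def max_oxidised_states(n_cys):
    """States carrying the maximum possible number of bonds, n_cys // 2."""
    def rec(free):
        if not free:
            yield frozenset()
            return
        first, rest = free[0], free[1:]
        yield from rec(rest)  # the first cysteine stays free
        for k, partner in enumerate(rest):
            for tail in rec(rest[:k] + rest[k + 1:]):
                yield tail | {(first, partner)}
    full = n_cys // 2
    return [s for s in rec(tuple(range(1, n_cys + 1))) if len(s) == full]


def shuffling_edges_in_lowest_plane(n_cys):
    """Count state pairs in the most oxidized plane that differ in two cysteines."""
    bonded = [frozenset(c for b in s for c in b)
              for s in max_oxidised_states(n_cys)]
    return sum(1 for x, y in combinations(bonded, 2) if len(x ^ y) == 2)


for n in (3, 4, 5, 6):
    print(n, shuffling_edges_in_lowest_plane(n))
```

With an even number of cysteines every cysteine is bonded in the fully oxidized states, so no pair differs in bonding status and the count is zero; with an odd number the free cysteine can change position and the count is positive.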
Disulfide intermediates experimentally observed in the oxidative folding of various proteins
Bovine pancreatic trypsin inhibitor (BPTI): 3–5; 1–6; 3–5, 1–2; 3–5, 1–4; 3–5, 2–4; 1–6, 2–4; 3–5, 1–6; 1–6, 3–5, 2–4
Insulin-like growth factor (IGF): 2–6; 2–6, 3–5; 2–6, 1–4; 2–6, 4–5; 2–6, 1–3; 2–6, 1–3, 4–5; 1–4, 2–6, 3–5
Epidermal growth factor (EGF): 2–3; 1–2; 4–6; 5–6; 3–4; 2–4, 5–6; 2–5, 3–4; 1–6, 2–5, 3–4; 1–2, 3–4, 5–6; 1–3, 2–4, 5–6
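Whether a set of observed intermediates forms a contiguous pathway can be tested by a simple graph search. The sketch below uses the BPTI intermediates listed in the table and makes two assumptions: the fully reduced state (no bonds) is added as the starting point, and the three-disulfide entry 1–6, 3–5, 2–4 is taken as the native state. Edges follow the rule given earlier (two cysteines change bonding status, and the bond count changes by at most one). All names are hypothetical.

```python
from collections import deque


def bonded(state):
    """Set of cysteines that take part in a disulfide bridge."""
    return {c for bond in state for c in bond}


def connected(s1, s2):
    """Edge rule of the text: SD = 2 and delta NB of 0 (shuffling) or 1 (redox)."""
    return len(bonded(s1) ^ bonded(s2)) == 2 and abs(len(s1) - len(s2)) <= 1


REDUCED = frozenset()  # assumed starting state, not listed in the table
NATIVE = frozenset({(1, 6), (2, 4), (3, 5)})  # assumed native state
OBSERVED = [frozenset(s) for s in (
    {(3, 5)}, {(1, 6)},
    {(1, 2), (3, 5)}, {(1, 4), (3, 5)}, {(2, 4), (3, 5)},
    {(1, 6), (2, 4)}, {(1, 6), (3, 5)},
    {(1, 6), (2, 4), (3, 5)},
)]


def reachable(start, nodes):
    """Breadth-first search restricted to the given set of states."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in nodes:
            if v not in seen and connected(u, v):
                seen.add(v)
                queue.append(v)
    return seen


nodes = [REDUCED] + OBSERVED
seen = reachable(REDUCED, nodes)
print("native reached:", NATIVE in seen)
print("all observed states reached:", all(s in seen for s in OBSERVED))
```

Both checks succeed for the BPTI entries, which is what a contiguous pathway from the reduced to the native state requires. The same test can be applied to the IGF and EGF rows by replacing `OBSERVED` and `NATIVE`.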
The oxidative folding space of polypeptides can be represented as networks in which the nodes are the disulfide intermediates while the edges are transitions between them. A simple visualization tool was developed to draw 3D pictures of such networks in which the states having the same number of disulfide bridges are placed on separate planes. These pictures provide a simple method for the visualization of oxidative folding pathways as studied by experimental methods. In the case of bovine pancreatic trypsin inhibitor, insulin-like growth factor and epidermal growth factor, the folding pathways appear as a small network of contiguous routes that connect the fully reduced state to the native state. A further plausible extension of this method would be to colour the folding states by quantitative properties and to look for correlations between the coloured areas of the network and the experimentally determined folding pathways.
Even though the topology of the theoretically complete folding space appears to be highly regular, data currently available are insufficient to draw general conclusions on the topology of the experimentally observed folding pathways. Experimentalists find folding intermediates as a series of chromatographic peaks, and usually the disulfide bridges of more abundant species are analysed first. The question whether or not all the relevant intermediates have been analyzed is difficult to answer, and mapping the intermediates onto the graphs presented here may help one to decide whether or not the pathways found are contiguous.
The authors are grateful for the comments of Drs. István Simon (Institute of Enzymology, Hungarian Academy of Sciences, Budapest) and Alessandro Pintar (ICGEB, Trieste). The work was supported by the Hungarian Office of Research and Development (OMFB-01887/2002, OMFB-00299/2002). S. P. is recipient of the Szent-Györgyi Award for teaching at the Department of Genetics and Molecular Biology, University of Szeged.
- Dobson CM, Karplus M: The fundamentals of protein folding: bringing together theory and experiment. Curr Opin Struct Biol 1999, 9: 92–101. 10.1016/S0959-440X(99)80012-8
- Dinner AR, Sali A, Smith LJ, Dobson CM, Karplus M: Understanding protein folding via free-energy surfaces from theory and experiment. Trends Biochem Sci 2000, 25: 331–339. 10.1016/S0968-0004(00)01610-8
- Pain RH: Mechanisms of Protein Folding. 2nd edition. Oxford, New York, Oxford University Press; 2000:433.
- Onuchic JN, Socci ND, Luthey-Schulten Z, Wolynes PG: Protein folding funnels: the nature of the transition state ensemble. Fold Des 1996, 1: 441–450.
- Levinthal C: Are there pathways in protein folding? J Chim Phys 1968, 65: 44–45.
- Chang JY: Evidence for the underlying cause of diversity of the disulfide folding pathway. Biochemistry 2004, 43: 4522–4529. 10.1021/bi0360354
- Wedemeyer WJ, Welker E, Scheraga HA: Proline cis-trans isomerization and protein folding. Biochemistry 2002, 41: 14637–14644. 10.1021/bi020574b
- Welker E, Wedemeyer WJ, Narayan M, Scheraga HA: Coupling of conformational folding and disulfide-bond reactions in oxidative folding of proteins. Biochemistry 2001, 40: 9059–9064. 10.1021/bi010409g
- Tu BP, Weissman JS: Oxidative protein folding in eukaryotes: mechanisms and consequences. J Cell Biol 2004, 164: 341–346. 10.1083/jcb.200311055
- Vishveshwara S, Brinda KV, Kannan N: Protein Structure: Insights from Graph Theory. Journal of Theoretical and Computational Chemistry 2002, 1: 187–211. 10.1142/S0219633602000117
- Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22: 2577–2637. 10.1002/bip.360221211
- Plaxco KW, Simons KT, Baker D: Contact order, transition state placement and the refolding rates of single domain proteins. J Mol Biol 1998, 277: 985–994. 10.1006/jmbi.1998.1645
- Magyar C, Tudos E, Simon I: Functionally and structurally relevant residues of enzymes: are they segregated or overlapping? FEBS Lett 2004, 567: 239–242. 10.1016/j.febslet.2004.04.070
- Selvaraj S, Gromiha MM: Importance of hydrophobic cluster formation through long-range contacts in the folding transition state of two-state proteins. Proteins 2004, 55: 1023–1035. 10.1002/prot.20109
- Vendruscolo M, Paci E, Dobson CM, Karplus M: Three key residues form a critical contact network in a protein folding transition state. Nature 2001, 409: 641–645. 10.1038/35054591
- Vendruscolo M, Paci E, Karplus M, Dobson CM: Structures and relative free energies of partially folded states of proteins. Proc Natl Acad Sci U S A 2003, 100: 14817–14821. 10.1073/pnas.2036516100
- Albert R, Jeong H, Barabasi AL: Error and attack tolerance of complex networks. Nature 2000, 406: 378–382. 10.1038/35019019
- Dorogovtsev SN, Mendes JFF: Evolution of Networks: From Biological Nets to the Internet and WWW (Physics). Oxford, New York, Oxford University Press; 2003:344.
- Albert R, Barabasi AL: Statistical mechanics of complex networks. Rev Mod Phys 2002, 74: 47–97. 10.1103/RevModPhys.74.47
- Scala A, Amaral LAN, Barthelemy M: Small-world networks and the conformation space of a short lattice polymer chain. Europhys Lett 2001, 55: 594–600. 10.1209/epl/i2001-00457-7
- Dokholyan NV, Shakhnovich B, Shakhnovich EI: Expanding protein universe and its origin from the biological Big Bang. Proc Natl Acad Sci U S A 2002, 99: 14132–14136. 10.1073/pnas.202497999
- Eigen M: On the nature of virus quasispecies. Trends Microbiol 1996, 4: 216–218. 10.1016/0966-842X(96)20011-3
- Cantrell MA, Anderson D, Cerretti DP, Price V, McKereghan K, Tushinski RJ, Mochizuki DY, Larsen A, Grabstein K, Gillis S, et al.: Cloning, sequence, and expression of a human granulocyte/macrophage colony-stimulating factor. Proc Natl Acad Sci U S A 1985, 82: 6250–6254.
- Werner JM, Breeze AL, Kara B, Rosenbrock G, Boyd J, Soffe N, Campbell ID: Secondary structure and backbone dynamics of human granulocyte colony-stimulating factor in solution. Biochemistry 1994, 33: 7184–7192.
- Weissman JS, Kim PS: The pro region of BPTI facilitates folding. Cell 1992, 71: 841–851. 10.1016/0092-8674(92)90559-U
- Creighton TE: The disulfide folding pathway of BPTI. Science 1992, 256: 111–114.
- Weissman JS, Kim PS: Reexamination of the folding of BPTI: predominance of native intermediates. Science 1991, 253: 1386–1393.
- Hober S, Uhlen M, Nilsson B: Disulfide exchange folding of disulfide mutants of insulin-like growth factor I in vitro. Biochemistry 1997, 36: 4616–4622. 10.1021/bi9611265
- Milner SJ, Carver JA, Ballard FJ, Francis GL: Probing the disulfide folding pathway of insulin-like growth factor-I. Biotechnol Bioeng 1999, 62: 693–703.
- Yang Y, Wu J, Watson JT: Probing the folding pathways of long R(3) insulin-like growth factor-I (LR(3)IGF-I) and IGF-I via capture and identification of disulfide intermediates by cyanylation methodology and mass spectrometry. J Biol Chem 1999, 274: 37598–37604. 10.1074/jbc.274.53.37598
- Chang JY, Li L, Lai PH: A major kinetic trap for the oxidative folding of human epidermal growth factor. J Biol Chem 2001, 276: 4845–4852. 10.1074/jbc.M005160200
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.