Representing and comparing protein structures as paths in three-dimensional space
© Zhi et al; licensee BioMed Central Ltd. 2006
Received: 18 May 2006
Accepted: 20 October 2006
Published: 20 October 2006
Most existing formulations of protein structure comparison are based on detailed atomic level descriptions of protein structures and bypass potential insights that arise from a higher-level abstraction.
We propose a structure comparison approach based on a simplified representation of proteins that describes their three-dimensional paths by local curvature along the generalized backbone of the polypeptide. We have implemented a dynamic programming procedure that aligns the curvatures of proteins by optimizing a defined sum of turning angle deviations.
Although our procedure does not directly optimize global structural similarity as measured by RMSD, our benchmarking results indicate that it recovers surprisingly well the structural similarity defined by structure classification databases and traditional structure alignment programs. In addition, our program can recognize similarities between structures with extensive conformation changes that are beyond the ability of traditional structure alignment programs. We demonstrate applications of our procedure in several contexts of structure comparison. An implementation of our procedure, CURVE, is available as a public webserver.
Knowledge of a protein's three-dimensional (3-D) structure is a prerequisite to understanding its function at a molecular level. With more than 37,000 protein structures in the rapidly growing public repository PDB, the importance of computer algorithms that can rapidly compare and find remote similarities between these structures cannot be over-emphasized. The comparison of protein structures has been an extremely important problem in structural and evolutionary biology ever since the first few protein structures became available. Hundreds of algorithms for protein structure comparison have been developed; there are several large databases and web resources devoted almost entirely to the problem of comparing and classifying protein structures, such as SCOP [2, 3], CATH [4, 5], and the DALI domain dictionary.
Typically, different representations of protein structure are employed for different contexts of structure comparison. For example, an all-atom protein model is useful when studying finer details of a protein structure, such as the subtle changes in the side-chain conformations of the active site residues upon substrate binding. However, for the rapid comparison of protein structures in order to find global similarities, only one point per residue, often the position of its Cα atom, is generally sufficient. Some programs use completely different representations of protein structures, such as distance matrices, secondary structure vectors, or mesostates of backbone dihedral angles.
All protein structure alignment programs optimize some mathematical definition of structural similarity. The most popular measure of structural similarity is the root mean squared deviation (RMSD) of the aligned atoms and its variants. In general, alignments optimizing different measures of structural similarity may be different from each other. Moreover, structural alignment is an NP-hard computational problem, and in order to solve it in a realistic time various heuristics have been developed, such as lowering the dimensionality of the problem by identifying 7 × 7 residue interaction patterns in DALI, describing the protein as a set of vectors based on secondary structure elements in VAST, or using local structural similarities to identify short aligned fragment pairs (AFPs), which are used later to construct the alignment in methods such as CE and FATCAT.
Since algorithms that optimize RMSD dominate the field of structure comparison, they create a misconception that only structures that can be superimposed with reasonable RMSD criteria, such as low RMSD over a large number of residues of the proteins, should be considered similar. While this is a pragmatic definition of structural similarity that eliminates an excess of false-positive matches, it fails to find similarities between structures with extensive conformation changes including structures with internal rearrangements and/or with swapped elements between domains. The recent years have seen advances in algorithms that can align protein structures assuming flexibility of their polypeptide chains [14, 15]. Expert-curated structure classifications (such as SCOP and CATH) have dealt with this problem indirectly, by using highly abstracted, but not precisely defined, views of protein structure (fold) and by grouping together protein structures based on a combination of sequence, structural, functional, and evolutionary information. The rapid accumulation of new structures, however, outpaces the manual curation efforts, and automatic means of detecting structural similarities, which are beyond the scope of RMSD-based structure alignment programs, are becoming essential.
In this manuscript, we propose a very general abstraction of protein structure that views it as a path in 3-D space. We describe a novel dynamic programming algorithm that compares structures by aligning their turning angle series, and we compare our results with the structural similarity defined by the SCOP database. Surprisingly, even at this clearly oversimplified level of protein structure description, our benchmarking results are in good agreement with the SCOP classification and existing structure alignment programs. Owing to the flexibility encoded in our formulation, we demonstrate that our method can find uses in assessing structure predictions, comparing structures with extensive distortion, modeling structure families, and revealing potential remote homology.
Aligning angle series along smoothed backbone
Our approach for abstracting the protein structure is inspired by an earlier work from our group and the U-turn model. We developed a highly simplified description of protein structure that minimizes local structural information by "smoothing" the protein backbone, leaving only information about whether a protein chain is locally straight or curved. In particular, we "smooth" the protein backbone by averaging Cα positions in a seven-residue window. Chain fragments that remain straight after the smoothing procedure are denoted as generalized secondary structure elements. Local secondary structural information is partially lost, and protein structure is abstracted to a path in 3-D space, which for a typical protein structure winds through space by following a straight line for 5–12 residues, then turning in a typically 4–5 residue turn, only to assume a straight course for another 5–12 residues.
The idea of 1-D geometric descriptions of structures and the dynamic programming alignment methods have been explored previously [20, 21], including the use of curvature and torsion angles of the backbone to describe the local chain structure . However, these methods differ significantly from our method at the level of structure abstraction in that they typically provide a much richer and more detailed description of the protein. Our method focuses only on turning angles in a generalized (smoothed) protein backbone, thus providing a somewhat minimal structure description. The idea of protein backbone smoothing was explored before in different contexts [23, 24]. Interestingly, as we will show in this manuscript, in the world of natural protein structures, this minimal information encoded in our representation is often sufficient to recognize similarity between structures.
We have implemented a dynamic programming procedure as a computer program, CURVE, which compares structures by aligning the turning angles along their generalized backbones. Below we first present our benchmarking results, and then discuss applications of CURVE in different contexts of structure comparison.
The angle series alignment mostly agrees with existing measures of structural similarity
Benchmarking result of CURVE compared against CE and CTSS.
In our second experiment, we take representative structural domains from three of the major SCOP classes, all-alpha, all-beta, and alpha-and-beta (a/b) proteins, and compare them against all structures in a 90% non-redundant set of SCOP version 1.65 (SCOP165_90)  which contained 8666 structures.
Immunoglobulins are a large family of proteins with 404 structures represented in the SCOP165_90 set. Our second example is a search with one of the immunoglobulin structures, 1clo:h1. CURVE identified 376 immunoglobulins among the top 400 hits. Of the 24 false positives, 23 belong to the all-beta class; the only non-all-beta hit belongs to the a+b class.
Arguably, these false positives are a result of our simplistic affine gap penalty scheme: most of them have a large number of residues in gaps. In fact, if one requires the number of residues in gaps to be less than 30, the top 400 hits contain only 9 non-immunoglobulin hits, and all of them are from the all-beta class.
Our third search is with 1b7b:A, which is a relatively small a/b protein, from the Carbamate kinase-like fold. CURVE scores all structures from this fold higher than structures from other folds.
In summary, we found that although CURVE uses very limited structural information, its performance is comparable to that of CTSS (another curvature-based method) and even to that of full-fledged structure comparison programs. In the database search setting, CURVE recognized the query's structure family for every example structure we tested: the top-scoring hits returned by CURVE are indeed from the same families.
Benchmarking against existing structure alignment programs
We would also like to point out that a major advantage of CURVE is its speed. The CURVE algorithm resembles the standard Smith-Waterman algorithm, with time complexity O(nm), where n and m are the lengths of the two backbone chains being compared. In our benchmarking test, CURVE took a total of 4629 CPU hours on a cluster with 1 GHz Pentium III CPUs. This is comparable to LSQMAN, which was reported to take 1790 hours on 2.6 GHz CPUs (Table 5 of ). Considering that CURVE is currently implemented as a prototype in Perl, and that a reimplementation in C/C++ would typically be dramatically faster, this result suggests that CURVE can be used as a tool for fast filtration of potential hits before the application of more time-consuming structure alignment programs.
Evaluating structure predictions
Notably, the CURVE score and the GDT_TS measure of some predicted structures do not agree: they have either a high CURVE score and a low GDT_TS score, or vice versa. We examine both types of disagreement. We find that predictions with a large CURVE score but a low GDT_TS measure, such as T0251TS122_1, correspond to predictions where the fold has been predicted correctly but the predicted secondary structure elements are shifted. On the other hand, predictions with a good GDT_TS measure but a low CURVE score, such as T0251AL164_1, are models that fail to predict the overall topology. Nonetheless, we recognize that the evaluation of structure prediction is a subjective and difficult task. Our test demonstrates that CURVE provides a perspective on such evaluations that is complementary to the one defined by RMSD-based criteria.
Describing differences among NMR conformers
Recognizing similarity between drastically different conformations of the same protein
Some proteins assume drastically different structural conformations under different conditions (such as binding to different substrates) to fulfill their functions. The similarity between different structural conformations of a protein can go beyond what traditional RMSD-based structure alignment tools can recognize. Below we demonstrate that our method is particularly suited to identifying similarities between structures with such conformation changes.
In the second example we compare structures of the immunoglobulin-binding domain B1 of streptococcal protein G (GB1), a favorite subject for studies of protein folding and design. Mutants of GB1 are reported to adopt conformations very different from that of the wild type. The wild type structure (PDBID: 3gb1) contains an α-helix and a four-stranded β-sheet made of two β-hairpins, one N-terminal and the other C-terminal to the α-helix. The structure of mutant HS#124F26A (PDBID: 1q10) reveals a domain-swapped dimer that involves exchange of the second β-hairpin. The resulting overall structure comprises an eight-stranded β-sheet whose concave side is covered by two α-helices. CURVE alignment reveals that the most significant angle change happens in the region between the α-helix and the second β-hairpin (Figure 7); all secondary structures remain mostly unchanged. In both cases, the conformation change results in structures that can be aligned only with large RMSD, while the changes in turning angles are modest. In such cases, aligning structures by directly optimizing RMSD may not be a good choice. CURVE alignment directly captures the backbone turning angle changes associated with the conformational changes, which, we believe, is a better choice.
Turning-angle profile model of a structure family
We have shown that the CURVE representation not only reveals similarity between structures, but also pinpoints the regions where they differ. We explore the generalization of the CURVE representation to the modeling of multiple structure alignments of a protein family.
We obtain a multiple structure alignment of 10 structures of the triosephosphate isomerase family of the TIM-barrel topology from the expert-curated structure alignment database HOMSTRAD. We overlay the turning angle profiles of these structures according to the HOMSTRAD alignment in Figure 8. In addition to the angle-variability index, we also calculate the sequence conservation index, defined as the number of matching pairs of amino acids in a given column of a multiple alignment. Not surprisingly, there is a negative correlation between the angle-variability index and the sequence conservation index.
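The sequence conservation index described above can be illustrated with a short Python sketch. This is only one reading of the definition: the function names `conservation_index` and `conservation_profile` are hypothetical, and ignoring gap characters (`-`) is an assumption, since the text does not say how gaps are treated.

```python
from collections import Counter


def conservation_index(column, gap="-"):
    """Count pairs of identical residues in one alignment column.

    A residue type seen c times contributes c * (c - 1) / 2 matching pairs.
    Gap characters are skipped (an assumption, not stated in the text).
    """
    counts = Counter(res for res in column if res != gap)
    return sum(c * (c - 1) // 2 for c in counts.values())


def conservation_profile(aligned_seqs, gap="-"):
    """Apply conservation_index to every column of equal-length aligned rows."""
    lengths = {len(seq) for seq in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [conservation_index(col, gap) for col in zip(*aligned_seqs)]
```

For a column of ten identical residues the index reaches its maximum of 45 pairs; a column with ten different residues scores 0.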
Revealing similarities between structures from distinct folds but sharing structural (and often functional) similarities
The results presented in our manuscript bring up an interesting question: Since turning angle curve similarity is only a necessary, but not sufficient condition for structural similarity, why does CURVE alignment work so well? We postulate that this is because most natural proteins are constrained into a compact shape, and thus for a given turning angle series, there are only a small number of ways to arrange them into a realistic compact shape. For example, turning angle series cannot distinguish between right-handed and left-handed β/α/β units. Fortunately, right-handed β/α/β connections dominate over left-handed ones in naturally occurring proteins.
Our result also raises another interesting question on the structural constraints of protein evolution. For most structures, changes in their sequences caused by mutations such as substitutions and minor insertions or deletions only result in subtle changes in structure with the overall 3-D shape of the structure largely being preserved. However, for some structures, such as the GB1 protein, small mutations can result in a drastic change of their structural conformation and CURVE can be useful in detecting such changes.
The angle series alignment has some interesting implications. Traditional structure alignments have never been like sequence alignment. While sequence alignments typically define an edit distance, a score defined by a procedure via which one can transform one sequence into the other, existing structure alignment programs optimize RMSD of a superimposed subset of residues among structures. The result of such a structure alignment does not provide a series of operations that transform one structure into the other. Angle series alignment produces a set of angle matches that could be interpreted as a series of operations for structural transformation. Naively, one can bend every angle of one structure to the corresponding angle of the other structure. To derive a set of realistic backbone-bending operations, one needs to consider the stereochemical constraints of the backbone and correlation of the turning angles.
The current prototype implementation of angle series alignment can certainly be improved by incorporating additional information. For example, because of the smoothing procedure, the alignment of angle curves can only give an alignment with a limited resolution. It is possible to implement an iterative refinement scheme that starts with an overall alignment of angle series based on a large smoothing radius, and then iteratively refines the alignment by considering angle series based on smaller smoothing radii. Since the angle series is only a "planar" feature, adding 3-D features such as handedness information will help (in cases where distinguishing between left and right is important).
In this paper we introduce the turning angle series along the smoothed backbone of the polypeptide as a new descriptor of protein structure. We demonstrate its utility in defining structural similarity by implementing and testing an alignment program, CURVE, based on this feature. Our results show that this simple approach works surprisingly well. Although CURVE does not directly optimize RMSD, its results generally agree with the SCOP structure classification and traditional structural alignment programs. Benchmarking results showed that CURVE's performance is comparable to that of popular structure alignment programs such as LSQMAN, while CURVE runs significantly faster. Moreover, CURVE can reveal similarities between drastically different conformations of the same protein structure, which is beyond the scope of traditional structure alignment programs. In aligning structures from different SCOP folds, CURVE demonstrates its potential in identifying remote structural similarity.
Backbone smoothing and turning angles
Our backbone smoothing procedure follows that of . We assign the center of gravity of every k consecutive Cα atoms as a new pseudo-Cα atom. With a proper choice of k, the resulting chain of pseudo-Cα atoms smooths out the local "wiggles" due to the zigzags in β-strands or the spiral patterns in α-helices and reveals the global fold of the protein structure as a smooth curve in 3-D space. Thus, we refer to the chain of pseudo-Cα atoms as a smoothed backbone (Figure 1A). Our smoothing procedure suppresses the local high-frequency curvature signals that arise from the local periodicity of the backbone, and thus reveals the overall topology of the structure.
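As an illustration, the smoothing step can be sketched in a few lines of Python with NumPy. This is a minimal sketch rather than the CURVE implementation: the function name `smooth_backbone`, the window that slides one residue at a time, and the default k = 7 (taken from the seven-residue window mentioned earlier) are assumptions made for this example.

```python
import numpy as np


def smooth_backbone(ca_coords, k=7):
    """Return pseudo-C-alpha positions: centroids of k consecutive C-alpha atoms.

    ca_coords: array-like of shape (n, 3), one C-alpha position per residue.
    The result has n - k + 1 rows, one per window position.
    """
    ca = np.asarray(ca_coords, dtype=float)
    if ca.ndim != 2 or ca.shape[1] != 3:
        raise ValueError("ca_coords must have shape (n, 3)")
    n = len(ca)
    if n < k:
        raise ValueError("chain is shorter than the smoothing window")
    # Centroid (unweighted center of gravity) of each window of k atoms.
    return np.array([ca[i:i + k].mean(axis=0) for i in range(n - k + 1)])
```

With this convention a chain of n residues yields n - k + 1 pseudo-Cα atoms, so the ends of the chain are shortened by the smoothing.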
We define the turning angle at each pseudo-Cα atom along the smoothed backbone so that it reflects medium-level topological features around that atom. Ideally, this turning angle should be close to 180° in the middle of a long straight segment along the smoothed backbone and small (close to 0°) at a sharp turn such as a β-hairpin. Also following the definition in , we define the turning angle at residue i as the angle between the two vectors [i-d+1,i-d] and [i+d-1,i+d]. The value of d determines the span of the angle definition; thus d is called the angle defining distance. Assuming that d is small, so that the fragment from residue i-d to i+d is almost planar and the torsional angles are negligible, this definition can be interpreted as the integral of the curvature function of the chain over this local interval.
We experimented with different choices of d (Figure 1B). Small values of d make all angles indistinguishably large: they capture only local turns and are unable, for example, to describe the 180° turn in anti-parallel β-sheets. Large values of d, however, are uninformative for revealing local curvature. Figure 1B demonstrates the effect of d on the shape of the angle series curve: with increasing d, the first two plateaus in the curve for d = 1 dissolve into narrower and lower peaks, while the valley between them becomes deeper and wider. We choose d = 3 since it is the smallest value that gives a good dynamic range of angle values.
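The angle definition can be made concrete with a small Python sketch. It assumes that [i-d+1,i-d] denotes the vector pointing from pseudo-atom i-d+1 to pseudo-atom i-d, and [i+d-1,i+d] the vector from i+d-1 to i+d; under this reading a straight segment gives 180° and a hairpin gives an angle near 0°, as the text requires. The function name `turning_angles` and the use of NaN for positions too close to the chain ends are choices made for this example only.

```python
import numpy as np


def turning_angles(points, d=3):
    """Turning angle (degrees) at each point of a smoothed backbone.

    points: array-like of shape (n, 3) of pseudo-C-alpha positions.
    d: angle defining distance. Positions with fewer than d neighbours
    on either side are left as NaN.
    """
    p = np.asarray(points, dtype=float)
    n = len(p)
    angles = np.full(n, np.nan)
    for i in range(d, n - d):
        u = p[i - d] - p[i - d + 1]  # vector [i-d+1, i-d], pointing backwards
        v = p[i + d] - p[i + d - 1]  # vector [i+d-1, i+d], pointing forwards
        denom = np.linalg.norm(u) * np.linalg.norm(v)
        if denom == 0.0:
            continue  # degenerate (coincident points): leave NaN
        cos_angle = np.clip(np.dot(u, v) / denom, -1.0, 1.0)
        angles[i] = np.degrees(np.arccos(cos_angle))
    return angles
```

The clipping step only guards against rounding errors that would push the cosine slightly outside [-1, 1].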
It is worth noting some general features of the angle series description of a protein structure. First, the plateaus and peaks (regions with high angle values) correspond to straight parts after smoothing, often long secondary structure elements or generalized secondary structure elements. For instance, in Figure 1B, the first plateau/peak corresponds to the first α-helix in the structure and the second peak corresponds to the β-strand after the first α-helix. However, some straight segments do not correspond to classical secondary structure elements; we call such regions generalized secondary structure elements. Second, the valleys correspond to points where the path changes direction. The turning angle series is a rich description of the chain topology that also includes detailed turning characteristics, such as the length of the turn and the type of secondary structure (or generalized secondary structure) elements, with the latter described by the density of points along the smoothed chain.
In order to uniquely specify a path in three dimensions, both curvature and torsion angles would be required. The information about the torsion angles is lost in the representation of the path used here; therefore, it cannot distinguish whether the next straight element after a turn would be to the left or right of the original element. However, as we have shown in our study, using only the curvature angle series, we can still recognize most cases of the structural similarity between actual protein structures.
Aligning turning angle series
We treat the turning angle series as a sequence of numbers. A natural way to compare such sequences is via dynamic programming. Protein and DNA sequences are described by discrete alphabets and can be aligned by the well-known dynamic programming algorithms [18, 19]. The alignment of sequences of continuous numbers is rarely used in bioinformatics; however, it is very well studied in computer science as the time warp problem. Essentially, given two series indexed by time, the objective of time warp is to find the optimal matching between the points along the two time series. Typically, mismatches are penalized by the squared deviation between the two matched points (see  for a review).
In this study, we employ a standard time warp setting. Given two turning angle series $(a_i)$ and $(b_j)$, the goal is to find a maximally scoring gapped local alignment between them. The total score is the sum of the scores of the matched turning angle pairs minus the gap penalties, for which we adopt the standard affine scheme. We start from a score for matching a pair of angles $a_i$ and $b_j$ of the form $-(a_i - b_j)^2$, i.e., the penalty for aligning two angle values increases quadratically with their difference. To avoid over-penalizing a large angle difference, the score has a lower cap. If all matching scores were negative, the optimal local alignment would be of zero length. To encourage longer alignments, the matching score is therefore augmented by a default reward set by the parameter $r_0$: any angle difference smaller than $r_0$ is rewarded, and any larger difference is penalized. Thus, the overall score for matching a pair of angles $a_i$ and $b_j$ is:
$$S(a_i, b_j) = r_0^2 - \min\left[(a_i - b_j)^2,\ (1.5\,r_0)^2\right].$$
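The matching score is simple enough to state directly in code. The following Python sketch is an illustration only; the function name `match_score` is hypothetical, angles are assumed to be given in degrees, and the default of 21 for the reward parameter is the value reported for the experiments in this paper.

```python
def match_score(a, b, r0=21.0):
    """Score for matching turning angles a and b (both in degrees).

    The quadratic penalty (a - b)^2 is capped at (1.5 * r0)^2 and offset by
    the reward r0^2, so differences below r0 score positively.
    """
    penalty = min((a - b) ** 2, (1.5 * r0) ** 2)
    return r0 ** 2 - penalty
```

With the default parameter, two identical angles score 441, a difference of exactly 21° scores 0, and any difference of 31.5° or more receives the capped score of -551.25.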
Although our simplistic scoring scheme may produce unrealistic alignments (such as creating large gaps in the middle of secondary structure elements), we find that this scheme produces overall structure alignments that, while not accurate enough for comparative structure modeling, are good enough to reveal the overall structural similarity.
$r_0$ and the gap opening and extension penalties are adjustable parameters. Based on parameter-tuning tests (data not shown), we found that the alignment is not very sensitive to the choices of $r_0$ and the gap penalties as long as the alignment is in the log-phase. In our experiments we choose the default parameters $r_0$ = 21 and gap opening/extension penalties of 300/100. All these procedures are implemented as a program, CURVE, available both via a webserver and as a supplementary file (Additional file 1).
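To show how these pieces fit together, the sketch below computes the optimal local alignment score of two angle series with a Smith-Waterman-style recursion and affine gaps (the three-matrix formulation commonly attributed to Gotoh). It is a hedged illustration and not the CURVE code: the function names are hypothetical, the convention that a gap of length L costs one opening penalty plus L - 1 extension penalties is an assumption, and only the score is returned, without traceback.

```python
import numpy as np


def match_score(a, b, r0=21.0):
    """Capped quadratic matching score for two turning angles (degrees)."""
    return r0 ** 2 - min((a - b) ** 2, (1.5 * r0) ** 2)


def local_alignment_score(a, b, r0=21.0, gap_open=300.0, gap_ext=100.0):
    """Best local alignment score of angle series a and b with affine gaps.

    Runs in O(n * m) time for series of lengths n and m.
    """
    n, m = len(a), len(b)
    H = np.zeros((n + 1, m + 1))          # best score of an alignment ending at (i, j)
    E = np.full((n + 1, m + 1), -np.inf)  # alignments ending with b[j-1] in a gap
    F = np.full((n + 1, m + 1), -np.inf)  # alignments ending with a[i-1] in a gap
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i, j] = max(H[i, j - 1] - gap_open, E[i, j - 1] - gap_ext)
            F[i, j] = max(H[i - 1, j] - gap_open, F[i - 1, j] - gap_ext)
            diag = H[i - 1, j - 1] + match_score(a[i - 1], b[j - 1], r0)
            # The floor at 0 makes the alignment local.
            H[i, j] = max(0.0, diag, E[i, j], F[i, j])
            best = max(best, H[i, j])
    return float(best)
```

For two identical series of ten angles the sketch returns 10 × 21² = 4410; if one extra angle is inserted in the middle of one of the series, the best alignment bridges it with a single gap and scores 4410 - 300 = 4110, provided the inserted angle does not match its neighbours.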
We thank Yuzhen Ye for discussions at an early stage of this project. This research was supported by NIH grants GM63208 and GM62411.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28(1):235–242. 10.1093/nar/28.1.235
- Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995, 247(4):536–540. 10.1006/jmbi.1995.0159
- Andreeva A, Howorth D, Brenner SE, Hubbard TJ, Chothia C, Murzin AG: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res 2004, 32(Database issue):D226–9. 10.1093/nar/gkh039
- Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM: CATH--a hierarchic classification of protein domain structures. Structure 1997, 5(8):1093–1108. 10.1016/S0969-2126(97)00260-8
- Pearl F, Todd A, Sillitoe I, Dibley M, Redfern O, Lewis T, Bennett C, Marsden R, Grant A, Lee D, Akpor A, Maibaum M, Harrison A, Dallman T, Reeves G, Diboun I, Addou S, Lise S, Johnston C, Sillero A, Thornton J, Orengo C: The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis. Nucl Acids Res 2005, 33(suppl_1):D247–251.
- Dietmann S, Holm L: Identification of homology in protein structure classification. Nat Struct Biol 2001, 8(11):953–957. 10.1038/nsb1101-953
- Holm L, Sander C: Protein structure comparison by alignment of distance matrices. J Mol Biol 1993, 233(1):123–138. 10.1006/jmbi.1993.1489
- Madej T, Gibrat JF, Bryant SH: Threading a database of protein cores. Proteins 1995, 23(3):356–369. 10.1002/prot.340230309
- Gong H, Rose GD: Does secondary structure determine tertiary structure in proteins? Proteins 2005, 61(2):338–343. 10.1002/prot.20622
- Rao ST, Rossmann MG: Comparison of super-secondary structures in proteins. J Mol Biol 1973, 76(2):241–256. 10.1016/0022-2836(73)90388-4
- Ortiz AR, Strauss CE, Olmea O: MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. Protein Sci 2002, 11(11):2606–2621. 10.1110/ps.0215902
- Godzik A: The structural alignment between two proteins: is there a unique answer? Protein Sci 1996, 5(7):1325–1338.
- Shindyalov IN, Bourne PE: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 1998, 11(9):739–747. 10.1093/protein/11.9.739
- Ye Y, Godzik A: Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 2003, 19 Suppl 2: II246-II255.
- Shatsky M, Nussinov R, Wolfson HJ: Flexible protein alignment and hinge detection. Proteins 2002, 48(2):242–256. 10.1002/prot.10100
- Jaroszewski L, Godzik A: Search for a new description of protein topology and local structure. Proc Int Conf Intell Syst Mol Biol 2000, 8: 211–217.
- Kolinski A, Skolnick J, Godzik A, Hu WP: A method for the prediction of surface "U"-turns and transglobular connections in small proteins. Proteins 1997, 27(2):290–308. 10.1002/(SICI)1097-0134(199702)27:2<290::AID-PROT14>3.0.CO;2-H
- Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 1970, 48(3):443–453. 10.1016/0022-2836(70)90057-4
- Smith TF, Waterman MS: Identification of common molecular subsequences. J Mol Biol 1981, 147(1):195–197. 10.1016/0022-2836(81)90087-5
- Krissinel E, Henrick K: Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D Biol Crystallogr 2004, 60(Pt 12 Pt 1):2256–2268. 10.1107/S0907444904026460
- Harrison A, Pearl F, Sillitoe I, Slidel T, Mott R, Thornton J, Orengo C: Recognizing the fold of a protein structure. Bioinformatics 2003, 19(14):1748–1759. 10.1093/bioinformatics/btg240
- Can T, Wang YF: Protein structure alignment and fast similarity search using local shape signatures. Journal of Bioinformatics and Computational Biology 2004, 2(1):215–239. 10.1142/S0219720004000533
- Aszodi A, Gradwell MJ, Taylor WR: Global fold determination from a small number of distance restraints. J Mol Biol 1995, 251(2):308–326. 10.1006/jmbi.1995.0436
- Taylor WR: Protein structural domain identification. Protein Eng 1999, 12(3):203–216. 10.1093/protein/12.3.203
- Li Z, Ye Y, Godzik A: Flexible structural neighborhood -- a database of protein structural similarities and alignments. Nucl Acids Res 2006, 34(suppl_1):D277–280. 10.1093/nar/gkj124
- Kolodny R, Koehl P, Levitt M: Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. J Mol Biol 2005, 346(4):1173–1188. 10.1016/j.jmb.2004.12.032
- Kleywegt GJ: Use of non-crystallographic symmetry in protein structure refinement. Acta Crystallogr D Biol Crystallogr 1996, 52(Pt 4):842–857. 10.1107/S0907444995016477
- Taylor WR, Orengo CA: Protein structure alignment. J Mol Biol 1989, 208(1):1–22. 10.1016/0022-2836(89)90084-3
- Zemla A: LGA: A method for finding 3D similarities in protein structures. Nucleic Acids Res 2003, 31(13):3370–3374. 10.1093/nar/gkg571
- Wang G, Jin Y, Dunbrack RLJ: Assessment of fold recognition predictions in CASP6. Proteins 2005, 61 Suppl 7: 46–66. 10.1002/prot.20721
- Yamniuk AP, Vogel HJ: Calmodulin's flexibility allows for promiscuity in its interactions with target proteins and peptides. Mol Biotechnol 2004, 27(1):33–57. 10.1385/MB:27:1:33
- Byeon IJ, Louis JM, Gronenborn AM: A captured folding intermediate involved in dimerization and domain-swapping of GB1. J Mol Biol 2004, 340(3):615–625. 10.1016/j.jmb.2004.04.069
- Mizuguchi K, Deane CM, Blundell TL, Overington JP: HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci 1998, 7(11):2469–2471.
- Jaroszewski L, Rychlewski L, Li Z, Li W, Godzik A: FFAS03: a server for profile--profile sequence alignments. Nucleic Acids Res 2005, 33(Web Server issue):W284–8. 10.1093/nar/gki418
- Sankoff D, Kruskal JB: Time warps, string edits, and macromolecules: The theory and practice of sequence comparison. Reading, MA: Addison-Wesley Publishing Company; 1983.
- Waterman MS, Gordon L, Arratia R: Phase transitions in sequence matches and nucleic acid structure. Proc Natl Acad Sci U S A 1987, 84(5):1239–1243. 10.1073/pnas.84.5.1239
- CURVE web server [http://pops.burnham.org/curve]
- Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22(12):2577–2637. 10.1002/bip.360221211
- DeLano WL: The PyMOL Molecular Graphics System. [http://www.pymol.org]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.