TAP score: torsion angle propensity normalization applied to local protein structure evaluation
© Tosatto and Battistutta; licensee BioMed Central Ltd. 2007
Received: 14 March 2007
Accepted: 15 May 2007
Published: 15 May 2007
Experimentally determined protein structures may contain errors and require validation. Conformational criteria based on the Ramachandran plot are mainly used to distinguish between distorted and adequately refined models. While the readily available criteria are sufficient to detect totally wrong structures, establishing the more subtle differences between plausible structures remains more challenging.
A new criterion, called TAP score, measuring local sequence to structure fitness based on torsion angle propensities normalized against the global minimum and maximum is introduced. It is shown to be more accurate than previous methods at estimating the validity of a protein model in terms of commonly used experimental quality parameters on two test sets representing the full PDB database and a subset of obsolete PDB structures. Highly selective TAP thresholds are derived to recognize over 90% of the top experimental structures in the absence of experimental information. Both a web server and an executable version of the TAP score are available at http://protein.cribi.unipd.it/tap/.
A novel procedure for energy normalization (TAP) has significantly improved the possibility to recognize the best experimental structures. It will allow the user to more reliably isolate problematic structures in the context of automated experimental structure determination.
The number of experimentally determined protein three-dimensional (3D) structures deposited in the Protein Data Bank (PDB) is increasing exponentially, and structure determination is being progressively automated. The vast majority of these 3D structures are produced by X-ray crystallography. Given the limited resolution and imperfect phase information often available to the crystallographer, building and refining a protein model is a process that can depend on the experimentalist. Errors are almost unavoidable, and the quality of the refined models has to be evaluated in order to assess their validity. Errors come in various classes and can nowadays range from mistraced segments of the protein to locally incorrect backbone and/or side chain conformations. Identification of such errors can be achieved with a combination of experimental and computational parameters.
Several quality measures based on experimental parameters exist for X-ray crystallography and have been reviewed. X-ray resolution is an index of the quality of the experimental data. It is related to the amount of available data, i.e. to the parameter-to-observation ratio, and is an indirect indicator of the maximum attainable detail of the protein model. In contrast, the R-free value represents a measure of the fit between the refined structure and the electron density map, highlighting the quality of the refinement. The relationship with resolution is not straightforward, but it is generally assumed that higher resolution structures will produce lower R-free values. A different type of information is contained in the Luzzati and σA plots. These estimate the mean positional error for all atoms in the protein model. These parameters require the structure factors, but yield the expected uncertainty of atoms in the protein model as a single number (in Å). A simpler estimate of the mean positional error is the diffraction precision index based on R-free (DPI), which can be computed from the readily available data contained in the PDB files.
A wide range of computational quality parameters have been developed and reviewed over the years [2, 10, 11]. Generally speaking, it is possible to distinguish geometric, energetic and conformational criteria. Geometric criteria are mainly standard values for bond lengths and angles derived from small molecule data. These form strong restraints and are generally enforced during the refinement process, so they possess little validation power. Energetic criteria are based on evaluation of interaction preferences or profiles [12–16]. While these methods can provide insight into the quality of the structure, their interpretation in experimental terms and feedback into the refinement process is rather difficult.
The most promising validation criteria are based on conformational criteria. The best example is the Ramachandran plot of backbone (ϕ,ψ) torsion angles. While each amino acid type may, in theory, adopt a large number of different conformations, large areas of the Ramachandran plot are almost empty. This is due to steric clashes deriving from the local geometry of the polypeptide chain. The main chain (ϕ,ψ) torsion angles are usually not restrained during refinement and this makes the Ramachandran plot a powerful validation tool [2, 18]. Several tools have been developed to estimate the quality of a protein model based on the Ramachandran plot [16, 18–22]. Of these, PROCHECK and WHAT_CHECK are perhaps the most frequently used methods for validation in X-ray crystallography, as they are used for judging structures to be deposited in the PDB, combining several stereochemical checks and measures of torsion angle compatibility. HOPPscore has been recently developed to take into account higher order backbone torsion angle maps.
Several of these methods (e.g. WHAT_CHECK) are able to pinpoint the really wrong structures through a detailed analysis of different aspects of protein structures. Once a structure falls into the range of roughly plausible folds, however, the situation becomes more complicated. It is possible to construct structures with acceptable values for the standard criteria that are largely incompatible with the protein sequence. In the present work we focus on this aspect of experimental structure validation. Given roughly plausible structures, is it possible to quantify the degree of "nativeness" and highlight the best structures?
One possible limitation of the previous methods is the difficulty in establishing a quantitative correspondence scale between different structures. That is, how much is score X for structure A better or worse than score Y for structure B? The answer is not obvious, as the reference state is different for structures A and B. One solution would be a normalization procedure adapted to a particular structure. To the best of our knowledge, this has not been done yet. For this reason, we derive a novel measure of sequence to structure compatibility based on the normalization of torsion angle propensities including the side chain. The normalization involves definition of the global minimum and maximum of the protein sequence. The normalized propensity (called TAP) will be shown to be more accurate than several previous methods at quantifying the degree of "nativeness" of a protein model in terms of commonly used experimental quality parameters on two test sets representing the full PDB database and obsolete PDB structures. A comparison with several energetic criteria on standard protein decoys and theoretical models has already been addressed elsewhere [23, 24].
Baseline comparison on the all PDB set
Table 1. Availability of data for the all PDB set.
Table 2. Correlation coefficients between different experimental quality parameters on the all PDB set.
Table 3. Correlation coefficients for the computational parameters on the all PDB set.
For the TAP score, the intermediate (ϕ,ψ) bin size of 10 degrees shows the highest correlation (see also Figure 1). A similar trend was already observed. This probably represents the best tradeoff between resolving sharp transitions and having enough data to characterize sparsely populated regions of the Ramachandran plot. Note that a larger background distribution, e.g. covering the entire PDB, was excluded to avoid biasing the comparison.
It is apparent from Table 3 that some methods work significantly better than others. The statistical potentials (except the Ramachandran plot based TORS) and several geometric criteria do not yield good correlation coefficients. Performance of the TAP score against R-free is particularly interesting, as the correlation coefficients for R-free are overall lower. In absolute terms, TAP correlates more strongly with R-free (-0.66; see Table 3) than X-ray resolution does (0.62; see Table 2). In order to evaluate the effect of the background distribution on the performance of TAP, a further test was made using the TAP score based on NMR derived torsion angles. The data reported in Table 3 show that, while the use of high-quality data improves performance, TAP-NMR still significantly outperforms many other methods. For the sake of simplicity, further analysis was restricted to TAP and the most diverse parameters. As conformational criteria we have chosen PROCHECK, WC_Rama, WC_Chi1&2, Hopp1, Hopp2 and Hopp5. For the energetic criteria we have restricted our analysis to WC_Pack2, SOLV, RAPDF and TORS.
Detailed comparison on the obsolete PDB set
Quantifying absolute model accuracy
Distribution of parameters for the all PDB set. Minimum, maximum, average and standard deviation are shown.
Threshold levels used to define the experimental quality classes on the all PDB set.
The results for TAP on all three experimental quality classes are expressed in terms of accuracy and coverage (see Materials and Methods) and shown in Table 5 for the all PDB set. As can be expected, it becomes gradually more difficult for TAP to discriminate the structures with increasing quality level. At the same time, coverage drops with increasing TAP threshold. Taking the intersection between both, TAP recognizes ca. 90% of the medium quality structures with 90% accuracy. These values drop to ca. 75% for the high and 35% for the very high quality structures. Even in the latter case it implies a significant enrichment in discrimination with respect to a random predictor. To the best of our knowledge, this type of analysis was not performed before. In the context of automated experimental structure determination, it will allow the user to isolate problematic structures for manual refinement and could prove a valid addition to the PDB data deposition procedures.
Database effects vs. novel approach
The Ramachandran plot, i.e. (ϕ,ψ) torsion angle preferences, has been seen as a powerful tool for validating experimental protein models for a long time [10, 18–20, 22]. Usually, the Ramachandran plot is used only as a rather qualitative tool to discriminate grossly mistraced structures from plausible ones. PROCHECK and HOPPscore both consider a generic Ramachandran plot for the twenty amino acids divided in discrete classes. WHAT_CHECK-2 uses a more sophisticated Z-score analysis of the Ramachandran plot. None of these appears to discriminate effectively the compatibility between sequence and structure, nor the subtle differences between amino acids.
The main advantage of TAP consists in effectively measuring the compatibility of the sequence with the proposed structure in a detailed, quantitative way. Energy score normalization is a novel concept which could be applied because TAP is based on a single body potential. This is not usually applicable to pairwise (or higher order) potentials, where it is difficult to estimate the maximum or minimum interaction between an amino acid and its surroundings. The benefits of energy normalization are apparent from the comparison between TAP and TORS, the torsion angle potential on which it is based. Where TORS gives rough indications, TAP (despite using similar information) has greater accuracy.
Since torsion angles are not generally restrained in X-ray crystallography, this compatibility is orthogonal to the data used in refinement and should be expected to give a good indication of the degree of "nativeness" of the protein model. To support this view, rather than a simple improvement based on database growth, a variant of TAP was derived from NMR data. Even in this case, where the Ramachandran plot is on average rather blurred, TAP-NMR still outperforms other validation tools. This supports the idea that it is capturing the relationship between sequence and structure rather than a tighter clustering in torsion angle space.
The main limitation of the TAP score approach is the independence of subsequent residues in the calculation of the global minimum and maximum for normalization. The minimum and maximum are likely to be overestimated, as compatibility along the polypeptide chain is not guaranteed. This may result in impossibly "knotted" structures having the best pseudo-energy and the native structure being lower in normalized score. In principle, adding information about the preceding residue's (ϕ,ψ) torsion angles would alleviate this situation and has been shown to yield more discriminative pseudo-energies. However, it is only possible to calculate the global optimum for normalization precisely because it is a single body potential. Adding a dependence on the preceding residue would transform it into a two-body potential, making the estimation of the global optimum problematic. Calculating the global optimum on such a two-body potential is an optimization problem in itself.
The discriminative power of torsion angle propensities has implications for the accuracy of empirical force fields such as AMBER or CHARMM. Torsion angle propensities derive from the subtle interactions between neighboring residues which cannot be captured very precisely by currently available physico-chemical models. This knowledge is one of the reasons for the success of modern de novo folding methods based on assembly of short peptide fragments [31–34].
Small changes in the AMBER torsion angle parameter between param94 and param96 have drastic effects on the energy landscape. It may be argued that addition of a Ramachandran plot propensity parameter could improve the capacity of a force field to capture the local geometric details more precisely. This approach is frequently selected in loop modelling [36–38], where it is important to reconcile the structural restraints with sequence preference. The method of Fiser and co-workers in particular uses the CHARMM bonded potential augmented by a Ramachandran plot propensity term and a statistical non-bonded potential to generate loop conformations by minimization. Adding a properly calibrated (ϕ,ψ) torsion angle propensity term to a force field may therefore help to improve convergence in energy minimization for molecular mechanics simulations. The option to calculate such values for every single residue also opens up the interesting possibility of using the TAP score as a tool during refinement of a crystallographic model, in addition to the already available geometric validation tools.
An interesting question is why some crystal structures exhibit higher TAP scores than others. A cursory analysis reveals that the highest scoring structures are short helical bundles, e.g. hemoglobin. The lowest scoring structures are diverse and contain any combination of α-helices and β-sheets. Even in the TOP500H database used for deriving the propensities, the TAP scores vary between 0.781 and 0.907 (avg = 0.847; σ = 0.017). This is comparable to the TAP score of very high experimental quality structures, which varies between 0.754 and 0.889 (avg = 0.822; σ = 0.017). This may be caused by the rough approximation of the global optimum overestimating some folds more than others due to some intrinsic feature, e.g. higher contact order or lower degree of flexibility. A better way to calculate the global optimum would be needed to exclude this possibility. As local interactions appear to impose the selection of certain amino acids at each structural position in the fold, it will however be worth investigating whether proteins whose native structure lies closer to the global optimum could perhaps also have a better sequence to structure compatibility.
We have presented a novel method for the evaluation of the quality of protein models determined by X-ray crystallography, demonstrated both on a large-scale dataset and a set of obsolete PDB structures. The TAP score is based on a relative pseudo-energy calculated simultaneously from the backbone and side chain torsion angle propensities, normalized against the global minimum and maximum for the protein sequence under consideration. Our results show a quantitative relationship between TAP score and the overall quality of experimental structures as expressed in terms of sequence to structure compatibility. TAP score can improve the confidence in quality validation of protein models derived from automated experimental procedures.
Torsion angle potential
The torsion angle pseudo-energy score E_i for the i-th residue of a protein, of amino acid type A(i), is thus a measure of the log propensity that amino acid type A(i) will have torsion angle combination x(i). E_i < 0 indicates that A(i) is favoured relative to the mean of all 20 amino acids, whereas E_i > 0 indicates that it is disfavoured. In order to be applied, both the background distribution and the relevant torsion angles have to be defined.
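As an illustration only, the log propensity described above can be sketched from raw counts. The exact functional form is not reproduced in this text, so the sketch assumes the common definition E = -ln[P(A | x) / P(A)], which is negative when an amino acid is over-represented in a torsion angle bin. The function name, the pseudocount and the toy data are hypothetical and not taken from the TAP software.

```python
import math
from collections import Counter

def torsion_pseudo_energies(observations, pseudocount=1.0):
    """Return {(aa, bin): E} with E = -ln(P(aa | bin) / P(aa)).

    observations: iterable of (amino_acid, torsion_bin) pairs from a
    background set. The pseudocount is an assumption of this sketch; it
    keeps unobserved combinations at a finite pseudo-energy.
    """
    observations = list(observations)
    pair_counts = Counter(observations)
    aa_counts = Counter(aa for aa, _ in observations)
    bin_counts = Counter(b for _, b in observations)
    total = len(observations)
    n_aa = len(aa_counts)

    energies = {}
    for aa in aa_counts:
        p_aa = aa_counts[aa] / total  # overall frequency of this amino acid
        for b in bin_counts:
            # Smoothed frequency of this amino acid within the bin.
            p_aa_given_bin = (pair_counts[(aa, b)] + pseudocount) / (
                bin_counts[b] + pseudocount * n_aa)
            energies[(aa, b)] = -math.log(p_aa_given_bin / p_aa)
    return energies

# Toy background: glycine prefers the "left" bin, alanine the "alpha" bin.
toy = ([("GLY", "left")] * 8 + [("GLY", "alpha")] * 2
       + [("ALA", "alpha")] * 9 + [("ALA", "left")] * 1)
E = torsion_pseudo_energies(toy)
```

In this toy set, `E[("GLY", "left")]` and `E[("ALA", "alpha")]` are negative (favoured), while the two opposite combinations are positive (disfavoured).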
The TOP500H database was chosen to derive the background distribution in order to have a representative subset of high quality structures that is small enough to allow unbiased large-scale benchmarking of the PDB. It is a non-redundant, hand-picked set of 500 high-resolution X-ray crystallographic protein structures resolved to 1.8 Å or better, with no obvious errors and less than 60% sequence identity. In order to assess the effect of the background distribution quality, an alternative ensemble of 609 NMR structures (9,578 models) was also used. As the Ramachandran plot of NMR structures is largely determined by the force field used in refinement, this alternative ensemble contains blurred transitions and serves to highlight the effect of the background distribution on TAP score accuracy (see discussion).
Two free parameters, the choice of torsion angles to represent and the discretization of the data, have to be chosen in order to define the measured torsion angle combinations. The (ϕ,ψ) angles were discretized as either 5, 10 or 20 degree bins. Since other torsion angles are less informative, but still important, a limited number of bins was used to represent the additional torsion angles. Three bins were defined for the ω angle, distinguishing values [-180°, -150°], [+150°, +180°] and the rest. This was found to model the distribution of ω angles, where the cis (0°) state is very rare (except for proline) and the trans state preference is somewhat influenced by the Ramachandran plot and has a slightly bimodal distribution around 180° (data not shown). Both the χ1 and χ2 torsion angles were discretized in eight bins centered on the canonical rotamer preferences. The total number of data points, i.e. non-terminal residues with all torsion angles available from the TOP500H, is 100,245.
where E_min (E_max) is the sum, over all residues i of the sequence, of the lowest (highest) pseudo-energy, i.e. highest (lowest) propensity, torsion angle combination available to the amino acid type of residue i. Note that this definition makes no assumption about the physical plausibility of the overall conformation. Indeed, it is entirely possible that a sequence of minimal (or maximal) states would produce an impossibly "knotted" structure.
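The backbone discretization described above can be sketched as follows. This is a minimal illustration, not the TAP implementation: the function names are hypothetical, angles are assumed to be given in degrees in the range [-180°, +180°], and the eight χ1/χ2 bins are omitted because their boundaries are not given in the text.

```python
def bin_backbone(angle_deg, bin_size=10):
    """Index of the bin holding a phi or psi angle given in degrees.

    Angles are shifted to [0, 360), so +180 wraps into the same bin as -180.
    """
    wrapped = (angle_deg + 180.0) % 360.0
    return int(wrapped // bin_size)

def bin_omega(omega_deg):
    """Three omega classes: [-180, -150], [+150, +180] and the rest."""
    if -180.0 <= omega_deg <= -150.0:
        return 0
    if 150.0 <= omega_deg <= 180.0:
        return 1
    return 2

def torsion_state(phi, psi, omega, bin_size=10):
    """Combine the discretized backbone angles into one hashable state."""
    return (bin_backbone(phi, bin_size), bin_backbone(psi, bin_size),
            bin_omega(omega))

# A typical alpha-helical residue with a trans peptide bond.
helix_state = torsion_state(-57.0, -47.0, 178.0)
```

With 10 degree bins each backbone angle falls into one of 36 bins, so the example residue maps to the state `(12, 13, 1)`.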
The normalized torsion angle propensity TAP gives a rough indication of the degree of "nativeness" of a protein model. The value will be close to 1 for the native structure and close to 0 for structures with largely incompatible sequences. It is therefore a measure of compatibility between sequence and structure.
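A minimal sketch of the normalization follows. The defining equation is not reproduced in this text, so the sketch assumes the form TAP = (E_max - E) / (E_max - E_min), where E is the summed pseudo-energy of the observed conformation. This form is chosen because it yields 1 at the global minimum and 0 at the global maximum, as described above. The function name and the toy energy table are hypothetical.

```python
def tap_score(sequence, states, energies):
    """Normalized torsion angle propensity for one chain (assumed form).

    sequence: amino acid type of each residue.
    states:   observed torsion angle combination of each residue.
    energies: {aa_type: {state: pseudo_energy}} lookup table.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")
    # Each residue is treated independently (single body potential), so the
    # global extrema are simply sums of per-residue extrema.
    e_obs = sum(energies[aa][s] for aa, s in zip(sequence, states))
    e_min = sum(min(energies[aa].values()) for aa in sequence)
    e_max = sum(max(energies[aa].values()) for aa in sequence)
    if e_max == e_min:
        raise ValueError("degenerate energy table: E_max equals E_min")
    return (e_max - e_obs) / (e_max - e_min)

# Toy pseudo-energy table with three torsion states per amino acid type.
toy_energies = {
    "GLY": {"a": -1.0, "b": 0.5, "c": 2.0},
    "ALA": {"a": -2.0, "b": 0.0, "c": 1.0},
}
seq = ["GLY", "ALA", "GLY"]
best = tap_score(seq, ["a", "a", "a"], toy_energies)
worst = tap_score(seq, ["c", "c", "c"], toy_energies)
mixed = tap_score(seq, ["a", "b", "b"], toy_energies)
```

Because the extrema are computed per residue, nothing in this calculation checks that the best-scoring combination of states corresponds to a physically realizable chain.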
In order to evaluate the method for structure quality estimation on a large set, we downloaded the ASTRAL database version 1.69 (December 2005) containing 68,057 domain sequences for 24,978 PDB structures in 2,844 SCOP families. Of these, we considered only the 17,685 structures determined by X-ray crystallography. To avoid bias towards the background distribution used to derive the torsion angle potential, we removed 355 structures belonging to the same SCOP families as the TOP500H structures. The programs failed to load ca. 1–2% of the structures, which contained anomalous data, usually from very old PDB files. The all PDB set is composed of the 13,691 structures with both valid resolution and R-free values and results for all tested methods (see below). For protein complexes in this set, the scores are calculated for every chain and averaged. Luzzati and σA values were only used when derived from cross-validated data, in analogy to R-free. The DPI values were calculated from the PDB structures according to the published formula. Table 1 summarizes the available structures.
The second data set is based on obsolete PDB entries. A list containing pairs of PDB codes of PDB entries rendered obsolete since January 1990 and their replacement was downloaded from the PDB site. This resulted in a set of 494 pairs of PDB codes for which all methods tested produced valid output. The details for both data sets are available as supplementary material.
Methods used for comparison
The comparison with published methods is based on PROCHECK, WHAT_CHECK, HOPPscore and FRST. The programs were either downloaded from the authors' websites or directly requested (HOPPscore). For PROCHECK, the overall G-factor was used. WHAT_CHECK analysis is based on nine overall quality indicators available from the Pdbfinder2 database. These are: overall quality (WC_Qual) expressed as a sum of various terms, torsion angles (WC_Tors), Ramachandran plot appearance (WC_Rama), chirality (WC_Chir), backbone conformation (WC_Back), rotamer normality (WC_Rot), χ1/χ2 rotamer normality (WC_Chi1&2) and 1st and 2nd generation packing quality (WC_Pack1 and WC_Pack2). It should be noted that WC_Pack1 and WC_Pack2 are measures based on contact analysis. For HOPPscore, the five-residue (Hopp5) through single-residue (Hopp1) scores were calculated with default parameters. FRST is a linear combination of four different statistical potentials: a pairwise potential (RAPDF), a solvation potential (SOLV), a simplified count of main chain hydrogen bonds (HYDB) and the torsion angle potential (TORS) on which TAP is built. All five potentials (FRST and its four partial terms) were used for analysis. The PROCHECK, WC_Rama, Hopp1 and TORS scores essentially represent a quantification of the structural fit with the Ramachandran plot.
where σ_old is the standard deviation calculated over all S_old for that particular method.
The experimental and computational parameters are analyzed in terms of the Pearson correlation coefficient cc over the all PDB set. Fraction enrichment FE measures the percentage of good structures recognized at a threshold level t by each method. The structures are first ranked by R-free and by each method. FE at threshold t measures the percentage of structures in common between the top t percent of both lists. For the present work, FE is plotted for thresholds in discrete steps of 5% from 5% to 50%. Intuitively, it becomes progressively easier for methods to have higher FE values at higher threshold levels. For example, a good but not perfect method will be able to detect most of the good structures at t = 50%, but will mostly fail at t = 5%.
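The two measures above can be sketched in a few lines. This is an illustration under stated assumptions: the function names and toy values are hypothetical, lower R-free is taken as better, a higher method score is taken as better by default, and FE is expressed relative to the size of the top-t list.

```python
import math

def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def fraction_enrichment(r_free, scores, t, higher_is_better=True):
    """Percentage overlap between the top t% by R-free and by score."""
    n = len(r_free)
    k = max(1, round(n * t / 100))  # number of structures in the top t%
    top_r_free = sorted(range(n), key=lambda i: r_free[i])[:k]
    top_score = sorted(range(n), key=lambda i: scores[i],
                       reverse=higher_is_better)[:k]
    return 100.0 * len(set(top_r_free) & set(top_score)) / k

# Ten toy structures: R-free values and a score that roughly tracks quality.
r_free = [0.18, 0.20, 0.22, 0.24, 0.26, 0.28, 0.30, 0.32, 0.34, 0.36]
scores = [0.88, 0.90, 0.84, 0.86, 0.80, 0.82, 0.76, 0.78, 0.70, 0.74]
cc = pearson_cc(r_free, scores)
fe_10 = fraction_enrichment(r_free, scores, 10)
fe_50 = fraction_enrichment(r_free, scores, 50)
```

In this toy example the score misses the single best structure at t = 10% but recovers four of the five best at t = 50%, illustrating why FE tends to rise with the threshold.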
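Accuracy and coverage are defined as

$$\mathrm{Accuracy} = \frac{TP}{TP + FP}, \qquad \mathrm{Coverage} = \frac{TP}{TP + FN}$$

where TP are the true positive predictions, i.e. cases where TAP correctly predicts a structure to be of a given quality class. With FP and FN denoting false positives and false negatives in the conventional sense, (TP+FP) are all predictions made by TAP and (TP+FN) are all structures having a given quality class.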
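The evaluation can be sketched as follows, assuming that a structure is predicted to belong to a quality class when its TAP score reaches a chosen threshold. Accuracy is taken as the fraction of predictions that are correct and coverage as the fraction of structures in the class that are recovered, with false positives and false negatives counted in the conventional sense. The function name and toy data are hypothetical.

```python
def accuracy_and_coverage(tap_scores, in_class, threshold):
    """Return (accuracy, coverage) for predictions made at a TAP threshold.

    tap_scores: TAP score of each structure.
    in_class:   True if the structure really belongs to the quality class.
    """
    tp = fp = fn = 0
    for score, is_good in zip(tap_scores, in_class):
        predicted = score >= threshold
        if predicted and is_good:
            tp += 1          # correctly predicted member of the class
        elif predicted:
            fp += 1          # predicted, but not in the class
        elif is_good:
            fn += 1          # in the class, but missed
    accuracy = tp / (tp + fp) if tp + fp else 0.0
    coverage = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, coverage

toy_scores = [0.86, 0.84, 0.83, 0.80, 0.78, 0.75]
toy_class = [True, True, False, True, False, False]
loose = accuracy_and_coverage(toy_scores, toy_class, 0.82)
strict = accuracy_and_coverage(toy_scores, toy_class, 0.85)
```

Raising the threshold from 0.82 to 0.85 in this toy set raises accuracy from 2/3 to 1 while coverage drops from 2/3 to 1/3, the same tradeoff reported for the TAP thresholds.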
Availability and requirements
The TAP software is freely accessible as a web server at http://protein.cribi.unipd.it/tap/. An executable version, written in ANSI C++ and precompiled for Linux machines, is also freely available for academic usage from http://protein.cribi.unipd.it/tap/download.shtml. Please contact the authors to obtain the source code and/or for commercial usage.
The authors are grateful to Alessandro Albiero for initial help with the test set and to Francisco Domingues for critical comments on the manuscript. Flavio Seno is thanked for insightful discussions and Gregory Sims for providing the HOPPscore program. S.T. is funded by a "Rientro dei cervelli" grant from the Italian Ministry for Education, University and Research (MIUR).
- Deshpande N, Addess KJ, Bluhm WF, Merino-Ott JC, Townsend-Merino W, Zhang Q, Knezevich C, Xie L, Chen L, Feng ZK, Green RK, Flippen-Anderson JL, Westbrook J, Berman HM, Bourne PE: The RCSB Protein Data Bank: a redesigned query system and relational database based on the mmCIF schema. Nucleic Acids Research 2005, 33: D233-D237. 10.1093/nar/gki057
- Kleywegt GJ: Validation of protein crystal structures. Acta Crystallogr D 2000, 56: 249–265. 10.1107/S0907444999016364
- Branden CI, Jones TA: Between Objectivity and Subjectivity. Nature 1990, 343(6260):687–689. 10.1038/343687a0
- Wilson MA, Brunger AT: The 1.0 angstrom crystal structure of Ca2+-bound calmodulin: an analysis of disorder and implications for functionally relevant plasticity. J Mol Biol 2000, 301(5):1237–1256.
- Brunger AT: Free R-Value - a Novel Statistical Quantity for Assessing the Accuracy of Crystal-Structures. Nature 1992, 355(6359):472–475. 10.1038/355472a0
- Kleywegt GJ, Jones TA: Homo crystallographicus--quo vadis? Structure 2002, 10(4):465–472. 10.1016/S0969-2126(02)00743-8
- Luzzati V: Traitement statistique des erreurs dans la determination des structures cristallines. Acta Crystallogr 1952, 5: 802–810. 10.1107/S0365110X52002161
- Read RJ: Structure-Factor Probabilities for Related Structures. Acta Crystallogr A 1990, 46: 900–912. 10.1107/S0108767390005529
- Cruickshank DW: Remarks about protein structure precision. Acta Crystallogr D Biol Crystallogr 1999, 55(Pt 3):583–601. 10.1107/S0907444998012645
- EU 3-D Validation Network: Who checks the checkers? Four validation tools applied to eight atomic resolution structures. J Mol Biol 1998, 276(2):417–436. 10.1006/jmbi.1997.1526
- Laskowski RA, MacArthur MW, Thornton JM: Validation of protein models derived from experiment. Curr Opin Struct Biol 1998, 8(5):631–639. 10.1016/S0959-440X(98)80156-5
- Luthy R, Bowie JU, Eisenberg D: Assessment of protein models with three-dimensional profiles. Nature 1992, 356(6364):83–85. 10.1038/356083a0
- Sippl MJ: Recognition of errors in three-dimensional structures of proteins. Proteins 1993, 17(4):355–362. 10.1002/prot.340170404
- Colovos C, Yeates TO: Verification of protein structures: patterns of nonbonded atomic interactions. Protein Sci 1993, 2(9):1511–1519.
- Melo F, Feytmans E: Assessing protein structures with a non-local atomic interaction energy. J Mol Biol 1998, 277(5):1141–1152. 10.1006/jmbi.1998.1665
- Hooft RW, Vriend G, Sander C, Abola EE: Errors in protein structures. Nature 1996, 381(6580):272. 10.1038/381272a0
- Ramachandran GN, Ramakrishnan C, Sasisekharan V: Stereochemistry of polypeptide chain configurations. J Mol Biol 1963, 7: 95–99.
- Kleywegt GJ, Jones TA: Phi/psi-chology: Ramachandran revisited. Structure 1996, 4(12):1395–1400. 10.1016/S0969-2126(96)00147-5
- Laskowski R, MacArthur MW, Moss D, Thornton J: Procheck: A program to check the stereochemical quality of protein structures. J Appl Cryst 1993, 26: 283–291. 10.1107/S0021889892009944
- Hooft RW, Sander C, Vriend G: Objectively judging the quality of a protein structure from a Ramachandran plot. Comput Appl Biosci 1997, 13(4):425–430.
- Oldfield TJ: SQUID: a program for the analysis and display of data from crystallography and molecular dynamics. J Mol Graph 1992, 10(4):247–252. 10.1016/0263-7855(92)80077-Q
- Sims GE, Kim SH: A method for evaluating the structural quality of protein models by using higher-order phi-psi pairs scoring. Proc Natl Acad Sci U S A 2006, 103(12):4428–4432. 10.1073/pnas.0511333103
- Tosatto SC: The Victor/FRST Function for Model Quality Estimation. J Comput Biol 2005, 12(10):1316–1327. 10.1089/cmb.2005.12.1316
- Albiero A, Tosatto SC: Fine-grained statistical torsion angle potentials are effective in discriminating native protein structures. Curr Drug Discov Technol 2006, 3(1):75–81. 10.2174/157016306776637591
- Esposito L, De Simone A, Zagari A, Vitagliano L: Correlation between omega and psi dihedral angles in protein structures. J Mol Biol 2005, 347(3):483–487. 10.1016/j.jmb.2005.01.065
- Bower MJ, Cohen FE, Dunbrack RL Jr.: Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. J Mol Biol 1997, 267(5):1268–1282. 10.1006/jmbi.1997.0926
- Dunbrack RL Jr.: Rotamer libraries in the 21st century. Curr Opin Struct Biol 2002, 12(4):431–440. 10.1016/S0959-440X(02)00344-5
- Shapovalov MV, Dunbrack RL Jr.: Statistical and conformational analysis of the electron density of protein side chains. Proteins 2007, 66(2):279–303. 10.1002/prot.21150
- Pearlman DA, Case DA, Caldwell JW, Ross WS, Cheatham TE, Debolt S, Ferguson D, Seibel G, Kollman P: Amber, a Package of Computer-Programs for Applying Molecular Mechanics, Normal-Mode Analysis, Molecular-Dynamics and Free-Energy Calculations to Simulate the Structural and Energetic Properties of Molecules. Comput Phys Commun 1995, 91(1–3):1–41. 10.1016/0010-4655(95)00041-D
- MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiorkiewicz-Kuczera J, Yin D, Karplus M: All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 1998, 102(18):3586–3616.
- Chikenji G, Fujitsuka Y, Takada S: A reversible fragment assembly method for de novo protein structure prediction. J Chem Phys 2003, 119(13):6895–6903.
- Jones DT, McGuffin LJ: Assembling novel protein folds from super-secondary structural fragments. Proteins 2003, 53 Suppl 6: 480–485. 10.1002/prot.10542
- Rohl CA, Strauss CE, Misura KM, Baker D: Protein structure prediction using Rosetta. Methods Enzymol 2004, 383: 66–93.
- Fang Q, Shortle D: Prediction of protein structure by emphasizing local side-chain/backbone interactions in ensembles of turn fragments. Proteins 2003, 53 Suppl 6: 486–490. 10.1002/prot.10541
- Higo J, Ito N, Kuroda M, Ono S, Nakajima N, Nakamura H: Energy landscape of a peptide consisting of alpha-helix, 3(10)-helix, beta-turn, beta-hairpin, and other disordered conformations. Protein Sci 2001, 10(6):1160–1171. 10.1110/ps.44901
- DePristo MA, de Bakker PI, Lovell SC, Blundell TL: Ab initio construction of polypeptide fragments: efficient generation of accurate, representative ensembles. Proteins 2003, 51(1):41–55. 10.1002/prot.10285
- Tosatto SCE, Bindewald E, Hesser J, Männer R: A divide and conquer approach to fast loop modeling. Protein Engineering 2002, 15(4):279–286. 10.1093/protein/15.4.279
- Fiser A, Do RK, Sali A: Modeling of loops in protein structures. Protein Sci 2000, 9(9):1753–1773.
- Miyazawa S, Jernigan RL: Long- and short-range interactions in native protein structures are consistent/minimally frustrated in sequence space. Proteins 2003, 50(1):35–43. 10.1002/prot.10242
- Sippl MJ: Boltzmann's principle, knowledge-based mean fields and protein folding. An approach to the computational determination of protein structures. J Comput Aided Mol Des 1993, 7(4):473–501. 10.1007/BF02337562
- Shortle D: Composites of local structure propensities: evidence for local encoding of long-range structure. Protein Sci 2002, 11(1):18–26. 10.1110/ps.ps.31002
- Lovell SC, Davis IW, Arendall WB, de Bakker PI, Word JM, Prisant MG, Richardson JS, Richardson DC: Structure validation by Calpha geometry: phi,psi and Cbeta deviation. Proteins 2003, 50(3):437–450. 10.1002/prot.10286
- Chandonia JM, Hon G, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE: The ASTRAL Compendium in 2004. Nucleic Acids Res 2004, 32 Database issue: D189–92. 10.1093/nar/gkh034
- Andreeva A, Howorth D, Brenner SE, Hubbard TJ, Chothia C, Murzin AG: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res 2004, 32 Database issue: D226–9. 10.1093/nar/gkh039
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.