In-silico prediction of disorder content using hybrid sequence representation
© Mizianty et al; licensee BioMed Central Ltd. 2011
Received: 22 October 2010
Accepted: 17 June 2011
Published: 17 June 2011
Intrinsically disordered proteins play important roles in various cellular activities and their prevalence was implicated in a number of human diseases. The knowledge of the content of the intrinsic disorder in proteins is useful for a variety of studies including estimation of the abundance of disorder in protein families, classes, and complete proteomes, and for the analysis of disorder-related protein functions. The above investigations currently utilize the disorder content derived from the per-residue disorder predictions. We show that these predictions may over- or under-predict the overall amount of disorder, which motivates development of novel tools for direct and accurate sequence-based prediction of the disorder content.
We hypothesize that sequence-level aggregation of input information may provide more accurate content prediction when compared with the content extracted from the local window-based residue-level disorder predictors. We propose a novel predictor, DisCon, that takes advantage of a small set of 29 custom-designed descriptors that aggregate and hybridize information concerning sequence, evolutionary profiles, and predicted secondary structure, solvent accessibility, flexibility, and annotation of globular domains. Using these descriptors and a ridge regression model, DisCon predicts the content with low, 0.05, mean squared error and high, 0.68, Pearson correlation. This is a statistically significant improvement over the content computed from outputs of ten modern disorder predictors on a test dataset with proteins that share low sequence identity with the training sequences. The proposed predictive model is analyzed to discuss factors related to the prediction of the disorder content.
DisCon is a high-quality alternative for high-throughput annotation of the disorder content. We also empirically demonstrate that DisCon's predictions can be used to improve binary annotations of the disordered residues from the real-value disorder propensities generated by current residue-level disorder predictors. The web server that implements DisCon is available at http://biomine.ece.ualberta.ca/DisCon/.
The intrinsically disordered proteins (IDPs), also referred to as natively unfolded or intrinsically unstructured proteins, lack stable tertiary structure in vitro. These proteins are implicated in numerous processes including cellular signal transduction, transcriptional regulation, and translation [1], and their prevalence was demonstrated in several human diseases [2, 3], including cancer [4], cardiovascular disease [5], neurodegenerative diseases [6, 7], genetic diseases [8], and amyloidoses [9]. At the same time, the annotations of the IDPs are accumulated at a relatively low pace when compared with the growth of the number of known, non-redundant protein sequences. Over the last decade numerous sequence-derived characteristics, including low complexity [10], which was proposed in [11], high net charge and low content of hydrophobic amino acids [12, 13], and lack of regular secondary structure [14], to name just a few, were found to differentiate between disordered and ordered regions. These results suggest that disorder can be predicted from the sequence, and they motivate the development of computational models for the prediction of the disordered regions. Several such predictors were already developed; see [15] for a recent review. The majority of the existing predictors generate the disorder predictions for each residue in the input protein chain.
These per-residue predictors can be divided into four categories: i) methods that utilize the relative propensity of amino acids to form disordered/ordered regions, which include GlobPlot [16], IUPred [17], FoldIndex [18], and Ucon [19]; ii) methods that are based on classifiers generated with the help of machine learning algorithms, such as DISpro, DISOPRED, DISOPRED2, PrDOS, POODLE predictors [24, 25], PONDR predictors [10, 26, 27], Spritz, PROFbval, DisPSSMP, DisPSSMP2, IUP, NORSnet, and OnD-CRFs; iii) meta-approach methods that are based on a consensus of multiple base predictors, including MULTICOM (also called PreDisorder) [35, 36], metaPrDOS, and the recent MD, MFDp, and PONDR-FIT predictors; and iv) approaches that find disordered residues through an analysis of the predicted 3D structural models, such as PrDOS and DISOclust. There are also methods that predict the propensity of the entire protein chain to be unstructured [13, 42–44]. One of these approaches is based on the charge-hydropathy plots and another utilizes distributions of the predicted per-residue disorder scores [42–44]. The abovementioned per-residue and per-chain methods perform the predictions in a high-throughput manner and consequently they can be used as a possible solution to close the annotation gap.
Table 1: Comparison of predictive quality of the DisCon and the disorder content extracted from the predictions of the 10 considered modern disorder predictors on the test dataset. Column groups: evaluation of the predicted disorder content (including % of chains); evaluation of the predicted disorder at the residue level.
The overall disorder content was used in the past to estimate the abundance of intrinsic disorder in several protein databases [45, 46], in various protein families and classes [47–58], and in complete proteomes [59–62]. High values of the disorder content were reported for several disease-related proteins [2–9]. The content was also used for the analysis of intrinsic disorder-related protein functions [63–65]. Importantly, in all these and similar cases, the disorder content was either evaluated based on the results of binary classifiers or derived from the per-residue disorder predictions. As mentioned above, these per-residue disorder prediction methods may over- or under-predict the overall amount of disorder in the sequence. This observation, together with the fact that knowledge of the disorder content in a given protein, in a set of proteins of interest, or in an entire proteome can be used to investigate numerous important hypotheses, motivates the development of new computational tools for the accurate prediction of the disorder content.
Table 2: List of input information sources used by the disorder predictors considered in this work. Columns: data sources for the training/benchmark dataset(s); secondary structure prediction; solvent accessibility prediction. Listed data sources: PDB x-ray structures; PDB x-ray structures + curated chains; DisProt + PDB x-ray structures; CASP7 + DisProt. Other listed inputs: predicted protein-protein interfaces and predicted domains; predicted residue contacts; predicted 3D structure.
We empirically demonstrate that DisCon's predictions are more accurate than the content extracted from the residue-level annotations generated by modern disorder predictors, including methods listed in Table 2. One of the potential applications of the predicted disorder content is to adjust the cut-offs used by the disorder predictors to annotate the disordered residues. We show that these annotations can be improved when the threshold value is adjusted for each chain such that the amount of the predicted disordered amino acids matches not only the native but also the predicted content.
Definition of Disorder
In the past CASP experiments the disordered residues were defined as the amino acids that lack coordinates in their crystal structures and, in the case of the structures solved by NMR, as the amino acids that exhibit high variability within an ensemble or that were annotated by experimentalists as disordered in the REMARK 465 [74, 75]. Another commonly used source for the disorder annotations is based on the experimentally-validated and biologically relevant disordered segments from the DisProt database. We note that the assignment of the disordered regions using different experimental methods was previously shown to be potentially inconsistent. Consequently, the disorder predictors that were developed using annotations provided with one method could lead to larger errors when used to predict annotations generated with the help of other methods [19, 33]. Therefore, we created a dataset that combines the CASP-defined annotations with the DisProt annotations.
The proposed method was designed and tested using a dataset that was developed to validate a recent meta-predictor of disordered residues, the MFDp. The protein chains were collected from the Protein Data Bank (PDB) and the DisProt databases. The culled PDB list was used to derive a high-quality and low sequence identity subset of the PDB proteins. More specifically, only the proteins for which the structure is characterized by R-factor < 0.2 and resolution < 2.0 Å, and that are characterized by sequence identity < 25% were kept. We randomly selected 20% of the fully structured proteins among the resulting chains. This is motivated by the fact that many of the chains selected using the culled PDB list are annotated as ordered while a recent study shows that completely ordered proteins are not highly abundant in PDB. The PDB chains were combined with all 523 proteins from the release 4.9 of the DisProt. The resulting dataset was filtered to reduce the pairwise sequence identity to below 25% by removing the similar sequences with fewer disordered residues. Among the remaining 514 chains we removed four for which MD failed to produce predictions; this also resulted in lack of predictions from Ucon, PROFbval and NORSnet that are bundled with the MD predictions. Moreover, we improved the annotations of the DisProt chains using a previously described procedure: we applied the approach based on the SL dataset, which combines the disorder annotations from the DisProt with the annotations of disorder and order based on the corresponding structural domains that can be found in PDB. We note that in contrast to the SL dataset that is based on the release 4.5 of DisProt, our annotations are based on the newer release 4.9. Finally, we also removed the His-tags that are introduced to ease the crystallization. The resulting dataset includes 305 chains from DisProt and 205 from PDB. This dataset was divided at random into two subsets, the training dataset with 310 chains and the test dataset with the remaining 200 chains.
We note that although there is some overlap between the training and test sequences (depending on the alignment tool used), they are mostly independent at the 25% similarity level. The training dataset was used to develop the predictor including selection of the input features and the parameterization of the prediction model, which were performed based on the 5-fold cross validation protocol. Next, our predictor that was computed using the training dataset was compared with the existing per-residue prediction methods using the test dataset. The training and test datasets are available at http://biomine.ece.ualberta.ca/DisCon/.
where y_i ∈ Y is the native and x_i ∈ X is the predicted disorder content for the ith protein chain, avg X and avg Y are the sample means of X and Y, and s_x and s_y are the sample standard deviations of X and Y.
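Assuming the standard definitions of the three content-level measures (MSE, MAE, and PCC), and writing n for the number of evaluated chains (a symbol introduced here for illustration), the quantities described by the symbols above can be written as:

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - y_i\right|, \qquad
\mathrm{PCC} = \frac{1}{n-1}\sum_{i=1}^{n}
  \left(\frac{x_i - \operatorname{avg}X}{s_x}\right)
  \left(\frac{y_i - \operatorname{avg}Y}{s_y}\right)
```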
where TP is the number of true positives (correctly predicted disordered residues), FP denotes false positives (structured residues that were predicted as disordered), TN denotes true negatives (correctly predicted structured residues), and FN denotes false negatives (disordered residues that were predicted as structured). Accuracy quantifies the overall success rate, i.e., the fraction of correct predictions among all predictions, but since it may be misleading when the dataset is unbalanced (which is the case here since the majority of residues are structured) we also use MCC. The MCC values range between -1 and 1 and they equal zero when all residues are predicted to be structured or to be disordered. Higher values of PCC, accuracy and MCC and lower values of MSE and MAE correspond to better predictions. We also evaluated the real-value, per-residue disorder predictions based on the area under the ROC curve (AUC) measure.
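The evaluation measures described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard textbook definitions of MSE, MAE, PCC, accuracy, and MCC; the function names `content_metrics` and `residue_metrics` are hypothetical and are not part of DisCon. Following the convention stated in the text, MCC is set to zero when its denominator vanishes (for instance, when all residues are predicted to belong to one class).

```python
import math


def content_metrics(native, predicted):
    """Return (MSE, MAE, PCC) for per-chain native vs. predicted content."""
    n = len(native)
    mse = sum((x - y) ** 2 for x, y in zip(predicted, native)) / n
    mae = sum(abs(x - y) for x, y in zip(predicted, native)) / n
    avg_x = sum(predicted) / n
    avg_y = sum(native) / n
    # Sample standard deviations (n - 1 in the denominator).
    s_x = math.sqrt(sum((x - avg_x) ** 2 for x in predicted) / (n - 1))
    s_y = math.sqrt(sum((y - avg_y) ** 2 for y in native) / (n - 1))
    cov = sum((x - avg_x) * (y - avg_y) for x, y in zip(predicted, native))
    pcc = cov / ((n - 1) * s_x * s_y)
    return mse, mae, pcc


def residue_metrics(tp, fp, tn, fn):
    """Return (accuracy, MCC) from per-residue confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC = 0 when the denominator is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc
```

For example, `residue_metrics(0, 0, 8, 2)` describes a chain in which every residue is predicted as structured: the accuracy is 0.8 although no disordered residue is found, while the MCC is 0, which illustrates why accuracy alone can mislead on unbalanced data.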
Overview of the proposed predictor
Feature-based encoding of the input protein sequence
The input sequence is processed to generate predictions of the 3-state SS, RSA, normalized real-value B-factors, binary annotation of the residue flexibility as provided by PROFbval in two modes, the strict and the non-strict, binary annotation of residues that form globular domains, and sequence profiles encoded using PSSM and WOP. We normalized the ASA values predicted by Real-SPINE3 using previously published maximal ASA values and we preprocessed the 3-state SS by converting the predicted helices that had < 3 residues into coils. We also binarized the real-value RSA to annotate the residues as either solvent exposed when RSA > 0.25 or buried when RSA ≤ 0.25; this cut-off value was used in past studies [72, 84]. We also attempted to use signal peptide prediction provided by RPSP, but these features were removed during the feature selection. Detailed description of features is provided in Table A1 in the Additional File 1. We generated a total of 614 features that are based on:
- composition of amino acids
- length and relative location of predicted helix, strand and coil segments
- composition of solvent exposed residues
- composition of flexible residues and sequence segments composed of flexible residues
- number and size of sequence segments that correspond to predicted globular domains
- composition of residues predicted as signal peptides
- fusion of the information coming from multiple predictions, including SS states, solvent exposure, flexibility, and domain annotations. We consider all combinations of two, three and four of the above predictions.
- aggregations of the sequence profiles using entropy and relative (using background probability) entropy by both rows and columns of the PSSM and WOP
- entropy-based aggregations of the sequence profiles encoded with PSSM and WOP, which are performed for specific amino acid types, and for residues characterized by specific SS state, solvent exposure, flexibility, and domain annotations.
We emphasize that most of the features, in particular the features that are based on the secondary structure segments, flexible sequence segments, and that combine multiple predicted properties, are novel and unique to this work.
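Some of the preprocessing and aggregation steps described above can be sketched as follows. This is a minimal illustration, not the DisCon implementation: the function names are hypothetical, and the one-letter SS alphabet ('H' helix, 'E' strand, 'C' coil) is an assumption about how the predicted secondary structure is encoded.

```python
def short_helices_to_coil(ss, min_len=3):
    """Convert predicted helix runs shorter than min_len residues into coil."""
    out = list(ss)
    i = 0
    while i < len(ss):
        if ss[i] == "H":
            j = i
            while j < len(ss) and ss[j] == "H":
                j += 1
            if j - i < min_len:
                out[i:j] = "C" * (j - i)  # overwrite the short helix run
            i = j
        else:
            i += 1
    return "".join(out)


def binarize_rsa(rsa, cutoff=0.25):
    """True = solvent exposed (RSA > cutoff), False = buried (RSA <= cutoff)."""
    return [value > cutoff for value in rsa]


def composition(sequence, alphabet):
    """Sequence-level fraction of each symbol of the alphabet."""
    if not sequence:
        return {symbol: 0.0 for symbol in alphabet}
    return {symbol: sequence.count(symbol) / len(sequence) for symbol in alphabet}


def exposed_fraction(rsa, cutoff=0.25):
    """Sequence-level composition of solvent exposed residues."""
    flags = binarize_rsa(rsa, cutoff)
    return sum(flags) / len(flags) if flags else 0.0
```

Each function turns a per-residue annotation into either a cleaned annotation or a single per-chain number, which mirrors the sequence-level aggregation that the features listed above rely on.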
Design of the predictive model
The features were generated to comprehensively cover information that can be extracted from each predicted property, sequence and sequence profile, and their combinations. Consequently, some of these inputs may not be relevant to the prediction of the disorder content and some could be redundant with each other. We performed two-step feature selection to find a small set of non-redundant and relevant features; the second step also includes computation and parameterization of the predictive model. First, we remove the irrelevant and redundant features using a coarse-grained evaluation based on correlation, and next we perform a wrapper-based selection using the remaining features.
In the first step, for each feature we compute its average PCC with the disorder content (the PCC values are based on 5-fold cross validation on the training dataset and they are averages of the coefficients computed in the five training folds) and we remove the features with average absolute PCC value < 0.2. We selected the 0.2 cut-off as this value corresponds to a visible dip in the distribution of the correlation values, see Figure A1 in the Additional File 1. Next, we filtered the remaining 322 features to remove redundancy by assuring that the maximal average absolute PCC value between any pair of these features is < 0.9. Starting with the feature that has the highest average absolute PCC with the native content, we added another feature into the set of filtered features if the average (over the five training folds) absolute PCC between this feature and each feature which is already in the set of filtered features was < 0.9.
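The two filters of this first step can be sketched as shown below. This is a simplified illustration with hypothetical names: it computes each PCC once on the supplied data, whereas the procedure described above averages the coefficients over the five training folds.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def filter_features(features, target, relevance_cutoff=0.2, redundancy_cutoff=0.9):
    """features: dict name -> list of values (one per chain); target: native content."""
    relevance = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    # Relevance filter: drop features with absolute PCC below the cut-off,
    # then rank the survivors from the most to the least correlated.
    ranked = sorted(
        (name for name in features if relevance[name] >= relevance_cutoff),
        key=lambda name: relevance[name],
        reverse=True,
    )
    # Redundancy filter: keep a feature only if its absolute PCC with every
    # feature that was already kept is below the redundancy cut-off.
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < redundancy_cutoff for k in kept):
            kept.append(name)
    return kept
```

Because the candidates are visited in order of decreasing relevance, whenever two features are nearly collinear the one that correlates better with the native content is the one retained.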
In the second step, we use the remaining 152 features to perform wrapper-based selection in which a subset of features that results in favourable performance in prediction of the disorder content is retained. We consider two types of predictors, ridge regression and Support Vector Regression (SVR). The selection of the regression model is motivated by its successful application in several related areas, including evaluation of peptide identification and prediction of folding rates [88, 89], solvent accessibility, secondary structure content, and affinity of protein-ligand complexes, to name a few. Similarly, the SVR also enjoys a wide range of relevant applications including prediction of B-factors, solvent accessibility, and half-sphere exposure. The values of the regression coefficients and the SVR models were estimated from the data in the training folds using the WEKA workbench. We consider three types of kernel functions to build SVR models: polynomial, Radial Basis Function (RBF), and Pearson VII function-based Universal Kernel (PUK). We parameterized each kernel and the complexity constant C by performing grid search. We use linear and quadratic polynomials with C equal to 2^x where x = -8, -7,..., 2; the RBF kernel with gamma (spread) equal to 2^y where y = -11, -10,..., 2, and C equal to 2^x where x = -3, -2,..., 6; and the PUK kernel with omega equal to 2^z where z = -4, -3,..., 1, and C equal to 2^x where x = -4, -3,..., 5. We also parameterized the ridge parameter in the ridge regression; we considered ridge values equal to 10^w where w = -11, -9,..., 2. We first parameterized these 4 predictors (3 SVR types + 1 ridge regression) using a representative subset of the 152 features. We selected one feature with the highest average absolute PCC from each of the feature groups defined in Table A1 in the Additional File 1. The representative subset includes 23 features since that number of groups was covered among the 152 features.
Next, these parameterized predictors were used to perform feature selection in which we searched for a subset of features that results in the best MSE value. We performed forward and backward best first searches. The forward best first search starts with the empty set of features and adds one feature at a time if its inclusion decreases the MSE value; the backward search starts with the entire set (152 features) and removes one feature at a time if its removal decreases the MSE value. The search stops when the entire list of features is scanned. As a result, we obtained 8 configurations of 4 predictors with 2 search types. The predictors in each configuration were parameterized using the grid search as described above. The parameterizations and all steps of the feature selection were executed based on multiple repetitions of 5-fold cross validations on the training dataset, and they aimed to minimize the average MSE score between the predicted and the native disorder content. We repeated the cross validations for up to five times using randomized division into the 5 folds for as long as the coefficient of variation (the ratio of the standard deviation to the mean) was below 0.02; this approach should assure a robust estimate of the MSE values. The parameters of the four predictors and the corresponding number of the selected features are given in Table A2 in the Additional File 1. The predictive performance, which was evaluated based on 5-fold cross validation on the training dataset, for the eight configurations is summarized in Table A3 in the Additional File 1. The best performance, in terms of the MSE and PCC values, is achieved with the ridge regression that uses 29 features selected using the forward best first search, and this configuration is used to implement the proposed DisCon predictor.
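The single-pass forward and backward searches can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `evaluate_mse` is a hypothetical callback that stands in for the cross-validated MSE of an already parameterized predictor (ridge regression or SVR) trained on the given feature subset, and the forward search treats the empty feature set as having infinite error, so the first scanned feature is always accepted.

```python
def forward_select(candidates, evaluate_mse):
    """Scan the candidates once; keep a feature if adding it lowers the MSE."""
    selected = []
    best = float("inf")  # assumption: the empty set has no defined MSE
    for feature in candidates:
        mse = evaluate_mse(selected + [feature])
        if mse < best:
            selected.append(feature)
            best = mse
    return selected, best


def backward_select(candidates, evaluate_mse):
    """Start from all candidates; drop a feature if removing it lowers the MSE."""
    selected = list(candidates)
    best = evaluate_mse(selected)
    for feature in candidates:
        trial = [f for f in selected if f != feature]
        if not trial:
            continue  # never evaluate an empty feature set
        mse = evaluate_mse(trial)
        if mse < best:
            selected, best = trial, mse
    return selected, best
```

Both functions call the evaluation callback once per scanned feature, so in this sketch the cost of the wrapper is dominated by the cross validation hidden inside `evaluate_mse`.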
Results and Discussion
Disorder content prediction
We compare the performance of the DisCon with the results obtained using the disorder content computed from the disorder predictions generated by DISOPRED2, IUPred (both versions, IUPredL and IUPredS), PROFbval, NORSnet, Ucon, DISOclust, MD, PONDR-FIT, and MFDp methods. For the per-residue predictors we used the web servers or standalone implementations provided by the authors, and we calculated the content by counting the number of residues predicted as disordered and dividing it by the length of the corresponding protein chain. The results are computed on the test dataset with 200 chains, which share low identity with the chains in our training dataset. We note that the methods we compare with use training datasets that may share higher similarity with the chains in our dataset, which could inflate their predictive quality. We also analyze statistical significance of the differences between the content predicted by DisCon and the other methods. We compare the per-chain values of the absolute errors and the squared errors over the 200 chains in the test dataset and the Pearson correlation coefficients computed for 200 randomly selected sets of 100 proteins from the test dataset. Since the measurements follow a normal distribution (evaluated using the Shapiro-Wilk test at 0.05 significance) we apply the paired t-test and we measure significance of the differences at 0.05 and 0.01 levels. We evaluate the extent of the over- and under-prediction of the disorder content by quantifying the number of the over- and under-predicted chains and the corresponding MAE values, and we also provide the AUC, accuracy, and MCC values for the 10 considered per-residue predictors. The results are summarized in Table 1.
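The content calculation and the over- and under-prediction summary described above can be sketched as follows. The function names and the dictionary keys are hypothetical and chosen only for illustration; binary per-residue predictions are assumed to be encoded as 1 (disordered) and 0 (structured).

```python
def disorder_content(binary_prediction):
    """Fraction of residues predicted as disordered in one chain."""
    return sum(binary_prediction) / len(binary_prediction)


def over_under_summary(native, predicted):
    """Fractions of over-/under-predicted chains and the MAE within each group."""
    over = [p - n for n, p in zip(native, predicted) if p > n]
    under = [n - p for n, p in zip(native, predicted) if p < n]
    total = len(native)
    return {
        "fraction_over": len(over) / total,
        "mae_over": sum(over) / len(over) if over else 0.0,
        "fraction_under": len(under) / total,
        "mae_under": sum(under) / len(under) if under else 0.0,
    }
```

For instance, `disorder_content([1, 0, 0, 1])` gives 0.5, i.e., two of the four residues are predicted as disordered.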
DisCon is shown to provide favourable predictive performance. It obtains MSE equal to 0.05 and PCC equal to 0.68 on the test dataset. We note that these results are consistent with the results obtained based on the 5-fold cross validation on the training dataset (PCC = 0.70, MSE = 0.05; see Table A3 in the Additional File 1). On the test set, the best performing per-residue disorder predictors are worse than DisCon by 0.016 in MSE and 0.07 in PCC for the disorder content prediction. The average absolute error of DisCon equals 0.156, compared with values at or above 0.167 obtained with the current disorder predictors, except for IUPredS for which MAE = 0.155. The improvements in MSE and PCC offered by DisCon are shown to be statistically significant when compared with all considered competitors. The MAE values computed from our predictions are significantly better than the errors based on the predictions with four existing methods and are equivalent to those of the remaining six predictors. Further analysis reveals that the quality of the DisCon predictions is better for longer chains, while some other methods may produce favourable predictions for short chains. Figure A2 in the Additional File 1, which shows the relation between the chain length and the absolute errors generated by the DisCon and the top-three methods from Table 1, i.e., IUPredS, Ucon, and PONDR-FIT, demonstrates that the proposed predictor is characterized by smaller absolute errors for longer chains, while the other three predictors on average provide more accurate predictions for short chains. DisCon provides relatively balanced predictions with a similar number of over- and under-predicted chains and low MAE values for these two types of errors. We observe that PROFbval and DISOclust are characterized by substantial levels of the over-prediction of the disorder content that are expressed by the large number of the over-predicted chains and/or high MAE for the over-predicted chains. The under-prediction of the disorder content is characteristic for the NORSnet method. Table 1 also shows that Ucon, which obtains relatively low MSE and mid-range PCC, is characterized by lower quality of the per-residue predictions with MCC = 0.28.
Binary prediction of the disorder amount
Content guided thresholding of the real-value disorder prediction
The thresholding of the predicted real-value disorder using the content predicted by DisCon leads to improvements in both MCC and accuracy for all predictors except for the MD, in which case the accuracy is slightly improved but the MCC is lower. The average (across all methods) improvements in MCC and accuracy equal 0.03 and 0.05, respectively. When we use the combination of the content predicted with DisCon and MD the improvements are more substantial and they range between 0.01 and 0.14 for the MCC (on average 0.06) and 0.01 and 0.24 for the accuracy (on average 0.05); the best MCC is obtained using the predictions from the MFDp and it equals 0.45 when compared with 0.425 that was obtained without the content-based adjustment. Interestingly, using this cut-off adjustment the MCC values obtained by seven out of ten considered predictors are > 0.4 while originally (with the default cut-offs), see Table 1, only two methods have MCC > 0.4. This suggests that the majority of the considered disorder predictors differentiate between structured and disordered residues based on their real-value propensities in a given chain with relatively similar quality, but only a few of them can accurately scale the range of the real-value propensities between sequences. The content-guided selection of the cut-offs alleviates the prediction bias, i.e., the tendency to under- or over-predict the amount of disorder. The binary predictions of PROFbval, DISOclust, and NORSnet that are originally characterized by relatively low MCC values and a bias towards either over- or under-prediction, see Table 1, are shown to improve by a wide margin when using the disorder content predicted by DisCon or by the combination of DisCon and MD. We observe that the relatively poor performance of the Ucon method does not stem from the prediction bias but rather from its overall problems with the quality of the residue-level annotations, as evidenced by the relatively low AUC and MCC in Table 1, which is in contrast to the sequence-level amount of disorder that is predicted quite accurately by this method.
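The content-guided thresholding discussed above can be illustrated with a short sketch. It assumes that the per-chain cut-off is chosen so that the number of residues annotated as disordered matches the predicted content as closely as possible, which is equivalent to labelling the residues with the highest propensities; the function name is hypothetical and the handling of ties (earlier residues first) is an arbitrary choice made here.

```python
def content_guided_labels(propensities, predicted_content):
    """Label the top-scoring residues as disordered (1) to match the content."""
    n = len(propensities)
    # Number of residues to annotate as disordered, clamped to [0, n].
    k = max(0, min(n, round(predicted_content * n)))
    # Residue indices sorted from the highest to the lowest propensity.
    order = sorted(range(n), key=lambda i: propensities[i], reverse=True)
    labels = [0] * n
    for i in order[:k]:
        labels[i] = 1
    return labels
```

For example, with propensities `[0.9, 0.1, 0.4, 0.8, 0.2]` and a predicted content of 0.4, the two highest-scoring residues are labelled as disordered, regardless of where the predictor's default cut-off would fall.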
We conclude that although the predictions shown in the two case studies should not be assumed typical, they demonstrate that the content predicted with DisCon offers valuable assistance in selection of the cut-offs to annotate the disordered residues based on the real-value predictions from modern disorder predictors.
Factors related to the amount of disorder/order
The input features also highlight the importance of the relation between sequence conservation and the amount of the disorder, i.e., 13 out of the 29 features utilize entropy computed from the PSSM or WOP profiles. For instance, the EntAvePSSM feature, which computes the entropy of the average PSSM scores for each column (amino acid type) in the matrix along the sequence, has PCC = -0.5. This means that well-structured proteins are characterized on average by a stronger degree of sequence conservation when compared with the disordered proteins. Our observation is in agreement with the results of a previous study, in which the evolution rates of ordered and intrinsically disordered regions were compared using the pairwise genetic distances between the ordered and the disordered regions of 26 protein families having at least one member with a structurally characterized region of disorder of 30 or more consecutive residues. This study demonstrated that the disordered regions evolved significantly more rapidly than the ordered regions in 19 of the 26 families studied.
In spite of the fact that the quality of the high-throughput disorder prediction continues to improve, researchers recognize that new and more accurate predictors are still needed [38, 39]. We address the shortage of accurate methods that predict the overall amount of disorder in a given protein chain, which is motivated by the fact that current disorder predictors tend to provide relatively inaccurate estimates of the disorder content. We propose a novel approach, called DisCon, which combines information derived from sequence, sequence profiles, and predicted secondary structure, solvent accessibility, flexibility, and annotation of globular domains. We custom-designed a feature-based representation of the input protein chain that aggregates and combines these inputs and we performed feature selection that found a small set of 29 complementary features that are well correlated with the native disorder. Using these features and a ridge regression-based model, DisCon predicts the disorder content with low, 0.05, mean squared error and high, 0.68, correlation, as evaluated on an independent test dataset. These predictions are empirically shown to be significantly better than the disorder content estimates derived from outputs of ten modern disorder predictors. DisCon's predictions provide a high-quality alternative for high-throughput annotation of the disorder content. They are also shown to provide useful input to improve binary annotations of the disordered residues from the real-value disorder propensities generated by current disorder prediction methods.
List of abbreviations
AUC: Area Under the ROC Curve
ASA: Absolute Solvent Accessibility
CASP: Critical Assessment of Techniques for Protein Structure Prediction
CDF: Cumulative Distribution Function
DisCon: Disorder Content predictor
IDP: Intrinsically Disordered Protein
MAE: Mean Absolute Error
MCC: Matthews Correlation Coefficient
MSE: Mean Squared Error
NMR: Nuclear Magnetic Resonance
PCC: Pearson Correlation Coefficient
PDB: Protein Data Bank
PSSM: Position Specific Scoring Matrix
PUK: Pearson VII function-based Universal Kernel
RBF: Radial Basis Function
RSA: Relative Solvent Accessibility
SVR: Support Vector Regression
WOP: Weighted Observed Percentage.
This work was sponsored in part by the Discovery grant from NSERC Canada to LK, the National Institutes of Health grant (R01 GM 085003) to YZ, and the Killam Memorial Scholarship to MJM. The funding agencies did not participate in the design, collection, analysis, and interpretation of the data.
- Dunker AK, Oldfield CJ, Meng J, Romero P, Yang JY, Chen JW, Vacic V, Obradovic Z, Uversky V: The unfoldomics decade: an update on intrinsically disordered proteins. BMC Genomics 2008, 9(Suppl 2):S1. 10.1186/1471-2164-9-S2-S1
- Uversky VN, Oldfield CJ, Midic U, Xie H, Vucetic S, Xue B, Iakoucheva LM, Obradovic Z, Dunker AK: Unfoldomics of human diseases: Linking protein intrinsic disorder with diseases. BMC Genomics 2009, 10(Suppl 1):S7. 10.1186/1471-2164-10-S1-S7
- Uversky VN, Oldfield CJ, Dunker AK: Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys 2008, 37: 215–246. 10.1146/annurev.biophys.37.032807.125924
- Iakoucheva LM, Brown CJ, Lawson JD, Obradovic Z, Dunker AK: Intrinsic disorder in cell-signaling and cancer-associated proteins. J Mol Biol 2002, 323: 573–584. 10.1016/S0022-2836(02)00969-5
- Cheng Y, LeGall T, Oldfield CJ, Dunker AK, Uversky VN: Abundance of intrinsic disorder in protein associated with cardiovascular disease. Biochemistry 2006, 45: 10448–10460. 10.1021/bi060981d
- Raychaudhuri S, Dey S, Bhattacharyya NP, Mukhopadhyay D: The role of intrinsically unstructured proteins in neurodegenerative diseases. PLoS One 2009, 4(5):e5566. 10.1371/journal.pone.0005566
- Uversky VN: Intrinsic disorder in proteins associated with neurodegenerative diseases. Front Biosci 2009, 14: 5188–5238. 10.2741/3594
- Midic U, Oldfield CJ, Dunker AK, Obradovic Z, Uversky VN: Protein disorder in the human diseasome: Unfoldomics of human genetic diseases. BMC Genomics 2009, 10(Suppl 1):S12. 10.1186/1471-2164-10-S1-S12
- Uversky VN: Amyloidogenesis of natively unfolded proteins. Curr Alzheimer Res 2008, 5(3):260–287. 10.2174/156720508784533312
- Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK: Sequence complexity of disordered protein. Proteins 2001, 42: 38–48. 10.1002/1097-0134(20010101)42:1<38::AID-PROT50>3.0.CO;2-3
- Wootton JC, Federhen S: Statistics of local complexity in amino acid sequences and sequence databases. Comput Chem 1993, 17: 149–163. 10.1016/0097-8485(93)85006-X
- Dyson HJ, Wright PE: Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol 2005, 6: 197–208. 10.1038/nrm1589
- Uversky VN, Gillespie JR, Fink AL: Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins 2000, 41: 415–427. 10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7
- Liu J, Tan H, Rost B: Loopy proteins appear conserved in evolution. J Mol Biol 2002, 322: 53–64. 10.1016/S0022-2836(02)00736-2
- He B, Wang K, Liu YL, Xue B, Uversky VN, Dunker AK: Predicting intrinsic disorder in proteins: An overview. Cell Research 2009, 19(8):929–949. 10.1038/cr.2009.87
- Linding R, Russell RB, Neduva V, Gibson TJ: GlobPlot: Exploring protein sequences for globularity and disorder. Nucleic Acids Res 2003, 31: 3701–3708. 10.1093/nar/gkg519
- Dosztányi Z, Csizmok V, Tompa P, Simon I: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 2005, 21: 3433–3434. 10.1093/bioinformatics/bti541
- Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, Silman I, Sussman JL: FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 2005, 21: 3435–3438. 10.1093/bioinformatics/bti537
- Schlessinger A, Punta M, Rost B: Natively unstructured regions in proteins identified from contact predictions. Bioinformatics 2007, 23: 2376–2384. 10.1093/bioinformatics/btm349
- Hecker J, Yang JY, Cheng J: Protein disorder prediction at multiple levels of sensitivity and specificity. BMC Genomics 2008, 9(Suppl 1):S9. 10.1186/1471-2164-9-S1-S9PubMed CentralView ArticlePubMedGoogle Scholar
- Jones DT, Ward JJ: Prediction of disordered regions in proteins from position specific score matrices. Proteins 2003, 53(Suppl 6):573–578.View ArticlePubMedGoogle Scholar
- Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT: The DISOPRED server for the prediction of protein disorder. Bioinformatics 2004, 20: 2138–2139. 10.1093/bioinformatics/bth195View ArticlePubMedGoogle Scholar
- Ishida T, Kinoshita K: PrDOS: prediction of disordered protein regions from amino acid sequence. Nucleic Acids Res 2007, 35: W460–464. 10.1093/nar/gkm363PubMed CentralView ArticlePubMedGoogle Scholar
- Shimizu K, Muraoka Y, Hirose S, Tomii K, Noguchi T: Predicting mostly disordered proteins by using structure-unknown protein data. BMC Bioinformatics 2007, 8: 78. 10.1186/1471-2105-8-78PubMed CentralView ArticlePubMedGoogle Scholar
- Hirose S, Shimizu K, Kanai S, Kuroda Y, Noguchi T: POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions. Bioinformatics 2007, 23: 2046–2053. 10.1093/bioinformatics/btm302View ArticlePubMedGoogle Scholar
- Peng K, Vucetic S, Radivojac P, Brown CJ, Dunker AK, Obradovic Z: Optimizing long intrinsic disorder predictors with protein evolutionary information. J. Bioinform. Comput. Biol 2005, 3: 35–60. 10.1142/S0219720005000886View ArticlePubMedGoogle Scholar
- Peng K, Radivojac P, Vucetic S, Dunker AK, Obradovic Z: Length-dependent prediction of protein intrinsic disorder. BMC Bioinformatics 2006, 7: 208. 10.1186/1471-2105-7-208PubMed CentralView ArticlePubMedGoogle Scholar
- Vullo A, Bortolami O, Pollastri G, Tosatto SC: Spritz: a server for the predic-tion of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Res 2006, 34: W164–168. 10.1093/nar/gkl166PubMed CentralView ArticlePubMedGoogle Scholar
- Schlessinger A, Yachdav G, Rost B: PROFbval: predict flexible and rigid residues in proteins. Bioinformatics 2006, 22: 891–893. 10.1093/bioinformatics/btl032View ArticlePubMedGoogle Scholar
- Su CT, Chen CY, Ou YY: Protein disorder prediction by condensed PSSM considering propensity for order or disorder. BMC Bioinformatics 2006, 7: 319. 10.1186/1471-2105-7-319PubMed CentralView ArticlePubMedGoogle Scholar
- Su CT, Chen CY, Hsu CM: iPDA: integrated protein disorder analyzer. Nucleic Acids Res 2007, 35: 465–472. 10.1093/nar/gkm353View ArticleGoogle Scholar
- Yang MQ, Yang JY: IUP: intrinsically unstructured protein predictor-a software tool for analyzing polypeptide sequences. Sixth IEEE Symposium on BioInformatics and BioEngineering: 16–18 October 2006; Arlington, Virginia, USA 2006, 3–11.View ArticleGoogle Scholar
- Schlessinger A, Liu J, Rost B: Natively unstructured loops differ from other loops. PLoS Comput Biol 2007, 3: e140. 10.1371/journal.pcbi.0030140PubMed CentralView ArticlePubMedGoogle Scholar
- Wang L, Sauer UH: OnD-CRF: predicting order and disorder in proteins using conditional random fields. Bioinformatics 2008, 24: 1401–1402. 10.1093/bioinformatics/btn132PubMed CentralView ArticlePubMedGoogle Scholar
- Cheng J, Sweredoski M, Baldi P: Accurate prediction of protein disordered regions by mining protein structure data. Data Mining Knowl Disc 2005, 11: 213–222. 10.1007/s10618-005-0001-yView ArticleGoogle Scholar
- Deng X, Eickholt J, Cheng J: PreDisorder: Ab initio sequence-based prediction of protein disordered regions. BMC Bioinformatics 2009, 10: 436. 10.1186/1471-2105-10-436PubMed CentralView ArticlePubMedGoogle Scholar
- Ishida T, Kinoshita K: Prediction of disordered regions in proteins based on the meta approach. Bioinformatics 2008, 24: 1344–1348. 10.1093/bioinformatics/btn195View ArticlePubMedGoogle Scholar
- Schlessinger A, Punta M, Yachdav G, et al.: Improved disorder prediction by combination of orthogonal approaches. PLoS One 2009, 4: e4433. 10.1371/journal.pone.0004433PubMed CentralView ArticlePubMedGoogle Scholar
- Mizianty MJ, Stach W, Chen K, Kedarisetti KD, Disfani F, Kurgan L: Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources. Bioinformatics 2010, 26(18):i489-i496. 10.1093/bioinformatics/btq373PubMed CentralView ArticlePubMedGoogle Scholar
- Xue B, Dunbrack RL, Williams RW, Dunker AK, Uversky VN: PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochim Biophys Acta 2010, 1804(4):996–1010.PubMed CentralView ArticlePubMedGoogle Scholar
- McGuffin LJ: Intrinsic disorder prediction from the analysis of multiple protein fold recognition models. Bioinformatics 2008, 24: 1798–1804. 10.1093/bioinformatics/btn326View ArticlePubMedGoogle Scholar
- Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, Oldfield CJ, Campen AM, Ratliff CM, Hipps KW, et al.: Intrinsically disordered protein. J Mol Graph Model 2001, 19: 26–59. 10.1016/S1093-3263(00)00138-8View ArticlePubMedGoogle Scholar
- Oldfield CJ, Cheng Y, Cortese MS, Brown CJ, Uversky VN, Dunker AK: Comparing and combining predictors of mostly disordered proteins. Biochemistry 2005, 44: 1989–2000. 10.1021/bi047993oView ArticlePubMedGoogle Scholar
- Xue B, Oldfield CJ, Dunker AK, Uversky VN: CDF it all: consensus prediction of intrinsically disordered proteins based on various cumulative distribution functions. FEBS Lett 2009, 583(9):1469–1474. 10.1016/j.febslet.2009.03.070PubMed CentralView ArticlePubMedGoogle Scholar
- Romero P, Obradovic Z, Kissinger CR, Villafranca JE, Garner E, Guilliot S, Dunker AK: Thousands of proteins likely to have long disordered regions. Proceedings of the Pac Symp Biocomput.: 4–9 January 1998; Hawaii 1998, 437–448.Google Scholar
- Le Gall T, Romero P, Cortese MS, Uversky VN, Dunker AK: Intrinsic disorder in the Protein Data Bank. J. Biomol. Struct. Dyn 2007, 24(4):303–428.View ArticleGoogle Scholar
- Haynes C, Ji F, Oldfield CJ, Klitgord N, Cusick ME, Radivojac P, Uversky VN, Vidal M, Iakoucheva LM: Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes. PLoS Comput Biol 2006, 2(8):e100. 10.1371/journal.pcbi.0020100PubMed CentralView ArticlePubMedGoogle Scholar
- Liu J, Perumal NB, Oldfield CJ, Su EW, Uversky VN, Dunker AK: Intrinsic disorder in transcription factors. Biochemistry 2006, 45(22):6773–6888. 10.1021/bi0523815View ArticleGoogle Scholar
- Uversky VN, Roman A, Oldfield CJ, Dunker AK: Protein intrinsic disorder and human papillomaviruses: Increased amount of disorder in E6 and E7 oncoproteins from high risk HPVs. J Proteome Res 2006, 5(8):1829–1842. 10.1021/pr0602388View ArticlePubMedGoogle Scholar
- Dosztányi Z, Chen J, Dunker AK, Simon I, Tompa P: Disorder and sequence repeats in hub proteins and their implications for network evolution. J Proteome Res 2006, 5(11):2985–2995. 10.1021/pr060171oView ArticlePubMedGoogle Scholar
- Goh GK-M, Dunker AK, Uversky VN: A comparative analysis of viral matrix proteins using disorder predictors. Virology J 2008, 5: 126. 10.1186/1743-422X-5-126View ArticleGoogle Scholar
- Cortese MS, Uversky VN, Dunker AK: Intrinsic disorder in scaffold proteins: Getting more from less. Progress Bioph Mol Biol 2008, 98(1):85–106. 10.1016/j.pbiomolbio.2008.05.007View ArticleGoogle Scholar
- De Biasio A, Guarnaccia C, Popovic M, Uversky VN, Pintar P, Pongor S: Prevalence of intrinsic disorder in the intracellular region of human single-pass type I proteins: The case of the Notch ligand Delta-4. J Proteome Res 2008, 7(6):2496–2506. 10.1021/pr800063uView ArticlePubMedGoogle Scholar
- Hébrard E, Bessin Y, Michon T, Longhi S, Uversky VN, Delalande F, Van Dorsselaer A, Romero P, Walter J, Declerk N, et al.: Intrinsic disorder in viral proteins genome-linked: Experimental and predictive analyses. Virology J 2009, 6: 23. 10.1186/1743-422X-6-23View ArticleGoogle Scholar
- Balázs A, Csizmok V, Buday L, Rakács M, Kiss R, Bokor M, Udupa R, Tompa K, Tompa P: High levels of structural disorder in scaffold proteins as exemplified by a novel neuronal protein, CASK-interactive protein1. FEBS J 2009, 276(14):3744–3756. 10.1111/j.1742-4658.2009.07090.xView ArticlePubMedGoogle Scholar
- Hegyi H, Buday L, Tompa P: Intrinsic structural disorder confers cellular viability on oncogenic fusion proteins. PLoS Comput Biol 2009, 5(10):e1000552. 10.1371/journal.pcbi.1000552PubMed CentralView ArticlePubMedGoogle Scholar
- Tompa P, Kovacs D: Intrinsically disordered chaperones in plants and animals. Biochem Cell Biol 2010, 88(2):167–174. 10.1139/O09-163View ArticlePubMedGoogle Scholar
- Xue B, Williams RW, Oldfield CJ, Goh GK-M, Dunker AK, Uversky VN: Viral disorder or disordered viruses: Do viral proteins possess unique features? Prot. Pept. Lett 2010, 17(8):932–951. 10.2174/092986610791498984View ArticleGoogle Scholar
- Dunker AK, Obradovic Z, Romero P, Garner EC, Brown CJ: Intrinsic protein disorder in complete genomes. Genome Inform 2000, 11: 161–171.Google Scholar
- Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT: Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 2004, 337: 635–645. 10.1016/j.jmb.2004.02.002View ArticlePubMedGoogle Scholar
- Tompa P, Dosztanyi Z, Simon I: Prevalent structural disorder in E. coli and S. cerevisiae proteomes. J Proteome Res 2006, 5(8):1996–2000. 10.1021/pr0600881View ArticlePubMedGoogle Scholar
- Xue B, Williams RW, Oldfield CJ, Dunker AK, Uversky VN: Archaic chaos: Intrinsically disordered proteins in Archaea. BMC Systems Biol 2010, 4(Suppl 1):S1. 10.1186/1752-0509-4-S1-S1View ArticleGoogle Scholar
- Xie H, Vucetic S, Iakoucheva LM, Oldfield CJ, Dunker AK, Obradovic Z, Uversky VN: Functional anthology of intrinsic disorder. 3. Ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins. J Proteome Res 2007, 6: 1917–1932. 10.1021/pr060394ePubMed CentralView ArticlePubMedGoogle Scholar
- Xie H, Vucetic S, Iakoucheva LM, Oldfield CJ, Dunker AK, Uversky VN, Obradovic Z: Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J Proteome Res 2007, 6: 1882–1898. 10.1021/pr060392uPubMed CentralView ArticlePubMedGoogle Scholar
- Vucetic S, Xie H, Iakoucheva LM, Oldfield CJ, Dunker AK, Obradovic Z, Uversky VN: Functional anthology of intrinsic disorder. 2. Cellular components, domains, technical terms, developmental processes, and coding sequence diversities correlated with long disordered regions. J Proteome Res 2007, 6: 1899–1916. 10.1021/pr060393mPubMed CentralView ArticlePubMedGoogle Scholar
- Vucetic S, Brown CJ, Dunker AK, Obradovic Z: Flavors of protein disorder. Proteins 2003, 52: 573–584. 10.1002/prot.10437View ArticlePubMedGoogle Scholar
- Williams RM, Obradovic Z, Mathura V, Braun W, Garner EC, Young J, Takayama S, Brown CJ, Dunker AK: The protein non-folding problem: amino acid determinants of intrinsic order and disorder. Proceedings of the Pac Symp Biocomput.:3–7 January 2001; Hawaii 2001, 89–100.Google Scholar
- Uversky VN, Dunker AK: Understanding protein non-folding. Biochim. Biophys. Acta-Proteins and Proteomics 2010, 1804(6):1231–1264. 10.1016/j.bbapap.2010.01.017View ArticleGoogle Scholar
- Radivojac P, Iakoucheva LM, Oldfield CJ, Obradovic Z, Uversky VN, Dunker AK: Intrinsic disorder and functional proteomics. Biophys J 2007, 92: 1439–1456. 10.1529/biophysj.106.094045PubMed CentralView ArticlePubMedGoogle Scholar
- Vacic V, Uversky VN, Dunker AK, Lonardi S: Composition Profiler: a tool for discovery and visualization of amino acid composition differences. BMC Bioinformatics 2007, 8: 211. 10.1186/1471-2105-8-211PubMed CentralView ArticlePubMedGoogle Scholar
- Radivojac P, Obradovic Z, Smith DK, Zhu G, Vucetic S, Brown CJ, Lawson JD, Dunker AK: Protein flexibility and intrinsic disorder. Protein Sci 2004, 13: 71–80. 10.1110/ps.03128904PubMed CentralView ArticlePubMedGoogle Scholar
- Zhang H, Zhang T, Chen K, Shen S, Ruan J, Kurgan L: On the relation between residue flexibility and local solvent accessibility in proteins. Proteins 2009, 76: 617–636. 10.1002/prot.22375View ArticlePubMedGoogle Scholar
- Lieutaud P, Canard B, Longhi S: MeDor: a metaserver for predicting protein disorder. BMC Genomics 2008, 9(Suppl 2):S25. 10.1186/1471-2164-9-S2-S25PubMed CentralView ArticlePubMedGoogle Scholar
- Bordoli L, Kiefer F, Schwede T: Assessment of disorder predictions in CASP7. Proteins 2007, 69(Suppl 8):129–136.View ArticlePubMedGoogle Scholar
- Noivirt-Brik O, Prilusky J, Sussman J: Assessment of disorder predictions in CASP8. Proteins 2009, 77(Suppl 9):210–216.View ArticlePubMedGoogle Scholar
- Sickmeier M, Hamilton JA, LeGall T, Vacic V, Cortese MS, Tantos A, Szabo B, Tompa P, Chen J, Uversky VN, et al.: DisProt: the database of disordered proteins. Nucleic Acids Res 2007, 35: D786–793. 10.1093/nar/gkl893PubMed CentralView ArticlePubMedGoogle Scholar
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28: 235–242. 10.1093/nar/28.1.235PubMed CentralView ArticlePubMedGoogle Scholar
- Wang G, Dunbrack RL Jr: PISCES: a protein sequence culling server. Bioinformatics 2003, 19: 1589–1591. 10.1093/bioinformatics/btg224View ArticlePubMedGoogle Scholar
- Sirota FL, Ooi HS, Gattermayer T, Schneider G, Eisenhaber F, Maurer-Stroh S: Parameterization of disorder predictors for large-scale applications requiring high specificity by using an extended benchmark dataset. BMC Genomics 2010, 11(Suppl 1):S15. 10.1186/1471-2164-11-S1-S15PubMed CentralView ArticlePubMedGoogle Scholar
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389–3402. 10.1093/nar/25.17.3389PubMed CentralView ArticlePubMedGoogle Scholar
- Jones DT, Swindells MB: Getting the most from PSI-BLAST. Trends Biochem Sci 2002, 27: 161–164. 10.1016/S0968-0004(01)02039-4View ArticlePubMedGoogle Scholar
- McGuffin LJ, Bryson K, Jones DT: The PSIPRED protein structure prediction server. Bioinformatics 2000, 16: 404–405. 10.1093/bioinformatics/16.4.404View ArticlePubMedGoogle Scholar
- Faraggi E, Xue B, Zhou Y: Improving the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins by fast guided-learning through a two-layer neural network. Proteins 2009, 74: 857–871. 10.1002/prot.22194View ArticleGoogle Scholar
- Dor O, Zhou Y: Real-SPINE: an integrated system of neural networks for real-value prediction of protein structural properties. Proteins 2007, 68: 76–81. 10.1002/prot.21408View ArticlePubMedGoogle Scholar
- Plewczynski D, Slabinski L, Ginalski K, Rychlewski L: Prediction of signal peptides in protein sequences by neural networks. Acta Biochim Pol 2008, 55: 261–267.PubMedGoogle Scholar
- Shevade SK, Keerthi SS, Bhattacharyya C, Murthy KRK: Improvements to the SMO algorithm for SVM regression. IEEE Trans. Neural Networks 2000, 11(5):1188–1193. 10.1109/72.870050View ArticlePubMedGoogle Scholar
- Xu H, Yang L, Freitas MA: A robust linear regression based algorithm for automated evaluation of peptide identifications from shotgun proteomics by use of reversed-phase liquid chromatography retention time. BMC Bioinformatics 2008, 9: 347. 10.1186/1471-2105-9-347PubMed CentralView ArticlePubMedGoogle Scholar
- Gao J, Zhang T, Zhang H, Shen S, Ruan J, Kurgan L: Accurate prediction of protein folding rates from sequence and sequence-derived residue flexibility and solvent accessibility. Proteins 2010, 78(9):2114–2130.PubMedGoogle Scholar
- Jiang Y, Iglinski P, Kurgan L: Prediction of protein folding rates from primary sequences using hybrid sequence representation. J Comput Chem 2009, 30(5):772–83. 10.1002/jcc.21096View ArticlePubMedGoogle Scholar
- Wagner M, Adamczak R, Porollo A, Meller J: Linear regression models for solvent accessibility prediction in proteins. J Comput Biol 2005, 12(3):355–369. 10.1089/cmb.2005.12.355View ArticlePubMedGoogle Scholar
- Homaeian L, Kurgan L, Ruan J, Cios KJ, Chen K: Prediction of protein secondary structure content for the twilight zone sequences. Proteins 2007, 69(3):486–498. 10.1002/prot.21527View ArticlePubMedGoogle Scholar
- Sotriffer CA, Sanschagrin P, Matter H, Klebe G: SFCscore: scoring functions for affinity prediction of protein-ligand complexes. Proteins 2008, 73(2):395–419. 10.1002/prot.22058View ArticlePubMedGoogle Scholar
- Pan XY, Shen HB: Robust prediction of B-factor profile from sequence using two-stage SVR based on random forest feature selection. Protein Pept Lett 2009, 16(12):1447–1454. 10.2174/092986609789839250View ArticlePubMedGoogle Scholar
- Chang DT, Huang HY, Syu YT, Wu CP: Real value prediction of protein solvent accessibility using enhanced PSSM features. BMC Bioinformatics 2008, 9(Suppl 12):S12. 10.1186/1471-2105-9-S12-S12PubMed CentralView ArticlePubMedGoogle Scholar
- Song J, Tan H, Takemoto K, Akutsu T: HSEpred: predict half-sphere exposure from protein sequences. Bioinformatics 2008, 24(13):1489–1497. 10.1093/bioinformatics/btn222View ArticlePubMedGoogle Scholar
- Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH: The WEKA Data Mining Software: an update. SIGKDD Explor 2009, 11: 10–18. 10.1145/1656274.1656278View ArticleGoogle Scholar
- Uestuen B, Melssen WJ, Buydens LMC: Facilitating the application of Support Vector Regression by using a universal Pearson VII function based kernel. Chemometrics Intel. Lab. Sys 2006, 81: 29–40. 10.1016/j.chemolab.2005.09.003View ArticleGoogle Scholar
- Hymowitz SG, O'Connell MP, Ultsch MH, Hurst A, Totpal K, Ashkenazi A, de Vos AM, Kelley RF: A unique zinc-binding site revealed by a high-resolution X-ray structure of homotrimeric Apo2L/TRAIL. Biochemistry 2000, 39(4):633–640. 10.1021/bi992242lView ArticlePubMedGoogle Scholar
- Whitby FG, Luecke H, Kuhn P, Somoza JR, Huete-Perez JA, Phillips JD, Hill CP, Fletterick RJ, Wang CC: Crystal structure of Tritrichomonas foetus inosine-5'-monophosphate dehydrogenase and the enzyme-product complex. Biochemistry 1997, 36(35):10666–10674. 10.1021/bi9708850View ArticlePubMedGoogle Scholar
- Brown CJ, Takayama S, Campen AM, Vise P, Marshall TW, Oldfield CJ, Williams CJ, Dunker AK: Evolutionary rate heterogeneity in proteins with long disordered regions. J Mol Evol 2002, 55(1):104–110. 10.1007/s00239-001-2309-6View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.