In-silico prediction of disorder content using hybrid sequence representation

BMC Bioinformatics

Table 1 Comparison of the predictive quality of DisCon with that of the disorder content extracted from the predictions of the 10 considered modern disorder predictors on the test dataset.

Predictor	MSE	MSE stat. signif.	MAE	MAE stat. signif.	PCC	PCC stat. signif.	% of chains over-predicted	% of chains under-predicted	MAE (over-predicted chains)	MAE (under-predicted chains)	Residue-level AUC	Residue-level Accuracy	Residue-level MCC
PROFbval	0.178	++	0.387	++	0.38	++	0.86	0.14	0.41	0.27	0.696	0.528	0.196
NORSnet	0.112	++	0.206	++	0.34	++	0.22	0.74	0.23	0.21	0.711	0.763	0.269
DISOclust	0.103	++	0.256	++	0.54	++	0.84	0.16	0.26	0.24	0.778	0.672	0.351
IUPRedL	0.083	++	0.172	=	0.47	++	0.40	0.57	0.14	0.20	0.767	0.785	0.365
MD	0.079	++	0.182	+	0.61	++	0.54	0.44	0.24	0.12	0.816	0.790	0.424
DISOPRED 2	0.076	++	0.167	=	0.49	++	0.57	0.41	0.14	0.22	0.780	0.771	0.382
MFDp	0.074	++	0.177	=	0.58	++	0.67	0.30	0.18	0.19	0.795	0.764	0.425
IUPRedS	0.070	+	0.155	=	0.53	++	0.49	0.48	0.10	0.22	0.771	0.795	0.366
Ucon	0.069	+	0.177	=	0.52	++	0.63	0.35	0.14	0.26	0.732	0.739	0.284
PONDR-FIT	0.066	+	0.167	=	0.55	++	0.65	0.34	0.13	0.24	0.776	0.777	0.383
DisCon	0.050		0.156		0.68		0.62	0.37	0.14	0.18	N/A	N/A	N/A

We report the MSE, MAE, and PCC values, the percentage of chains that are over-predicted (predicted content higher than the native content) and under-predicted (predicted content lower than the native content), and the MAE values for the over- and under-predicted chains. The methods are sorted in descending order of MSE, and the best values are shown in bold font. Results of the significance tests of the differences between DisCon and the other methods are given in the "stat. signif." columns. The tests compare the per-chain absolute and squared errors over all 200 chains in our test dataset, and the Pearson correlation computed for 200 randomly selected sets of 100 proteins from the test dataset. The ++ and + symbols denote that DisCon is statistically significantly better with p < 0.01 and p < 0.05, respectively, and = denotes that the results are not significantly different. We also report the area under the curve (AUC), accuracy (ACC), and MCC for the per-residue disorder predictions generated by the ten considered predictors.
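
The content-level measures described above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names (`disorder_content`, `content_metrics`, `subsampled_pcc`) are hypothetical, the content of a chain is assumed to be the fraction of its residues labeled as disordered, and the random sets of proteins are assumed to be drawn without replacement. The statistical test applied to the resulting values is not specified in this table, so it is left out.

```python
import math
import random


def disorder_content(residue_labels):
    """Fraction of residues labeled disordered (1) in one chain (assumed definition)."""
    return sum(residue_labels) / len(residue_labels)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def content_metrics(native, predicted):
    """Per-dataset MSE, MAE, PCC and the over-/under-prediction summaries."""
    errors = [p - n for n, p in zip(native, predicted)]
    over = [e for e in errors if e > 0]     # predicted content above native
    under = [-e for e in errors if e < 0]   # predicted content below native
    count = len(errors)
    return {
        "mse": sum(e * e for e in errors) / count,
        "mae": sum(abs(e) for e in errors) / count,
        "pcc": pearson(native, predicted),
        "frac_over": len(over) / count,
        "frac_under": len(under) / count,
        "mae_over": sum(over) / len(over) if over else float("nan"),
        "mae_under": sum(under) / len(under) if under else float("nan"),
    }


def subsampled_pcc(native, predicted, n_sets=200, set_size=100, seed=0):
    """PCC on repeated random subsets of chains (sampling without replacement is assumed)."""
    rng = random.Random(seed)
    indices = range(len(native))
    values = []
    for _ in range(n_sets):
        chosen = rng.sample(indices, set_size)
        values.append(pearson([native[i] for i in chosen],
                              [predicted[i] for i in chosen]))
    return values
```

Chains whose predicted content equals the native content count as neither over- nor under-predicted in this sketch, which would be consistent with rows of the table where the two fractions sum to less than 1.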

ISSN: 1471-2105