BMC Bioinformatics

Table 2 Statistical significances of features of cross-validation and blind data sets in discriminating large deviations from small

From: Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions

Features	SMM^PMBEC	NetMHC	NetMHCpan
log_size_cv	7.7e-07	2.5e-04	2.5e-02
log_size_bl	2.9e-05	3.6e-03	1.2e-02
entss_cv	1.1e-04	1.7e-03	2.0e-02
entss_bl	3.4e-05	3.9e-04	5.1e-03
ent_meas_cv	1.7e-01	5.5e-01	4.6e-01
ent_meas_bl	4.6e-01	5.4e-01	8.5e-01
ent_pred_cv	1.5e-01	2.1e-01	2.0e-01
ent_pred_bl	4.8e-03	6.4e-02	1.1e-02
prbol_meas	3.5e-01	9.9e-02	3.1e-01
prbol_pred	7.8e-03	3.7e-02	2.8e-02

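Here, deviation = |cv_gs - blind|, where cv_gs and blind are the predictive performances, measured as AROC, on the cross-validation and blind data sets, respectively. Table entries are p-values from two-tailed t-tests; features with p < 0.05 are considered significant (italicized in the original table). See Methods for definitions of the features.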
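As a reading aid, the short Python sketch below applies the 0.05 cutoff to the p-values in the table and lists the features that pass it for each method. It only filters the published values and does not rerun the t-tests. The names `P_VALUES`, `METHODS`, `deviation`, and `significant_features` are illustrative assumptions and do not come from the article.

```python
ALPHA = 0.05  # two-tailed significance cutoff stated in the table note

# Column order of the table (method names as printed there).
METHODS = ("SMM^PMBEC", "NetMHC", "NetMHCpan")

# p-values copied from the table, one tuple per feature, in METHODS order.
P_VALUES = {
    "log_size_cv": (7.7e-07, 2.5e-04, 2.5e-02),
    "log_size_bl": (2.9e-05, 3.6e-03, 1.2e-02),
    "entss_cv": (1.1e-04, 1.7e-03, 2.0e-02),
    "entss_bl": (3.4e-05, 3.9e-04, 5.1e-03),
    "ent_meas_cv": (1.7e-01, 5.5e-01, 4.6e-01),
    "ent_meas_bl": (4.6e-01, 5.4e-01, 8.5e-01),
    "ent_pred_cv": (1.5e-01, 2.1e-01, 2.0e-01),
    "ent_pred_bl": (4.8e-03, 6.4e-02, 1.1e-02),
    "prbol_meas": (3.5e-01, 9.9e-02, 3.1e-01),
    "prbol_pred": (7.8e-03, 3.7e-02, 2.8e-02),
}


def deviation(cv_gs, blind):
    """Absolute gap between cross-validation and blind AROC (hypothetical helper)."""
    return abs(cv_gs - blind)


def significant_features(p_values, alpha=ALPHA):
    """Map each method to the features whose p-value is below alpha."""
    return {
        method: [feat for feat, ps in p_values.items() if ps[i] < alpha]
        for i, method in enumerate(METHODS)
    }


if __name__ == "__main__":
    for method, feats in significant_features(P_VALUES).items():
        print(f"{method}: {', '.join(feats)}")
```

Running the script prints one line per method with the features that fall below the cutoff; changing `alpha` shows how sensitive that list is to the chosen threshold.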

ISSN: 1471-2105
