Table 1 Performance of machine-learning models on the benchmark training and independent test datasets. Values for the training dataset are shown as mean ± SD.

From: Co-AMPpred for in silico-aided predictions of antimicrobial peptides by integrating composition-based features

| Algorithm | Dataset | Accuracy | AUROC | Recall | Precision | Kappa | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GBC | Training | 75.0% ± 0.038 | 0.816 ± 0.035 | 77.4% ± 0.082 | 73.9% ± 0.033 | 0.500 ± 0.0755 | 0.504 ± 0.075 |
| GBC | Test | 80.3% | 0.873 | 79.7% | 80.6% | 0.606 | 0.606 |
| CatBoost | Training | 74.4% ± 0.055 | 0.815 ± 0.045 | 75.3% ± 0.107 | 73.9% ± 0.045 | 0.488 ± 0.110 | 0.492 ± 0.109 |
| CatBoost | Test | 78.7% | 0.879 | 78.7% | 78.7% | 0.574 | 0.574 |
| LGBM | Training | 73.8% ± 0.060 | 0.810 ± 0.052 | 73.3% ± 0.124 | 73.8% ± 0.039 | 0.476 ± 0.102 | 0.479 ± 0.099 |
| LGBM | Test | 77.6% | 0.868 | 78.7% | 77.0% | 0.553 | 0.553 |
| ETC | Training | 74.3% ± 0.055 | 0.794 ± 0.066 | 75.0% ± 0.097 | 73.9% ± 0.049 | 0.487 ± 0.109 | 0.491 ± 0.108 |
| ETC | Test | 77.6% | 0.776 | 77.6% | 77.6% | 0.553 | 0.553 |
| RF | Training | 74.1% ± 0.044 | 0.798 ± 0.052 | 75.5% ± 0.101 | 73.1% ± 0.039 | 0.482 ± 0.088 | 0.487 ± 0.086 |
| RF | Test | 78.1% | 0.811 | 78.7% | 77.8% | 0.563 | 0.563 |

  1. Values in bold font indicate the top performance on the test dataset
  2. GBC, gradient boosting classifier; LGBM, light gradient boosting machine; ETC, extra trees classifier; RF, random forest; AUROC, area under the receiver operating characteristic curve; MCC, Matthews correlation coefficient; SD, standard deviation
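
For readers unfamiliar with the threshold-based metrics in the table, the sketch below shows how accuracy, recall, precision, Cohen's kappa, and MCC can be derived from the four cells of a binary confusion matrix. It is a minimal illustration using the standard textbook formulas, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. AUROC is omitted because it depends on ranked prediction scores rather than on a single confusion matrix.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes are present and
    both labels are predicted at least once).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)        # sensitivity: true positives among actual positives
    precision = tp / (tp + fp)     # true positives among predicted positives

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not from the paper).
metrics = binary_metrics(tp=80, fp=20, fn=20, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these symmetric example counts, where the two classes are balanced and the two error types are equal, kappa and MCC come out identical.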