Table 2 Performance of models with different ML algorithms and the concatenated descriptors

From: Prediction of polyreactive and nonspecific single-chain fragment variables through structural biochemical features and protein language-based descriptors

| Descriptor | Method | Train AUC | Valid AUC | Test AUC | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|---|---|
| F46/UniRep | GBM | 1.000 ± 0.000 | 0.624 ± 0.163 | 0.834 | 0.759 | 0.751 | 0.706 | 0.728 |
| F46/UniRep | LGBM | 0.928 ± 0.006 | 0.638 ± 0.166 | 0.830 | 0.746 | 0.719 | 0.729 | 0.724 |
| F46/UniRep | RF | 1.000 ± 0.000 | 0.594 ± 0.174 | 0.823 | 0.747 | 0.742 | 0.683 | 0.711 |
| F46/UniRep | XGB | 0.971 ± 0.002 | 0.624 ± 0.166 | 0.826 | 0.752 | 0.728 | 0.729 | 0.729 |
| F46/TAPE | GBM | 1.000 ± 0.000 | 0.669 ± 0.137 | 0.829 | 0.748 | 0.745 | 0.680 | 0.711 |
| F46/TAPE | LGBM | 0.919 ± 0.006 | 0.676 ± 0.147 | 0.826 | 0.746 | 0.720 | 0.726 | 0.723 |
| F46/TAPE | RF | 1.000 ± 0.000 | 0.654 ± 0.153 | 0.823 | 0.749 | 0.752 | 0.671 | 0.709 |
| F46/TAPE | XGB | 0.997 ± 0.000 | 0.668 ± 0.151 | 0.831 | 0.748 | 0.727 | 0.718 | 0.722 |
| F46/ESM-1b | GBM | 1.000 ± 0.000 | 0.643 ± 0.154 | 0.830 | 0.755 | 0.746 | 0.704 | 0.724 |
| F46/ESM-1b | LGBM | 0.882 ± 0.008 | 0.654 ± 0.157 | 0.821 | 0.744 | 0.709 | 0.746 | 0.727 |
| F46/ESM-1b | RF | 1.000 ± 0.000 | 0.622 ± 0.169 | 0.826 | 0.750 | 0.748 | 0.680 | 0.713 |
| F46/ESM-1b | XGB | 0.954 ± 0.004 | 0.648 ± 0.153 | 0.823 | 0.750 | 0.721 | 0.738 | 0.729 |
| F46/ESM-1v | GBM | 0.985 ± 0.002 | 0.643 ± 0.139 | 0.814 | 0.738 | 0.741 | 0.654 | 0.695 |
| F46/ESM-1v | LGBM | 0.834 ± 0.010 | 0.657 ± 0.134 | 0.802 | 0.727 | 0.677 | 0.766 | 0.719 |
| F46/ESM-1v | RF | 1.000 ± 0.000 | 0.611 ± 0.165 | 0.827 | 0.751 | 0.750 | 0.681 | 0.714 |
| F46/ESM-1v | XGB | 0.951 ± 0.003 | 0.643 ± 0.151 | 0.822 | 0.743 | 0.709 | 0.742 | 0.725 |

  1. Bold indicates the best performance, judged by the AUC score on the test set
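
As a reading aid for the Precision, Recall and F1-score columns, the following minimal sketch recomputes F1 as the harmonic mean of precision and recall for three rows copied from Table 2. It assumes the standard binary F1 definition, which the table itself does not state; the function name `f1_score` and the `rows` list are illustrative and not taken from the paper. Because the inputs are already rounded to three decimals, agreement is only expected to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard binary F1, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (descriptor, method, precision, recall, reported F1) copied from Table 2
rows = [
    ("F46/UniRep", "GBM", 0.751, 0.706, 0.728),
    ("F46/TAPE", "RF", 0.752, 0.671, 0.709),
    ("F46/ESM-1v", "LGBM", 0.677, 0.766, 0.719),
]

for descriptor, method, p, r, reported in rows:
    computed = f1_score(p, r)
    # Inputs are rounded to 3 decimals, so compare with a small tolerance.
    print(f"{descriptor} {method}: computed {computed:.3f}, reported {reported:.3f}")
```

For these three rows the recomputed values round to the reported F1-scores, which is consistent with the assumed definition.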