Table 3 Individual and average AUCs and MCCs from the validation phase and the additional validation approaches applied to the T2D datasets. Standard deviations are omitted to keep the table simple.

From: Methodology for biomarker discovery with reproducibility in microbiome data using machine learning
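
For readers less familiar with the two metrics reported in this table, the following is a minimal sketch of how AUC and MCC can be computed for a binary classifier. It is not the authors' code: the function names `auc` and `mcc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive sample gets the higher score (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

toy_auc = auc(y_true, scores)
toy_mcc = mcc(y_true, y_pred)
print(f"AUC = {toy_auc:.4f}, MCC = {toy_mcc:.4f}")
```

An AUC near 0.5 and an MCC near 0 both indicate chance-level performance, which is one way to read the values in the 3316-feature columns.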

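PRJNA325931*

| Classifier | AUC (9 features, REFS) | MCC (9 features, REFS) | AUC (3316 features) | MCC (3316 features) | AUC (SelectKbest, k = 9) | MCC (SelectKbest, k = 9) |
| --- | --- | --- | --- | --- | --- | --- |
| AdaBoostClassifier | 0.8000 | 0.6749 | 0.4500 | 0.0438 | 0.7600 | 0.4530 |
| Extra Trees | 0.8400 | 0.7532 | 0.5000 | 0.0000 | 0.7600 | 0.5512 |
| KNeighbors | 0.6500 | 0.4033 | 0.5000 | −0.0428 | 0.6500 | 0.2319 |
| MLP | 0.8800 | 0.8064 | 0.5200 | −0.0083 | 0.8200 | 0.6792 |
| Lasso CV | 0.7800 | 0.5661 | 0.5000 | 0.1828 | 0.7600 | 0.5758 |
| Average | 0.7900 | 0.6407 | 0.4940 | 0.0351 | 0.7500 | 0.4982 |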

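PRJNA554535

| Classifier | AUC (5 of 9 features, REFS) | MCC (5 of 9 features, REFS) | AUC (SelectKbest, 4 of 9 features) | MCC (SelectKbest, 4 of 9 features) | AUC (10-time random selection) | MCC (10-time random selection) |
| --- | --- | --- | --- | --- | --- | --- |
| AdaBoostClassifier | 0.8200 | 0.6090 | 0.5260 | 0.5800 | 0.8000 | 0.0525 |
| Extra Trees | 0.8500 | 0.6504 | 0.5310 | 0.6093 | 0.8000 | 0.0684 |
| KNeighbors | 0.6700 | 0.3840 | 0.5230 | 0.4984 | 0.7300 | 0.0374 |
| MLP | 0.7100 | 0.4765 | 0.5230 | 0.3952 | 0.6000 | 0.0600 |
| Lasso CV | 0.5200 | −0.0146 | 0.5160 | −0.0158 | 0.5100 | 0.0296 |
| Average | 0.7140 | 0.4210 | 0.5238 | 0.4134 | 0.6880 | 0.0496 |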

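PRJEB53017

| Classifier | AUC (5 of 9 features, REFS) | MCC (5 of 9 features, REFS) | AUC (SelectKbest, 4 of 9 features) | MCC (SelectKbest, 4 of 9 features) | AUC (10-time random selection) | MCC (10-time random selection) |
| --- | --- | --- | --- | --- | --- | --- |
| AdaBoostClassifier | 0.6700 | 0.3036 | 0.5200 | 0.0425 | 0.5500 | 0.0517 |
| Extra Trees | 0.6900 | 0.3659 | 0.5230 | 0.2526 | 0.6000 | 0.0550 |
| KNeighbors | 0.6800 | 0.4124 | 0.4970 | 0.2977 | 0.6200 | −0.0164 |
| MLP | 0.6600 | 0.3823 | 0.5270 | 0.0711 | 0.5400 | 0.0741 |
| Lasso CV | 0.6100 | 0.2505 | 0.5100 | 0.2035 | 0.6000 | 0.0189 |
| Average | 0.6620 | 0.3429 | 0.5154 | 0.1734 | 0.5820 | 0.0366 |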

*Discovery dataset
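
The Average rows appear to be unweighted means over the five classifiers. The source does not state this, so treat it as an assumption. A small sketch that recomputes the PRJNA325931 "9 features (REFS)" averages from the individual values in the table (the variable names are illustrative):

```python
from statistics import mean

# Values copied from the PRJNA325931 "9 features (REFS)" columns of Table 3,
# in row order: AdaBoostClassifier, Extra Trees, KNeighbors, MLP, Lasso CV.
auc_refs = [0.8000, 0.8400, 0.6500, 0.8800, 0.7800]
mcc_refs = [0.6749, 0.7532, 0.4033, 0.8064, 0.5661]

# Assumption: the "Average" row is the plain arithmetic mean of the five rows.
avg_auc = mean(auc_refs)
avg_mcc = mean(mcc_refs)

print(f"Average AUC: {avg_auc:.4f}")
print(f"Average MCC: {avg_mcc:.4f}")
```

The recomputed AUC matches the reported 0.7900, and the recomputed MCC agrees with the reported 0.6407 to within one unit in the last digit, which is consistent with the averages having been taken over unrounded per-classifier results.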