Table 1 Performance (mean and SD) of the ML methods.

From: A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

| Model | Learner    | Noise | Feature Selection | <Sens> | SD(Sens) | <Spec> | SD(Spec) | <Q>  | SD(Q) |
|-------|------------|-------|-------------------|--------|----------|--------|----------|------|-------|
| S1    | ANN        | 0     | None              | 0.71   | 0.12     | 0.96   | 0.038    | 0.84 | 0.068 |
| S1    | SVM        | 0     | None              | 0.68   | 0.076    | 0.99   | 0.0063   | 0.83 | 0.039 |
| S2    | SVM        | 0     | None              | 0.66   | 0.086    | 0.99   | 0.0095   | 0.82 | 0.041 |
| S2    | NB         | 0     | None              | 0.63   | 0.13     | 0.98   | 0.0072   | 0.81 | 0.063 |
| S1    | NB         | 0     | None              | 0.62   | 0.096    | 0.99   | 0.0049   | 0.8  | 0.047 |
| S1    | KNN(k = 3) | 0     | None              | 0.58   | 0.13     | 0.98   | 0.013    | 0.78 | 0.067 |
| S2    | ANN        | 0     | None              | 0.56   | 0.19     | 0.98   | 0.008    | 0.77 | 0.091 |
| S1    | CART       | 0     | None              | 0.56   | 0.13     | 0.96   | 0.018    | 0.76 | 0.065 |
| S2    | KNN(k = 3) | 0     | None              | 0.47   | 0.16     | 0.98   | 0.014    | 0.73 | 0.077 |
| S2    | LDA        | 0     | None              | 0.7    | 0.11     | 0.76   | 0.051    | 0.73 | 0.051 |
| S1    | KNN(k = 5) | 0     | None              | 0.45   | 0.13     | 0.98   | 0.014    | 0.72 | 0.064 |
| S1    | LDA        | 0     | None              | 0.66   | 0.13     | 0.69   | 0.048    | 0.67 | 0.079 |
| S2    | KNN(k = 5) | 0     | None              | 0.34   | 0.1      | 0.99   | 0.016    | 0.66 | 0.049 |
| S2    | CART       | 0     | None              | NA     | NA       | NA     | NA       | NA   | NA    |
| S1    | SVM        | 0     | T-test            | 0.75   | 0.089    | 0.98   | 0.011    | 0.87 | 0.043 |
| S1    | LDA        | 0     | T-test            | 0.74   | 0.11     | 0.99   | 0.0057   | 0.86 | 0.052 |
| S1    | ANN        | 0     | T-test            | 0.73   | 0.13     | 0.97   | 0.023    | 0.85 | 0.055 |
| S2    | LDA        | 0     | T-test            | 0.67   | 0.11     | 0.98   | 0.009    | 0.83 | 0.054 |
| S2    | ANN        | 0     | T-test            | 0.7    | 0.11     | 0.96   | 0.013    | 0.83 | 0.051 |
| S2    | NB         | 0     | T-test            | 0.65   | 0.13     | 0.98   | 0.014    | 0.81 | 0.064 |
| S2    | SVM        | 0     | T-test            | 0.64   | 0.11     | 0.98   | 0.0074   | 0.81 | 0.057 |
| S1    | NB         | 0     | T-test            | 0.61   | 0.078    | 0.97   | 0.012    | 0.79 | 0.039 |
| S1    | KNN(k = 3) | 0     | T-test            | 0.55   | 0.16     | 0.99   | 0.008    | 0.77 | 0.077 |
| S2    | KNN(k = 3) | 0     | T-test            | 0.52   | 0.17     | 0.99   | 0.0044   | 0.76 | 0.082 |
| S2    | CART       | 0     | T-test            | 0.58   | 0.12     | 0.94   | 0.025    | 0.76 | 0.061 |
| S1    | CART       | 0     | T-test            | 0.54   | 0.17     | 0.96   | 0.038    | 0.75 | 0.07  |
| S1    | KNN(k = 5) | 0     | T-test            | 0.45   | 0.12     | 1      | 0.0035   | 0.73 | 0.062 |
| S2    | KNN(k = 5) | 0     | T-test            | 0.37   | 0.13     | 0.99   | 0.0056   | 0.68 | 0.066 |
| S1    | KNN(k = 3) | 2     | None              | 0.54   | 0.11     | 0.98   | 0.011    | 0.76 | 0.057 |
| S1    | ANN        | 2     | None              | 0.53   | 0.13     | 0.97   | 0.019    | 0.75 | 0.066 |
| S1    | NB         | 2     | None              | 0.48   | 0.13     | 0.99   | 0.0054   | 0.74 | 0.067 |
| S2    | KNN(k = 3) | 2     | None              | 0.49   | 0.13     | 0.98   | 0.0091   | 0.74 | 0.065 |
| S2    | ANN        | 2     | None              | 0.44   | 0.11     | 0.98   | 0.016    | 0.71 | 0.049 |
| S2    | NB         | 2     | None              | 0.43   | 0.062    | 0.99   | 0.0056   | 0.71 | 0.029 |
| S1    | SVM        | 2     | None              | 0.41   | 0.09     | 1      | 0.0031   | 0.7  | 0.045 |
| S1    | CART       | 2     | None              | 0.4    | 0.12     | 0.94   | 0.041    | 0.67 | 0.053 |
| S2    | KNN(k = 5) | 2     | None              | 0.33   | 0.087    | 0.99   | 0.0066   | 0.66 | 0.043 |
| S2    | CART       | 2     | None              | 0.34   | 0.17     | 0.96   | 0.03     | 0.65 | 0.08  |
| S1    | KNN(k = 5) | 2     | None              | 0.3    | 0.11     | 0.99   | 0.0085   | 0.65 | 0.054 |
| S2    | SVM        | 2     | None              | 0.3    | 0.079    | 1      | 0.0039   | 0.65 | 0.039 |
| S2    | LDA        | 2     | None              | 0.59   | 0.069    | 0.72   | 0.05     | 0.65 | 0.038 |
| S1    | LDA        | 2     | None              | 0.6    | 0.12     | 0.64   | 0.037    | 0.62 | 0.068 |
| S1    | LDA        | 2     | T-test            | 0.55   | 0.11     | 0.98   | 0.0078   | 0.77 | 0.055 |
| S1    | SVM        | 2     | T-test            | 0.5    | 0.11     | 0.99   | 0.004    | 0.75 | 0.053 |
| S2    | NB         | 2     | T-test            | 0.53   | 0.084    | 0.98   | 0.01     | 0.75 | 0.042 |
| S1    | NB         | 2     | T-test            | 0.52   | 0.083    | 0.98   | 0.0088   | 0.75 | 0.042 |
| S2    | LDA        | 2     | T-test            | 0.52   | 0.087    | 0.97   | 0.011    | 0.75 | 0.042 |
| S2    | SVM        | 2     | T-test            | 0.48   | 0.1      | 0.99   | 0.0056   | 0.74 | 0.052 |
| S1    | ANN        | 2     | T-test            | 0.51   | 0.1      | 0.97   | 0.015    | 0.74 | 0.045 |
| S2    | KNN(k = 3) | 2     | T-test            | 0.48   | 0.14     | 0.99   | 0.0083   | 0.73 | 0.07  |
| S1    | KNN(k = 3) | 2     | T-test            | 0.48   | 0.1      | 0.98   | 0.01     | 0.73 | 0.051 |
| S2    | ANN        | 2     | T-test            | 0.48   | 0.1      | 0.96   | 0.01     | 0.72 | 0.049 |
| S1    | KNN(k = 5) | 2     | T-test            | 0.36   | 0.095    | 1      | 0.0049   | 0.68 | 0.047 |
| S2    | KNN(k = 5) | 2     | T-test            | 0.32   | 0.047    | 0.99   | 0.0036   | 0.66 | 0.023 |
| S1    | CART       | 2     | T-test            | 0.35   | 0.14     | 0.95   | 0.023    | 0.65 | 0.071 |
| S2    | CART       | 2     | T-test            | 0.25   | 0.093    | 0.96   | 0.027    | 0.6  | 0.044 |

  1. These data are for the special case where 300 chemicals were used, shown as a function of model, feature selection and level of measurement noise. The results are organized into 4 blocks, corresponding to the 4 blocks in Figure 5. Within a block, rows are ordered by decreasing Q-score. Each row gives the mean sensitivity, specificity and Q-score along with the corresponding standard deviations. All ML methods were trained using 300 chemicals. The values come from 10 independent validation runs, each with a unique sample of 300 chemicals. Values of sensitivity, specificity and Q-score > 0.8 are bolded. Rows whose Q-score is less than the best Q-score in the block minus one standard deviation of that best row are shaded.
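
The shading rule in the table note can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the names `Row` and `shaded_rows` are hypothetical, the sample values are a subset of the first block (noise 0, no feature selection), and skipping NA rows is an assumption, since the note does not say how they are treated.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    model: str
    learner: str
    q: Optional[float]     # mean Q-score; None where the table reports NA
    sd_q: Optional[float]  # SD of the Q-score; None where the table reports NA


def shaded_rows(block: List[Row]) -> List[Row]:
    """Return rows whose Q-score is below (best Q - SD of the best row)."""
    # Assumption: NA rows are ignored rather than shaded.
    scored = [r for r in block if r.q is not None and r.sd_q is not None]
    best = max(scored, key=lambda r: r.q)
    threshold = best.q - best.sd_q
    return [r for r in scored if r.q < threshold]


# Subset of the first block of the table (noise 0, no feature selection).
block = [
    Row("S1", "ANN", 0.84, 0.068),
    Row("S1", "SVM", 0.83, 0.039),
    Row("S1", "KNN(k = 3)", 0.78, 0.067),
    Row("S2", "ANN", 0.77, 0.091),
    Row("S1", "CART", 0.76, 0.065),
    Row("S2", "CART", None, None),
]

print([(r.model, r.learner) for r in shaded_rows(block)])
```

Here the best row is S1/ANN, so the cutoff is 0.84 - 0.068 = 0.772; S2/ANN (0.77) and S1/CART (0.76) fall below it, while S1/KNN(k = 3) at 0.78 does not.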