Table 6 Comparing the performance of SCADDx and LOADDx with existing machine learning algorithms using the single internal validation set approach (\(n=1\) for SCADDx and LOADDx)

From: Enabling personalised disease diagnosis by combining a patient’s time-specific gene expression profile with a biomedical knowledge base

| Algorithm | Dataset 1 parameters | Testset 1a accuracy (%) | Testset 1b accuracy (%) | Dataset 2 parameters | Testset 2a accuracy (%) | Testset 2b accuracy (%) | Dataset 3 parameters | Testset 3 accuracy (%) | Dataset 4 parameters | Testset 4 accuracy (%) | Average accuracy (testsets) (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| LOADDx (CTD KB) | P = 25, Q = 225 | 69.23 | 84.62 | P = 50, Q = 300 | 75 | 80 | P = 25, Q = 25 | 75 | P = 25, Q = 50 | 85.71 | 78.26 |
| SCADDx (CTD KB) | P = 100, Q = 175 | 76.92 | 84.62 | P = 150, Q = 300 | 100 | 86.66 | P = 25, Q = 25 | 100 | P = 25, Q = 25 | 85.71 | \(\textbf{*88.99}\) |
| LOADDx (DisGeNet KB) | P = 275, Q = 50 | 76.92 | 92.31 | P = 100, Q = 75 | 93.75 | 93.33 | P = 25, Q = 25 | 100 | P = 25, Q = 25 | 85.71 | \(\textbf{*90.34}\) |
| SCADDx (DisGeNet KB) | P = 300, Q = 100 | 76.92 | 92.31 | P = 75, Q = 100 | 100 | 93.33 | P = 25, Q = 25 | 100 | P = 25, Q = 25 | 85.71 | \(\textbf{*91.38}\) |
| k-NN | K = 11 | 46.15 | 61.54 | K = 13 | 87.5 | 93.33 | K = 1 | 75 | K = 7 | 57.14 | 70.11 |
| Random Forest | \(n_{a}\) = 100, \(n_{t}\) = 100 | 76.92 | 76.92 | \(n_{a}\) = 100, \(n_{t}\) = 100 | 100 | 93.33 | \(n_{a}\) = 90, \(n_{t}\) = 100 | 100 | \(n_{a}\) = 50, \(n_{t}\) = 100 | 71.43 | 86.43 |
| Linear SVM | C = \(2^{-5}\) | 76.92 | 61.54 | C = \(2^{-5}\) | 100 | 93.33 | C = \(2^{-5}\) | 100 | C = \(2^{-5}\) | 71.43 | 83.87 |
| SVM with RBF Kernel | \(\sigma\) = \(2^{-15}\), C = \(2^{3}\) | 76.92 | 61.54 | \(\sigma\) = \(2^{-15}\), C = \(2^{0}\) | 75 | 86.67 | \(\sigma\) = \(2^{-7}\), C = \(2^{-1}\) | 100 | \(\sigma\) = \(2^{3}\), C = \(2^{0}\) | 42.86 | 73.83 |
| XGBoost (GBTree) | eta = 0.3, \(max\_depth\) = 2, nround = 30 | 76.92 | 76.92 | eta = 0.3, \(max\_depth\) = 2, nround = 20 | 100 | 93.33 | eta = 0.8, \(max\_depth\) = 2, nround = 10 | 100 | eta = 0.1, \(max\_depth\) = 2, nround = 90 | 71.43 | 86.43 |


  1. Results in bold are statistically significant according to the t-test performed
  2. A single asterisk denotes a p value < 0.05
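
The Average accuracy column can be cross-checked against the per-testset values. The sketch below assumes that each average is the unweighted mean of the six testset accuracies (1a, 1b, 2a, 2b, 3, 4) in its row. The table does not state this, but the reported numbers are consistent with it to within rounding. The names `accuracies`, `reported`, and `computed` are illustrative only.

```python
from statistics import mean

# Testset accuracies (%) copied from the table, ordered 1a, 1b, 2a, 2b, 3, 4.
accuracies = {
    "LOADDx (CTD KB)":      [69.23, 84.62, 75, 80, 75, 85.71],
    "SCADDx (CTD KB)":      [76.92, 84.62, 100, 86.66, 100, 85.71],
    "LOADDx (DisGeNet KB)": [76.92, 92.31, 93.75, 93.33, 100, 85.71],
    "SCADDx (DisGeNet KB)": [76.92, 92.31, 100, 93.33, 100, 85.71],
    "k-NN":                 [46.15, 61.54, 87.5, 93.33, 75, 57.14],
    "Random Forest":        [76.92, 76.92, 100, 93.33, 100, 71.43],
    "Linear SVM":           [76.92, 61.54, 100, 93.33, 100, 71.43],
    "SVM with RBF Kernel":  [76.92, 61.54, 75, 86.67, 100, 42.86],
    "XGBoost (GBTree)":     [76.92, 76.92, 100, 93.33, 100, 71.43],
}

# Average accuracy (%) as reported in the last column of the table.
reported = {
    "LOADDx (CTD KB)": 78.26,
    "SCADDx (CTD KB)": 88.99,
    "LOADDx (DisGeNet KB)": 90.34,
    "SCADDx (DisGeNet KB)": 91.38,
    "k-NN": 70.11,
    "Random Forest": 86.43,
    "Linear SVM": 83.87,
    "SVM with RBF Kernel": 73.83,
    "XGBoost (GBTree)": 86.43,
}

# Assumption: the average is an unweighted mean over the six testsets.
computed = {name: mean(values) for name, values in accuracies.items()}

for name, value in computed.items():
    gap = abs(value - reported[name])
    print(f"{name}: computed {value:.3f}, reported {reported[name]:.2f}, gap {gap:.3f}")
```

Because the published figures are rounded to two decimals, the comparison is best made with a small tolerance (for example 0.01 percentage points) rather than exact equality.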