Enabling personalised disease diagnosis by combining a patient’s time-specific gene expression profile with a biomedical knowledge base

Table 6 Comparing the performance of SCADDx and LOADDx with existing machine learning algorithms using the single internal validation set approach (\(n=1\) for SCADDx and LOADDx)

Algorithm	Dataset 1 parameters	Testset 1a accuracy (%)	Testset 1b accuracy (%)	Dataset 2 parameters	Testset 2a accuracy (%)	Testset 2b accuracy (%)	Dataset 3 parameters	Testset 3 accuracy (%)	Dataset 4 parameters	Testset 4 accuracy (%)	Average accuracy (testsets) (%)
LOADDx (CTD KB)	P = 25, Q = 225	69.23	84.62	P = 50, Q = 300	75	80	P = 25, Q = 25	75	P = 25, Q = 50	85.71	78.26
SCADDx (CTD KB)	P = 100, Q = 175	76.92	84.62	P = 150, Q = 300	100	86.67	P = 25, Q = 25	100	P = 25, Q = 25	85.71	\(\textbf{*88.99}\)
LOADDx (DisGeNet KB)	P = 275, Q = 50	76.92	92.31	P = 100, Q = 75	93.75	93.33	P = 25, Q = 25	100	P = 25, Q = 25	85.71	\(\textbf{*90.34}\)
SCADDx (DisGeNet KB)	P = 300, Q = 100	76.92	92.31	P = 75, Q = 100	100	93.33	P = 25, Q = 25	100	P = 25, Q = 25	85.71	\(\textbf{*91.38}\)
k-NN	K = 11	46.15	61.54	K = 13	87.5	93.33	K = 1	75	K = 7	57.14	70.11
Random Forest	\(n_{a}\) = 100, \(n_{t}\) = 100	76.92	76.92	\(n_{a}\) = 100, \(n_{t}\) = 100	100	93.33	\(n_{a}\) = 90, \(n_{t}\) = 100	100	\(n_{a}\) = 50, \(n_{t}\) = 100	71.43	86.43
Linear SVM	C = \(2^{-5}\)	76.92	61.54	C = \(2^{-5}\)	100	93.33	C = \(2^{-5}\)	100	C = \(2^{-5}\)	71.43	83.87
SVM with RBF Kernel	\(\sigma\) = \(2^{-15}\), C = \(2^{3}\)	76.92	61.54	\(\sigma\) = \(2^{-15}\), C = \(2^{0}\)	75	86.67	\(\sigma\) = \(2^{-7}\), C = \(2^{-1}\)	100	\(\sigma\) = \(2^{3}\), C = \(2^{0}\)	42.86	73.83
XGBoost (GBTree)	eta = 0.3, \(max\_depth\) = 2, nround = 30	76.92	76.92	eta = 0.3, \(max\_depth\) = 2, nround = 20	100	93.33	eta = 0.8, \(max\_depth\) = 2, nround = 10	100	eta = 0.1, \(max\_depth\) = 2, nround = 90	71.43	86.43

Results in bold are statistically significant according to the t-test performed
A single asterisk denotes a p value < 0.05
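
The last column of Table 6 appears to be the unweighted mean of the six test-set accuracies (1a, 1b, 2a, 2b, 3 and 4). This is inferred from the reported numbers rather than stated in the table. The following minimal Python sketch recomputes that mean for a few rows copied from the table and checks it against the reported value to two decimals. The names `ROWS`, `average_accuracy` and `check_rows` are illustrative and not part of SCADDx or LOADDx.

```python
# Per-testset accuracies (%) for a few rows of Table 6, in the column
# order 1a, 1b, 2a, 2b, 3, 4, each paired with the reported average.
ROWS = {
    "LOADDx (CTD KB)": ([69.23, 84.62, 75.0, 80.0, 75.0, 85.71], 78.26),
    "SCADDx (DisGeNet KB)": ([76.92, 92.31, 100.0, 93.33, 100.0, 85.71], 91.38),
    "k-NN": ([46.15, 61.54, 87.5, 93.33, 75.0, 57.14], 70.11),
    "Random Forest": ([76.92, 76.92, 100.0, 93.33, 100.0, 71.43], 86.43),
}


def average_accuracy(accuracies):
    """Unweighted mean of the per-testset accuracies (assumed definition)."""
    return sum(accuracies) / len(accuracies)


def check_rows(rows, tol=0.005):
    """Return {algorithm: True/False}, True when the recomputed mean
    agrees with the reported average within two-decimal rounding."""
    return {
        name: abs(average_accuracy(accs) - reported) <= tol
        for name, (accs, reported) in rows.items()
    }


if __name__ == "__main__":
    results = check_rows(ROWS)
    for name, (accs, reported) in ROWS.items():
        print(f"{name}: recomputed {average_accuracy(accs):.2f}, "
              f"reported {reported:.2f}, match={results[name]}")
```

Because the mean is unweighted, small test sets (Testsets 3 and 4) carry the same weight as larger ones, which is worth keeping in mind when comparing the averages.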

ISSN: 1471-2105