Mechanism-aware imputation: a two-step approach in handling missing values in metabolomics

BMC Bioinformatics

Table 3 Evaluation of the random forest classifier performance in step one of MAI for varying sample size and number of metabolites

Metabolite number and sample size combination	Mean accuracy (%)	Accuracy 95% CI	Mean NRMSE	NRMSE 95% CI
p = 50 n = 50	82.0	[80.1%, 83.5%]	0.260	[0.245, 0.278]
p = 50 n = 100	81.8	[80.3%, 83.2%]	0.264	[0.256, 0.278]
p = 100 n = 50	81.3	[80.2%, 82.4%]	0.282	[0.267, 0.282]
p = 200 n = 400	82.1	[80.0%, 83.3%]	0.259	[0.256, 0.263]
p = 400 n = 200	81.7	[80.0%, 82.7%]	0.272	[0.271, 0.272]
p = 50 n = 400	81.7	[80.2%, 82.9%]	0.270	[0.260, 0.270]
p = 400 n = 50	82.0	[80.1%, 83.3%]	0.273	[0.264, 0.288]
p = 400 n = 20	82.1	[80.1%, 82.9%]	0.239	[0.234, 0.245]

Accuracy metrics with associated 95% confidence intervals (CIs) are reported for different combinations of sample size (n) and number of metabolites (p) from the COPDGene Data Set 1.
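
As an illustration of how summary columns of this kind can be produced, the following minimal sketch computes an NRMSE and a mean with a 95% percentile interval over repeated runs. It rests on assumptions not stated in the table: the NRMSE here is the root mean squared error divided by the standard deviation of the true values (one common definition), the interval is a simple percentile interval, and the function names `nrmse` and `mean_and_percentile_ci` are hypothetical. It is not necessarily the procedure used to generate Table 3.

```python
import math
import statistics


def nrmse(true_values, imputed_values):
    """RMSE divided by the population standard deviation of the true values.

    Assumption: this normalization is one common choice and may differ
    from the definition used for Table 3.
    """
    if len(true_values) != len(imputed_values):
        raise ValueError("inputs must have the same length")
    squared_errors = [(t - i) ** 2 for t, i in zip(true_values, imputed_values)]
    return math.sqrt(statistics.fmean(squared_errors)) / statistics.pstdev(true_values)


def mean_and_percentile_ci(values, level=0.95):
    """Mean of per-run metrics plus a percentile interval at the given level.

    Assumption: a plain percentile interval over repeated runs; the table
    does not state how its intervals were derived.
    """
    ordered = sorted(values)
    alpha = (1.0 - level) / 2.0

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(ordered) - 1)
        lower = math.floor(pos)
        upper = math.ceil(pos)
        return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

    return statistics.fmean(ordered), (quantile(alpha), quantile(1.0 - alpha))
```

Under these assumptions, each row of the table would correspond to calling `mean_and_percentile_ci` once on the per-run accuracies and once on the per-run NRMSE values for a given (p, n) combination.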

ISSN: 1471-2105