MetaNN: accurate classification of host phenotypes from metagenomic data using neural networks

Lo, Chieh; Marculescu, Radu

doi:10.1186/s12859-019-2833-2

BMC Bioinformatics

Table 5 Performance comparison of SVM, RF and NN models on eight real datasets described in Table 1


Dataset	SVM	SVM+A	RF	RF+A	MLP+D	CNN+D	MLP+D+A	CNN+D+A	Gain (%)
F1-macro
CBH	0.78 (0.03)	0.82 (0.03)	0.73 (0.03)	0.75 (0.03)	0.85 (0.03)	0.77 (0.04)	0.86 (0.03)	0.82 (0.03)	5
CSS	0.63 (0.07)	0.65 (0.06)	0.58 (0.08)	0.61 (0.06)	0.66 (0.06)	0.59 (0.06)	0.67 (0.06)	0.62 (0.06)	3
HMP	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0
CS	0.88 (0.05)	0.88 (0.05)	0.87 (0.05)	0.87 (0.05)	0.92 (0.05)	0.87 (0.06)	0.93 (0.05)	0.88 (0.05)	6
FS	0.94 (0.03)	0.95 (0.02)	1.00 (0.01)	1.00 (0.01)	0.97 (0.03)	0.90 (0.15)	0.98 (0.02)	0.97 (0.02)	-2
FSH	0.68 (0.08)	0.70 (0.08)	0.63 (0.08)	0.68 (0.08)	0.74 (0.06)	0.66 (0.07)	0.74 (0.05)	0.72 (0.07)	6
IBD	0.68 (0.04)	0.72 (0.02)	0.57 (0.02)	0.60 (0.02)	0.75 (0.02)	0.67 (0.03)	0.78 (0.02)	0.70 (0.02)	8
PDX	0.29 (0.13)	0.43 (0.02)	0.28 (0.09)	0.34 (0.07)	0.51 (0.00)	0.44 (0.05)	0.56 (0.03)	0.45 (0.08)	30
F1-micro
CBH	0.93 (0.02)	0.93 (0.01)	0.91 (0.02)	0.92 (0.02)	0.94 (0.01)	0.89 (0.02)	0.94 (0.01)	0.92 (0.02)	1
CSS	0.71 (0.03)	0.72 (0.04)	0.67 (0.03)	0.68 (0.03)	0.72 (0.03)	0.67 (0.04)	0.74 (0.03)	0.68 (0.04)	3
HMP	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.97 (0.01)	0.96 (0.01)	0.97 (0.01)	0.97 (0.01)	0
CS	0.88 (0.06)	0.89 (0.05)	0.88 (0.04)	0.88 (0.05)	0.92 (0.04)	0.87 (0.06)	0.94 (0.04)	0.89 (0.05)	6
FS	0.94 (0.03)	0.95 (0.02)	1.00 (0.01)	1.00 (0.01)	0.97 (0.03)	0.91 (0.12)	0.98 (0.02)	0.97 (0.02)	-2
FSH	0.70 (0.08)	0.71 (0.07)	0.69 (0.05)	0.72 (0.06)	0.75 (0.05)	0.68 (0.06)	0.76 (0.05)	0.75 (0.07)	6
IBD	0.79 (0.02)	0.79 (0.02)	0.78 (0.02)	0.79 (0.02)	0.82 (0.01)	0.77 (0.02)	0.84 (0.01)	0.78 (0.02)	6
PDX	0.44 (0.07)	0.48 (0.03)	0.43 (0.07)	0.44 (0.06)	0.53 (0.01)	0.49 (0.05)	0.56 (0.03)	0.50 (0.06)	17

+D and +A denote dropout and data augmentation, respectively. For each experiment, we use 10-fold cross-validation and report F1-macro and F1-micro scores, as defined in the "Classification performance metrics" section. For each fold, we perform five simulation runs; standard deviations are shown in parentheses. Performance gains compare the best NN model with the best ML model (SVM or RF). Bold values in the original table mark the best results.
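The Gain (%) column can be reproduced from the mean scores in each row. The sketch below assumes the gain is the relative improvement of the best NN score over the best SVM/RF score, rounded to the nearest whole percent. That formula is inferred from the table values and is not stated on this page. The function and variable names are illustrative only.

```python
# Minimal sketch: recompute the "Gain (%)" column from per-row mean scores.
# Assumption (inferred from the table, not stated in the source):
#   gain = 100 * (best NN - best ML) / best ML, rounded to a whole percent.

ML_MODELS = ("SVM", "SVM+A", "RF", "RF+A")
NN_MODELS = ("MLP+D", "CNN+D", "MLP+D+A", "CNN+D+A")

# Mean F1-macro scores copied from three rows of the table.
f1_macro = {
    "CBH": {"SVM": 0.78, "SVM+A": 0.82, "RF": 0.73, "RF+A": 0.75,
            "MLP+D": 0.85, "CNN+D": 0.77, "MLP+D+A": 0.86, "CNN+D+A": 0.82},
    "FS":  {"SVM": 0.94, "SVM+A": 0.95, "RF": 1.00, "RF+A": 1.00,
            "MLP+D": 0.97, "CNN+D": 0.90, "MLP+D+A": 0.98, "CNN+D+A": 0.97},
    "PDX": {"SVM": 0.29, "SVM+A": 0.43, "RF": 0.28, "RF+A": 0.34,
            "MLP+D": 0.51, "CNN+D": 0.44, "MLP+D+A": 0.56, "CNN+D+A": 0.45},
}


def gain_percent(row: dict) -> int:
    """Relative gain of the best NN model over the best ML model, in percent."""
    best_ml = max(row[m] for m in ML_MODELS)
    best_nn = max(row[m] for m in NN_MODELS)
    return round(100 * (best_nn - best_ml) / best_ml)


for dataset, row in f1_macro.items():
    print(dataset, gain_percent(row))
```

Under this assumption the three rows give 5, -2 and 30, which matches the Gain (%) values listed for CBH, FS and PDX in the F1-macro block.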


ISSN: 1471-2105
