Table 1 Real metagenomic data used in this paper

From: MetaNN: accurate classification of host phenotypes from metagenomic data using neural networks

| Dataset | # of samples | # of features | # of classes | Classification task |
|---|---|---|---|---|
| Classification of body sites | | | | |
| Costello et al. (2009) Body Habitat (CBH) | 552 | 1454 | 6 | Classify body habitats: skin (357), oral cavity (46), external auditory canal (44), hair (14), nostril (46), feces (45) |
| Costello et al. (2009) Skin Sites (CSS) | 357 | 600 | 12 | Classify skin sites: external nose (14), forehead (32), glans penis (8), labia minora (6), axilla (28), pinna (27), palm (64), palmar index finger (28), plantar foot (64), popliteal fossa (46), volar forearm (28), umbilicus (12) |
| Human Microbiome Project (HMP) | 1025 | 323 | 5 | Classify 5 major body sites: anterior nares (269), buccal mucosa (312), stool (319), supragingival plaque (313), tongue dorsum (316) |
| Classification of subjects | | | | |
| Costello et al. (2009) Subject (CS) | 140 | 464 | 7 | Classify 7 subjects: (20, 20, 20, 20, 20, 20, 20) |
| Fierer et al. (2010) Subject (FS) | 104 | 294 | 3 | Classify 3 subjects: (40, 33, 31) |
| Fierer et al. (2010) Subject x Hand (FSH) | 98 | 294 | 6 | Classify by subject and left/right hand: (20, 18, 17, 14, 16, 13) |
| Classification of disease states | | | | |
| Inflammatory Bowel Disease (IBD) | 1025 | 1025 | 2 | Classify disease states: normal (500), IBD (500) |
| Pei et al. (2013) Diagnosis (PDX) | 200 | 5955 | 4 | Classify disease states: normal (28), reflux esophagitis (36), Barrett’s esophagus (84), esophageal adenocarcinoma (52) |

  1. We consider three categories of classification task: body sites, subjects, and disease states. The number of samples in each class is given in parentheses. The number of features equals the number of distinct operational taxonomic units (OTUs, i.e., microbes).
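
To make the parenthesized per-class counts easier to work with, here is a minimal Python sketch that turns a "Classification task" entry into a mapping from class name to sample count and reports each class's share of the samples. The function name `parse_class_counts` is hypothetical and not part of MetaNN; the input string is the CBH entry from the table.

```python
import re


def parse_class_counts(task: str) -> dict[str, int]:
    """Extract {class name: sample count} from a 'label: name (count), ...' string.

    Hypothetical helper for illustration only; it is not part of MetaNN.
    """
    # Keep only the text after the first colon (the list of classes).
    body = task.split(":", 1)[-1]
    # Each class is written as a name followed by its count in parentheses.
    pairs = re.findall(r"\s*([^,()]+?)\s*\((\d+)\)", body)
    return {name: int(count) for name, count in pairs}


# CBH entry, copied from the table.
cbh_task = (
    "Classify body habitats: skin (357), oral cavity (46), "
    "external auditory canal (44), hair (14), nostril (46), feces (45)"
)

counts = parse_class_counts(cbh_task)
total = sum(counts.values())

# Report the size and share of each class, largest first.
for name, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{name}: {n} samples ({n / total:.1%})")
print(f"total: {total}")
```

For the CBH row the parsed counts add up to the 552 samples reported in the table, and skin alone accounts for roughly two thirds of them. Rows that list counts without class names, such as CS, are not handled by this sketch.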