
Table 4 Results for the MetaPhyler simulated metagenomic data set (73,086 sequences, 300 bp)

From: A comparative evaluation of sequence classification programs

 

| | actual | CARMA | MEGAN | MetaPhyler | MG-RAST |
|---|---|---|---|---|---|
| percentage of sequence classified | | 93.6 | 88.2 | 80.9 | 29.8 |
| Proteobacteria | 47.0 | 47.6 | 44.5 | 48.3 | 46.7 |
| Firmicutes | 21.9 | 22.2 | 24.0 | 21.8 | 23.1 |
| Actinobacteria | 9.7 | 8.7 | 8.8 | 9.1 | 9.3 |
| Bacteroidetes | 4.8 | 4.5 | 4.8 | 4.3 | 4.4 |
| Cyanobacteria | 3.9 | 3.6 | 3.8 | 3.9 | 3.7 |
| Tenericutes | 2.2 | 2.5 | 2.7 | 2.4 | 2.3 |
| Spirochaetes | 1.9 | 2.4 | 2.6 | 2.3 | 2.2 |
| Chlamydiae | 1.3 | 1.9 | 2.0 | 1.8 | 1.8 |
| Thermotogae | 0.9 | 1.2 | 1.2 | 1.1 | 1.2 |
| Chlorobi | 0.9 | 1.4 | 1.5 | 1.3 | 1.4 |
| percentage of sequence misclassified | | 0.3 | 0.3 | 0.3 | 0.2 |
| correlation coefficient | | ≈ 1.0 | ≈ 1.0 | ≈ 1.0 | ≈ 1.0 |

  1. The actual distribution of sequences compared to the distribution inferred by the alignment-based programs.
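
The correlation coefficient row can be approximately reproduced from the phylum percentages listed in the table. The sketch below assumes a Pearson correlation and uses only the ten phyla shown here. The table does not state which coefficient was used or whether additional phyla entered the calculation, so the helper `pearson` and the variable names are illustrative, not the authors' method.

```python
import math

# Phylum-level percentages copied from Table 4, in row order
# (Proteobacteria, Firmicutes, ..., Chlorobi).
actual = [47.0, 21.9, 9.7, 4.8, 3.9, 2.2, 1.9, 1.3, 0.9, 0.9]
inferred = {
    "CARMA":      [47.6, 22.2, 8.7, 4.5, 3.6, 2.5, 2.4, 1.9, 1.2, 1.4],
    "MEGAN":      [44.5, 24.0, 8.8, 4.8, 3.8, 2.7, 2.6, 2.0, 1.2, 1.5],
    "MetaPhyler": [48.3, 21.8, 9.1, 4.3, 3.9, 2.4, 2.3, 1.8, 1.1, 1.3],
    "MG-RAST":    [46.7, 23.1, 9.3, 4.4, 3.7, 2.3, 2.2, 1.8, 1.2, 1.4],
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed metric)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Correlation between the actual distribution and each program's inferred one.
correlations = {name: pearson(actual, values) for name, values in inferred.items()}

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
```

Under these assumptions every program's coefficient comes out close to 1, in line with the "≈ 1.0" entries, because the distribution is dominated by a few abundant phyla that all four programs estimate closely.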