Table 3 Summary of algorithm performance on JGI FAMeS data.

From: Unsupervised statistical clustering of environmental shotgun sequences

| FAMeS identifiers | min D3 | Fragment count | Fragment length | Accuracy |
|---|---|---|---|---|
| APOW1005, PPD1199, AIBF1022, AHZI1134, AHXO1014 | 2.3451 | 500 | 400 | 0.87 |
| BCSB1222, ABFI1048, AHYP1295, AKNK1296, AAZH3626 | 1.9598 | 500 | 400 | 0.69 |
| AHYT1136, AHYI1010, PIT10099, AINZ1029, AHZF1044 | 1.9314 | 500 | 400 | 0.85 |
| PPD1199, AUNI1013, ABSU1031, AABS2846, AHXO1014 | 1.8881 | 500 | 400 | 0.89 |
| AOTU1003, BCSB1222, AIOH1083, AIFS1040, AHXX1063 | 1.8032 | 500 | 400 | 0.86 |
| BCSB1222, VNY1182, AHXF1121, AKNK1296, AHZI1134 | 1.3563 | 500 | 400 | 0.81 |
| KPY1561, AOTY1222, BAHF1005, POG1025, AAOP1172 | 1.2429 | 500 | 400 | 0.79 |
| BCSB1222, AADD1003, AUNI1013, KPR1102, AHXO1014 | 1.1571 | 500 | 400 | 0.87 |
| AICI1287, AAOO1711, AKNK1296, AHXX1063, KPR1102 | 1.0279 | 500 | 400 | 0.72 |
| AHYT1136, AAWX1070, WBJ1361, AIAI1092, AXBY1147 | 0.9987 | 500 | 400 | 0.65 |
| AICI1287, AHYT1136, AAWX1070, AADE1259, AINZ1029 | 0.9856 | 500 | 400 | 0.72 |
| AUSC1572, AHYF1232, AAON1449, AIAX1019, ACBK1133 | 0.8884 | 500 | 400 | 0.78 |
| Average (12 trials, 5 sources, L = 400) | 1.46 | 500 | 400 | 0.79 |
  1. Random subsets of 5 sources each were selected from the FAMeS simLC dataset, with genomic fragment divergence, D3, as shown. Fragments were truncated to the indicated length where appropriate. Reads from the dataset were used raw, without trimming.
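
As a sanity check on the Average row, the per-trial values can be averaged directly. The sketch below is illustrative only: the names `trials` and `column_means` are hypothetical and not from the paper, and the numbers are transcribed from the twelve trial rows above. The mean accuracy comes out at about 0.79, matching the table. The mean min D3 comes out at about 1.465, which agrees with the reported 1.46 only if that figure was truncated rather than rounded (an assumption, since the source does not say).

```python
# (min D3, accuracy) pairs transcribed from the 12 trial rows of Table 3.
trials = [
    (2.3451, 0.87),
    (1.9598, 0.69),
    (1.9314, 0.85),
    (1.8881, 0.89),
    (1.8032, 0.86),
    (1.3563, 0.81),
    (1.2429, 0.79),
    (1.1571, 0.87),
    (1.0279, 0.72),
    (0.9987, 0.65),
    (0.9856, 0.72),
    (0.8884, 0.78),
]


def column_means(rows):
    """Return the arithmetic mean of each column in a list of equal-length tuples."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


mean_d3, mean_accuracy = column_means(trials)
print(f"mean min D3:   {mean_d3:.4f}")
print(f"mean accuracy: {mean_accuracy:.4f}")
```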