Table 1 Results for exact and approximate tag sequence matching

From: TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

Tag sequence  Library  Reads matching with # mismatches
                       0               1               2               3               4               5               > 5
5'-end        LIB019   38,271 (89.37)  2,253 (5.26)    564 (1.32)      185 (0.43)      50 (0.12)       64 (0.15)       1,438 (3.36)
              LIB020   14,491 (84.60)  1,629 (9.51)    430 (2.51)      165 (0.96)      31 (0.18)       24 (0.14)       359 (2.10)
              LIB021   41,764 (84.74)  4,748 (9.63)    1,345 (2.73)    427 (0.87)      125 (0.25)      111 (0.23)      762 (1.55)
3'-end        LIB019   7,194 (16.80)   12,156 (28.39)  2,454 (5.73)    688 (1.61)      683 (1.59)      766 (1.79)      18,884 (44.10)
              LIB020   2,855 (16.67)   2,460 (14.36)   561 (3.28)      279 (1.63)      275 (1.61)      904 (5.28)      9,795 (57.18)
              LIB021   7,981 (16.19)   6,924 (14.05)   1,800 (3.65)    942 (1.91)      908 (1.84)      2,480 (5.03)    28,247 (57.32)
Concatenated  LIB019   931 (2.17)      282 (0.66)      132 (0.31)      51 (0.12)       104 (0.24)      32 (0.07)       -
              LIB020   185 (1.08)      45 (0.26)       19 (0.11)       12 (0.07)       17 (0.10)       8 (0.05)        -
              LIB021   1,302 (2.64)    464 (0.94)      215 (0.44)      120 (0.24)      135 (0.27)      30 (0.06)       -
  1. Results for the 5'-end tag sequence (5'-GTG GTG TGT TGG GTG TGT TTG GNN NNN NNN N; length: 31 bp; matching within 46 bp), the 3'-end tag sequence (NNN NNN NNN CCA AAC ACA CCC AAC ACA CCA-3'; length: 30 bp; matching within 45 bp), and the concatenated tag sequences (length: 61 bp). The numbers are based on the dereplicated datasets. Percentages are shown in parentheses.
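
To illustrate how a read could be assigned to one of the mismatch columns above, the following is a minimal Python sketch. It is not TagCleaner's implementation. It assumes that only substitutions are counted (no insertions or deletions), that each N in the tag matches any base, and that the tag may start at any position that keeps it entirely inside the matching window (46 bp for the 5'-end tag). The names `mismatches`, `best_5prime_match` and `TAG_5` are hypothetical and chosen for this example only.

```python
# Sketch only: substitution-only matching with N as a wildcard (assumption).

def mismatches(tag, segment):
    """Count positions where segment differs from tag; 'N' in the tag matches any base."""
    return sum(1 for t, s in zip(tag, segment) if t != "N" and t != s)


def best_5prime_match(read, tag, window):
    """Return the fewest mismatches over all placements of the tag that fit
    inside the first `window` bases of the read, or None if the tag does not fit."""
    region = read[:window]
    if len(region) < len(tag):
        return None
    last_start = len(region) - len(tag)
    return min(
        mismatches(tag, region[start:start + len(tag)])
        for start in range(last_start + 1)
    )


# 5'-end tag from the table footnote: 22 fixed bases followed by 9 N positions (31 bp).
TAG_5 = "GTGGTGTGTTGGGTGTGTTTGG" + "N" * 9

if __name__ == "__main__":
    # Made-up read for illustration: the tag core, 9 arbitrary bases, then filler.
    read = "GTGGTGTGTTGGGTGTGTTTGG" + "ACGTACGTA" + "T" * 20
    print(best_5prime_match(read, TAG_5, window=46))
```

A read whose returned count is 0 to 5 would fall in the corresponding column, and anything larger in the "> 5" column. A 3'-end variant would apply the same idea to the last 45 bases of the read.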