Figure 2
From: Enhancing the detection of barcoded reads in high throughput DNA sequencing data by controlling the false discovery rate

Similarities between 10 million random subsequences of the Mus musculus DNA database and barcode sets of different sizes and barcode lengths. (A) Distribution of minimal distances $f(\delta)$ between 150 [8,3] barcodes and subsequences. The separation by a naive threshold $\delta_t = 1$ is illustrated by a vertical dashed line. (B) Falsely detected subsequences as a proportion of all tested subsequences, based on a threshold of $\delta_t = d_{SL}(\text{barcode}, \text{subsequence}) \le 1$, for different sizes of the barcode set and barcodes of length 10 nt, 11 nt and 12 nt. (C) Falsely detected subsequences as a proportion of all tested subsequences, based on a threshold of $\delta_t = d_{SL}(\text{barcode}, \text{subsequence}) \le 1$, for different barcode lengths.
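
The quantity plotted in panels (B) and (C) can be illustrated with a minimal sketch. It is not the authors' pipeline and rests on several assumptions: plain Hamming distance stands in for the distance d_SL named in the caption, uniformly random strings stand in for both the Mus musculus subsequences and the [8,3] barcode set, and all function names (`hamming`, `min_distance`, `falsely_detected_proportion`, `random_dna`) are hypothetical.

```python
import random


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences.

    Assumption: Hamming distance is used here as a simple stand-in for
    the d_SL distance named in the caption.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(subseq: str, barcodes: list[str]) -> int:
    """Minimal distance between one subsequence and any barcode in the set."""
    return min(hamming(subseq, bc) for bc in barcodes)


def falsely_detected_proportion(
    subseqs: list[str], barcodes: list[str], threshold: int = 1
) -> float:
    """Fraction of subsequences within `threshold` of at least one barcode."""
    hits = sum(1 for s in subseqs if min_distance(s, barcodes) <= threshold)
    return hits / len(subseqs)


def random_dna(length: int, rng: random.Random) -> str:
    """Uniformly random DNA string (illustrative data only)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumption: random 8 nt strings, not a real error-correcting barcode set.
    barcodes = [random_dna(8, rng) for _ in range(150)]
    # Assumption: random strings instead of genomic subsequences.
    subseqs = [random_dna(8, rng) for _ in range(2000)]
    print(falsely_detected_proportion(subseqs, barcodes, threshold=1))
```

Repeating the last call for barcode sets of different sizes or lengths gives curves of the kind described for panels (B) and (C), although the values will differ from the published ones because of the substitutions noted above.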