Fig. 1 | BMC Bioinformatics

Fig. 1

From: Barcode identification for single cell genomics

A strategy to use k-mer counting to identify sequence barcodes. a Circularizing barcodes ensures robustness against single mismatches. An example sequence ‘BARCODE’ contains an error (highlighted in red). When the barcode sequence is short relative to k, all k-mers from this sequence will contain the mutated base. Circularizing the sequence (bottom) ensures that some k-mers from the sequence are error-free, regardless of the position of the error. b An example circular k-mer graph containing one barcode. Error-containing reads were simulated from a ground-truth barcode. Reads were circularized and k-mers were counted. The resulting k-mer graph is plotted here. Nodes in this graph are represented as gray dots, and edges as blue lines. Edge weights are represented by shading (dark = high edge weight). Despite a fairly high error rate (Poisson, 3 errors per 12-nucleotide barcode), the true barcode path is visually discernible with a modest number of reads. c An example circular k-mer graph containing three barcodes. Same as above
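
The idea in panel a can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`linear_kmers`, `circular_kmers`, `count_circular_kmers`), the choice of k = 4, and the substituted character `X` standing in for the erroneous base are all assumptions made for illustration. The sketch compares the k-mers of an error-containing read with those of the true sequence, first treating the read as linear and then as circular.

```python
from collections import Counter


def linear_kmers(seq, k):
    """Return all k-mers of seq read as a linear string."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def circular_kmers(seq, k):
    """Return one k-mer per start position, wrapping around the end of seq."""
    if k > len(seq):
        raise ValueError("this sketch assumes k <= len(seq)")
    extended = seq + seq[:k - 1]  # append the prefix so k-mers can wrap
    return [extended[i:i + k] for i in range(len(seq))]


def count_circular_kmers(reads, k):
    """Count circular k-mers across a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(circular_kmers(read, k))
    return counts


# Hypothetical example: 'BARCODE' with one substitution in the middle.
truth = "BARCODE"
read = "BARXODE"
k = 4  # assumed value; long relative to the 7-character sequence

# Linear: every k-mer of the read overlaps the error.
shared_linear = set(linear_kmers(read, k)) & set(linear_kmers(truth, k))

# Circular: k-mers that wrap around the end avoid the error.
shared_circular = set(circular_kmers(read, k)) & set(circular_kmers(truth, k))

print(sorted(shared_linear))    # []
print(sorted(shared_circular))  # ['DEBA', 'EBAR', 'ODEB']
```

In this toy case the linear read shares no k-mers with the true sequence, while the circularized read shares three. Summing such counts over many reads with `count_circular_kmers` gives k-mer tallies of the kind that panels b and c describe, where frequently observed k-mers correspond to the true barcode path.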
