Fig. 5 | BMC Bioinformatics

Fig. 5

From: Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells

Predictability of sequence error frequency allows for detection of spurious barcodes. a Zoomed-in example of read numbers of potential mother and daughter sequences plotted against each other. Each dot represents one (half-)sample, the dashed black line denotes the prediction based on the total frequencies of mother and daughter barcodes, and solid lines denote the 95% confidence band when assuming that errors are described by a binomial distribution (red) or a beta-binomial distribution (green). b ‘Log-likelihood score’ of presumed mother-daughter pairs as identified by visual inspection, as a function of the total read number of the daughter barcode. Each dot denotes one pair and its color denotes the number of nucleotide differences between the two sequences. The dashed line represents the threshold above which pairs are subsequently considered correct. c Result of the cleaning procedure on data with different dilutions of 19 known barcode clones (expected cell numbers per technical replicate are denoted above the panels). Dots represent the read number in each of the two replicates, and colors denote whether the barcode was a true positive, a true negative, or a false positive. Note that there are no false negatives in this simple data set. Dashed horizontal and vertical lines, and the numbers alongside them, denote the approximate number of cells to which the normalized read numbers correspond.
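To illustrate the confidence bands in panel a, the following is a minimal sketch of how a 95% band for daughter read counts could be computed under a binomial and under a beta-binomial error model. It is not the authors' implementation. It assumes, for illustration only, that the daughter reads in a sample are drawn from the mother reads in that sample with an error rate `p` equal to the ratio of total daughter to total mother reads. The function names, the overdispersion parameter `rho`, and all numeric values are hypothetical.

```python
from math import lgamma, exp, log

def log_binom_coeff(n, k):
    # log of "n choose k", computed via log-gamma for numerical stability
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def binom_pmf(k, n, p):
    # Binomial probability of k error reads out of n (requires 0 < p < 1)
    return exp(log_binom_coeff(n, k) + k * log(p) + (n - k) * log(1 - p))

def betabinom_pmf(k, n, a, b):
    # Beta-binomial probability: binomial whose rate is Beta(a, b) distributed
    return exp(log_binom_coeff(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b))

def central_interval(pmf, n, level=0.95):
    # Lower and upper quantiles of a discrete distribution on 0..n,
    # leaving (1 - level) / 2 probability in each tail
    tail = (1 - level) / 2
    cum, lo, hi = 0.0, None, n
    for k in range(n + 1):
        cum += pmf(k)
        if lo is None and cum >= tail:
            lo = k
        if cum >= 1 - tail:
            hi = k
            break
    return lo, hi

# Hypothetical inputs (assumptions, not values from the paper)
mother_reads = 10_000   # mother reads in one (half-)sample
p = 0.001               # assumed error rate: total daughter / total mother reads
rho = 0.0005            # assumed overdispersion of the beta-binomial model

# Beta parameters chosen so the mean rate is p and the intra-class correlation is rho
a = p * (1 - rho) / rho
b = (1 - p) * (1 - rho) / rho

predicted = mother_reads * p  # analogue of the dashed prediction line
bin_lo, bin_hi = central_interval(lambda k: binom_pmf(k, mother_reads, p), mother_reads)
bb_lo, bb_hi = central_interval(lambda k: betabinom_pmf(k, mother_reads, a, b), mother_reads)

print("predicted daughter reads:", predicted)
print("binomial 95% band:", (bin_lo, bin_hi))
print("beta-binomial 95% band:", (bb_lo, bb_hi))
```

With these assumed values the beta-binomial band is wider than the binomial band, because the extra parameter lets the error rate vary between samples. Repeating the calculation over a range of mother read numbers would trace out bands of the kind shown as solid lines in the panel.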
