Figure 1 | BMC Bioinformatics

Figure 1

From: Enhancing the detection of barcoded reads in high throughput DNA sequencing data by controlling the false discovery rate

Overview of the approach. (1) A multiplexed sequencing experiment is conducted on the PacBio SMRT platform. (2) The similarity between the obtained reads and the barcode sequences used is calculated and shown as a histogram of distances. (3) Orphaned reads and barcoded reads are simulated. The input to the orphaned-read simulation consists of fragments of the empirical reads; the input to the barcoded-read simulation consists of known barcode sequences attached to reference sequences. (4) Simulations are repeated for different parameter combinations, and the parameters are modified until the simulated data closely match the empirical data. (5) The false discovery rate is estimated from the proportions of barcoded and orphaned reads for each possible distance value. (6) An acceptable false discovery rate (e.g., 0.05) is used to choose a threshold for the highest acceptable dissimilarity between reads and barcode sequences. All reads with a greater distance to the barcodes used are discarded. (7) The remaining reads are matched with their original samples (de-multiplexing).
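
Steps (5) and (6) of the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the simulations yield, for each distance value, a count of barcoded reads and a count of orphaned reads, and it assumes a cumulative definition of the false discovery rate (among all reads with distance at most d, the fraction that are orphaned). The function names `estimate_fdr` and `choose_threshold` and the example counts are hypothetical.

```python
from itertools import accumulate


def estimate_fdr(barcoded_counts, orphaned_counts):
    """Estimate the FDR for each candidate distance threshold.

    Both arguments are lists indexed by distance value (0, 1, 2, ...),
    holding simulated read counts. Assumption: the FDR at threshold d is
    the share of orphaned reads among all reads with distance <= d.
    """
    cumulative_barcoded = accumulate(barcoded_counts)
    cumulative_orphaned = accumulate(orphaned_counts)
    fdr = []
    for barcoded, orphaned in zip(cumulative_barcoded, cumulative_orphaned):
        total = barcoded + orphaned
        # With no reads at or below this distance there are no discoveries,
        # so the FDR is taken to be 0.
        fdr.append(orphaned / total if total else 0.0)
    return fdr


def choose_threshold(fdr, target=0.05):
    """Return the largest distance whose estimated FDR is <= target.

    Returns None when no distance value satisfies the target.
    """
    best = None
    for distance, value in enumerate(fdr):
        if value <= target:
            best = distance
    return best


# Hypothetical simulated counts for distances 0..4 (illustration only).
barcoded = [50, 30, 15, 4, 1]
orphaned = [0, 1, 3, 20, 76]

fdr_per_distance = estimate_fdr(barcoded, orphaned)
threshold = choose_threshold(fdr_per_distance, target=0.05)
```

With these illustrative counts, reads at distance 2 or less would be kept and all reads with a greater distance discarded, before the remaining reads are assigned to their samples in step (7).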
