Fig. 1 | BMC Bioinformatics

Fig. 1

From: Batch effect detection and correction in RNA-seq data using machine-learning-based automated assessment of quality

Workflow. Black boxes show components of the overall workflow. DeriveFeatures is a component that uses four bioinformatic tools to derive four feature sets (RAW, MAP, LOC, TSS) from the FASTQ files (.fastq). seqQscorer computes P_low, the probability that a sample is of low quality. We used seqQscorer’s generic model, which is derived from 2642 labeled samples and uses a random forest as its classification algorithm. We used the salmon tool to quantify gene expression and DESeq2 for rlog normalization [19, 20]
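
As a rough illustration of how a random forest classifier can turn per-sample features into a low-quality probability like the one described in the caption, the following minimal sketch computes the fraction of trees that vote "low quality". This is not seqQscorer's implementation: the one-split trees, the feature names (`map_rate`, `tss_fraction`, `raw_mean_quality`) and all thresholds are hypothetical and chosen only for the example.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]
Tree = Callable[[Sample], int]  # returns 1 for "low quality", 0 otherwise


def make_stump(feature: str, threshold: float) -> Tree:
    """Build a hypothetical one-split tree that votes 'low quality'
    when the given feature falls below the threshold."""
    def tree(sample: Sample) -> int:
        return 1 if sample[feature] < threshold else 0
    return tree


def p_low(forest: List[Tree], sample: Sample) -> float:
    """Return the fraction of trees in the forest voting 'low quality'."""
    return sum(tree(sample) for tree in forest) / len(forest)


# Hypothetical forest and sample; none of these values come from the paper.
forest = [
    make_stump("map_rate", 0.7),
    make_stump("map_rate", 0.5),
    make_stump("tss_fraction", 0.1),
    make_stump("raw_mean_quality", 25.0),
]
sample = {"map_rate": 0.6, "tss_fraction": 0.2, "raw_mean_quality": 30.0}

plow = p_low(forest, sample)
print(f"Low-quality probability for this sample: {plow:.2f}")
```

In a real forest each tree is learned from labeled samples rather than written by hand, and libraries commonly average per-tree class probabilities instead of counting hard votes; the vote fraction is used here only to keep the sketch short.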
