Table 1 Separating samples into processing batches

From: Risk-conscious correction of batch effects: maximising information extraction from high-throughput genomic datasets

| Design | Batch 1 | Batch 2 | Batch 3 | Batch 4 |
| --- | --- | --- | --- | --- |
| A | T1r1 + B1 | T2r1 + B2 | T3r1 + B3 | T4r1 + B4 |
| A | T1r2 + B1 | T2r2 + B2 | T3r2 + B3 | T4r2 + B4 |
| A | T1r3 + B1 | T2r3 + B2 | T3r3 + B3 | T4r3 + B4 |
| A | T1r4 + B1 | T2r4 + B2 | T3r4 + B3 | T4r4 + B4 |
| B | T1r1 + B1 | T1r2 + B2 | T1r3 + B3 | T1r4 + B4 |
| B | T2r1 + B1 | T2r2 + B2 | T2r3 + B3 | T2r4 + B4 |
| B | T3r1 + B1 | T3r2 + B2 | T3r3 + B3 | T3r4 + B4 |
| B | T4r1 + B1 | T4r2 + B2 | T4r3 + B3 | T4r4 + B4 |

  1. B denotes a batch effect, T a treatment, and r the replicate of that treatment. (A): In this design, each batch contains only one treatment, so batch and treatment effects are completely confounded. When we attempt to measure the difference between two treatments, say T1 and T2, what we actually measure is (T1 - T2) + (B1 - B2). Moreover, (B1 - B2) is typically much larger than (T1 - T2). (B): This is the optimal experimental design, in which every treatment is distributed equally across all batches. There is no confounding here, but the differences between B1, B2, B3 and B4 artificially inflate within-treatment differences and reduce the power of subsequent statistical tests.
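
The contrast between designs A and B can be checked numerically. The following is a minimal sketch, not the authors' analysis: it assumes a noise-free additive model (measurement = treatment effect + batch effect), and all effect sizes and function names are hypothetical values chosen for illustration.

```python
from statistics import mean

# Hypothetical effect sizes, chosen only for illustration (not from the paper).
treatment = {"T1": 1.0, "T2": 1.5, "T3": 2.0, "T4": 2.5}
batch = {"B1": 0.0, "B2": 3.0, "B3": -2.0, "B4": 5.0}

treatments = list(treatment)
batches = list(batch)


def design_a():
    """Design A: batch k holds all four replicates of treatment k."""
    return [(t, b) for t, b in zip(treatments, batches) for _ in range(4)]


def design_b():
    """Design B: replicate k of every treatment is processed in batch k."""
    return [(t, b) for t in treatments for b in batches]


def observed(design):
    """Assumed noise-free additive model: treatment effect + batch effect."""
    return [(t, b, treatment[t] + batch[b]) for t, b in design]


def naive_difference(samples, t_a, t_b):
    """Difference of per-treatment means, ignoring batch membership."""
    mean_a = mean(y for t, _, y in samples if t == t_a)
    mean_b = mean(y for t, _, y in samples if t == t_b)
    return mean_a - mean_b


def within_treatment_spread(samples, t_name):
    """Range of measurements for one treatment (zero without batch effects)."""
    ys = [y for t, _, y in samples if t == t_name]
    return max(ys) - min(ys)


samples_a = observed(design_a())
samples_b = observed(design_b())

true_diff = treatment["T1"] - treatment["T2"]
diff_a = naive_difference(samples_a, "T1", "T2")  # includes (B1 - B2)
diff_b = naive_difference(samples_b, "T1", "T2")  # batch effects cancel
spread_a = within_treatment_spread(samples_a, "T1")
spread_b = within_treatment_spread(samples_b, "T1")

print("true T1 - T2:", true_diff)
print("design A estimate:", diff_a, "| T1 spread:", spread_a)
print("design B estimate:", diff_b, "| T1 spread:", spread_b)
```

Under these assumed values, the design A estimate equals (T1 - T2) + (B1 - B2), while the design B estimate recovers T1 - T2 because every treatment mean contains the same average batch effect. The cost in design B is visible in the within-treatment spread, which is zero in design A but spans the full range of batch effects in design B.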