Eq. (5): The indicative probability *PI* is the probability that the number of positive pools exceeds the sum of the numbers of unresolved negative samples and real positive samples. If *N*_{p} is the number of positive pools,

is the number of unresolved negative samples, and

*d* is the number of positive samples, then *PI* can be written as A(1).

where *p_min* and *p_max* are the minimum and maximum number of positive pools, respectively.

Because *N*_{p} = *i* indicates that there are *t* - *i* negative pools, *P*(*N*_{p} = *i*) can be formulated as A(2). Because

=

,

can be formulated as A(3). After combining A(1)–A(3), *PI* can be formulated as A(4).

Eq. (7) and Eq. (9): These equations define the probabilities that *N*_{v} reads containing variants are observed in a negative pool (*P*_{nv}(*N*_{v})) and that *N*_{n} reads without variants are observed in a negative pool (*P*_{nn}(*N*_{n})), respectively. Briefly, *P*_{nv}(*N*_{v}) can be written as A(5).

where *P*(*i*) is the probability that *i* reads are obtained, and *P*_{e}(*N*_{v}|*i*) is the probability that *N*_{v} errors occur among these *i* reads. Because the depth follows a negative binomial distribution and sequencing errors follow a binomial distribution, these two probabilities can be formulated as A(6) and A(7). In A(6), *D* and *r* are the mean depth of coverage for pooled sequencing and the variance/mean ratio, respectively. In A(7), *p*_{error} is the mean sequencing error rate.

Substituting A(6) and A(7) into A(5) gives the formulation of *P*_{nv}(*N*_{v}) shown in A(8).
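The mixture described for A(5)–A(7) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the function names, the truncation of the sum at `max_depth`, and the mapping of *D* and *r* onto the negative binomial parameters (success probability 1/*r*, size *D*/(*r* - 1), which requires *r* > 1) are assumptions made here for the example.

```python
import math


def depth_pmf(i, mean_depth, var_mean_ratio):
    """P(i): negative binomial probability of obtaining i reads.

    Assumed parameterization: mean = mean_depth (D) and
    variance/mean = var_mean_ratio (r), which requires r > 1.
    """
    q = 1.0 / var_mean_ratio                 # NB success probability
    n = mean_depth / (var_mean_ratio - 1.0)  # NB size parameter
    log_p = (math.lgamma(i + n) - math.lgamma(n) - math.lgamma(i + 1)
             + n * math.log(q) + i * math.log1p(-q))
    return math.exp(log_p)


def binom_pmf(k, i, p):
    """Binomial probability of k events among i reads (0 < p < 1)."""
    if k < 0 or k > i:
        return 0.0
    log_p = (math.lgamma(i + 1) - math.lgamma(k + 1) - math.lgamma(i - k + 1)
             + k * math.log(p) + (i - k) * math.log1p(-p))
    return math.exp(log_p)


def p_nv(n_v, mean_depth, var_mean_ratio, p_error, max_depth=None):
    """Sum over depths i of P(i) * P_e(n_v | i), truncated at max_depth."""
    if max_depth is None:
        # Heuristic cutoff (an assumption): mean + 20 standard deviations + margin.
        sd = math.sqrt(var_mean_ratio * mean_depth)
        max_depth = int(mean_depth + 20 * sd) + 50
    return sum(depth_pmf(i, mean_depth, var_mean_ratio) * binom_pmf(n_v, i, p_error)
               for i in range(n_v, max_depth + 1))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for n_v in range(4):
        print(n_v, p_nv(n_v, mean_depth=30.0, var_mean_ratio=2.0, p_error=0.01))
```

Under the same assumptions, the probability of observing *N*_{n} reads without variants could be sketched by calling `binom_pmf` with 1 - *p*_{error} in place of *p*_{error}.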

The derivation of the formula for *P*_{nn}(*N*_{n}) (A(9)) is similar to the derivation for *P*_{nv}(*N*_{v}).

Eq. (8) and Eq. (10): These equations define the probabilities that *N*_{v} reads containing variants are observed in a positive pool (*P*_{pv}(*N*_{v})) and that *N*_{n} reads without variants are observed in a positive pool (*P*_{pn}(*N*_{n})), respectively. The observations of a variant in a positive pool consist of two parts: real variants from variant chromosomes, and false variants resulting from sequencing errors. Briefly, *P*_{pv}(*N*_{v}) can be written as A(10), where *P*_{N}(*x*) is the probability that *x* reads containing variants stem from the sequencing of normal chromosomes, and *P*_{P}(*N*_{v} - *x*) is the probability that *N*_{v} - *x* reads containing variants stem from variant chromosomes.

By applying a procedure similar to the one used to obtain A(8) and A(9), *P*_{N}(*x*) and *P*_{P}(*N*_{v} - *x*) can be formulated as A(11) and A(12). The only difference is the mean sequencing depth of coverage. Because the fractions of variant chromosomes and normal chromosomes are *p* and 1 - *p*, respectively, the mean depths of coverage for sequencing variant chromosomes and normal chromosomes are *pD* and (1 - *p*)*D*, respectively.

In the same way, *P*_{pv}(*N*_{v}) can be obtained by combining A(10)–A(12), as shown in A(13).
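The combination of A(10)–A(12) amounts to a discrete convolution of two read-count distributions, which can be sketched as below. This is a hedged illustration, not the authors' code: all function names are hypothetical, the negative binomial is parameterized as mean and variance/mean ratio (*r* > 1), the sums are truncated heuristically, and variant reads from variant chromosomes are assumed to be the error-free reads (probability 1 - *p*_{error} per read).

```python
import math


def depth_pmf(i, mean_depth, var_mean_ratio):
    """Negative binomial P(i) with mean D and variance/mean ratio r > 1."""
    q = 1.0 / var_mean_ratio
    n = mean_depth / (var_mean_ratio - 1.0)
    log_p = (math.lgamma(i + n) - math.lgamma(n) - math.lgamma(i + 1)
             + n * math.log(q) + i * math.log1p(-q))
    return math.exp(log_p)


def binom_pmf(k, i, p):
    """Binomial probability of k events among i reads (0 < p < 1)."""
    if k < 0 or k > i:
        return 0.0
    log_p = (math.lgamma(i + 1) - math.lgamma(k + 1) - math.lgamma(i - k + 1)
             + k * math.log(p) + (i - k) * math.log1p(-p))
    return math.exp(log_p)


def read_count_pmf(max_count, mean_depth, var_mean_ratio, p_event):
    """P(k), k = 0..max_count: NB depth mixed with a per-read binomial event."""
    sd = math.sqrt(var_mean_ratio * mean_depth)
    max_depth = int(mean_depth + 20 * sd) + 50  # heuristic truncation (assumption)
    depth = [depth_pmf(i, mean_depth, var_mean_ratio) for i in range(max_depth + 1)]
    return [sum(depth[i] * binom_pmf(k, i, p_event) for i in range(k, max_depth + 1))
            for k in range(max_count + 1)]


def p_pv(max_count, mean_depth, var_mean_ratio, p_error, p_variant):
    """P_pv(n_v) for n_v = 0..max_count in a positive pool."""
    # False variants: sequencing errors on normal chromosomes, mean depth (1 - p) * D.
    p_n = read_count_pmf(max_count, (1.0 - p_variant) * mean_depth,
                         var_mean_ratio, p_error)
    # Real variants: error-free reads of variant chromosomes, mean depth p * D
    # (the use of 1 - p_error here is an assumption of this sketch).
    p_p = read_count_pmf(max_count, p_variant * mean_depth,
                         var_mean_ratio, 1.0 - p_error)
    # Convolution over every split x + (n_v - x) = n_v.
    return [sum(p_n[x] * p_p[n_v - x] for x in range(n_v + 1))
            for n_v in range(max_count + 1)]


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    pmf = p_pv(20, mean_depth=30.0, var_mean_ratio=2.0, p_error=0.01, p_variant=0.1)
    for n_v, prob in enumerate(pmf[:6]):
        print(n_v, prob)
```

Precomputing the two component distributions once and then convolving them avoids re-evaluating the depth sum for every split of *N*_{v}.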

Similarly, *P*_{pn}(*N*_{n}) can be obtained as shown in A(14).