Table 1 Numbers of heterozygous positions correctly identified, and numbers of false-positive predictions from low-copy-number repeats, expressed as a percentage of the identified SNPs.

From: Calling SNPs without a reference sequence

| λ \ x | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|
| 0.5 | 473/54.8% | 552/65.0% | 561/68.3% | 561/69.0% | 561/69.1% |
| 1.0 | 4,598/28.6% | 6,131/37.9% | 6,450/43.5% | 6,501/45.9% | 6,508/46.7% |
| 1.5 | 14,119/14.9% | 21,179/21.4% | 23,386/26.8% | 23,915/30.3% | 24,021/32.0% |
| 2.0 | 27,067/7.8% | 45,111/11.8% | 52,630/16.0% | 55,036/19.5% | 55,675/21.8% |
| 2.5 | 40,080/4.1% | 73,481/6.5% | 90,877/9.4% | 97,835/12.2% | 100,145/14.6% |
  1. The values are given as a function of fold coverage (λ, row labels) and the upper bound on the number of overlapping reads (x, column labels). For instance, at λ = 1.0 and x = 5, there are 6,131 correct SNP calls and 37.9% as many duplication-induced erroneous ones. This is a theoretical analysis based on informally fitting a model (see text) to data from the genome of Dr. James Watson.
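
As a reading aid, the arithmetic behind each cell can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the helper names `parse_cell` and `estimated_false_positives` are hypothetical, and the sketch assumes the percentage is taken relative to the number of correct calls, as the footnote's example ("37.9% as many") suggests.

```python
def parse_cell(cell: str) -> tuple[int, float]:
    """Split a table cell such as '6,131/37.9%' into
    (correct SNP calls, false-positive ratio relative to correct calls)."""
    count_text, percent_text = cell.split("/")
    correct = int(count_text.replace(",", ""))        # drop thousands separators
    ratio = float(percent_text.rstrip("%")) / 100.0   # '37.9%' -> 0.379
    return correct, ratio


def estimated_false_positives(cell: str) -> int:
    """Approximate count of duplication-induced false positives.

    Assumption: the percentage is a fraction of the correct calls.
    """
    correct, ratio = parse_cell(cell)
    return round(correct * ratio)


# The footnote's example: lambda = 1.0, x = 5
cell = "6,131/37.9%"
correct, ratio = parse_cell(cell)
print(correct)                          # 6131 correct calls
print(estimated_false_positives(cell))  # about 2,324 false positives
```

Under this reading, the cell for λ = 1.0 and x = 5 corresponds to roughly 2,324 erroneous calls alongside the 6,131 correct ones. Because the published percentages are rounded, counts recovered this way are approximate.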