Copy Number Variation detection from 1000 Genomes project exon capture sequencing data

BMC Bioinformatics

Table 1 Properties of datasets from different sequencing centers

	SC	BCM	BI	WU
Total sample count	117	352	161	93
Sample count after quality control	106	349	110	82
Technology	Illumina	454	Illumina	Illumina
Duplicate rate	0.21	0.30	0.50	0.72
Mapping quality (mean)	50	33	45	51
Base coverage (mean ± standard deviation)	56 ± 34	23 ± 12	70 ± 61	29 ± 9
Read depth per gene (mean ± standard deviation)	2309 ± 3166	106 ± 171	1329 ± 2053	977 ± 1382
MRD (mean ± standard deviation)	1710 ± 1073	97 ± 52	1070 ± 803	599 ± 164
Number of exons	8174	8174	8174	8174
Exons overlapped with segmental duplication regions	458 (5.6%)	458 (5.6%)	458 (5.6%)	458 (5.6%)
Number of genes (passing QC)	862	439	739	1
Genes overlapped with segmental duplication regions	29 (3.3%)	11 (2.5%)	23 (3.1%)	0 (0.0%)
Over-dispersion factor (mean ± standard deviation)	7.9 ± 8.2	2.1 ± 1.1	6.4 ± 5.5	N/A
Quality index (mean ± standard deviation)	9.4 ± 8.8	5.5 ± 2.3	7.6 ± 5.6	N/A
Expected detection sensitivity based on quality index	0.46	0.20	0.41	N/A
Number of calls at h = 0.65, either with or without a neighboring call	36	4	56	N/A
Number of calls at h = 0.1, with a neighboring call	17	0	11	N/A
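
Two of the derived quantities in Table 1 can be checked with simple arithmetic: the fraction of samples each center retains after quality control, and the percentage of exons overlapping segmental duplication regions. The sketch below is illustrative only; the counts are copied from the table, while the names `samples`, `retention`, `qc_retention`, and `exon_overlap_pct` are hypothetical and not part of the original analysis.

```python
# Sample counts copied from Table 1: (total, after quality control).
samples = {
    "SC": (117, 106),
    "BCM": (352, 349),
    "BI": (161, 110),
    "WU": (93, 82),
}


def retention(total, kept):
    """Fraction of samples that remain after quality control."""
    return kept / total


# Per-center retention fraction (hypothetical helper output).
qc_retention = {center: retention(*counts) for center, counts in samples.items()}

# The exon counts are the same for every center: 458 of 8174 exons
# overlap segmental duplication regions.
exon_overlap_pct = 100 * 458 / 8174

for center, frac in qc_retention.items():
    print(f"{center}: {frac:.1%} of samples retained after QC")
print(f"Exons overlapping segmental duplications: {exon_overlap_pct:.1f}%")
```

Under these counts, BI retains the smallest share of its samples and BCM the largest, and the exon overlap agrees with the 5.6% reported in the table.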

ISSN: 1471-2105