A permutation-based method to identify loss-of-heterozygosity using paired genotype microarray data
© Pounds et al; licensee BioMed Central Ltd. 2008
Published: 8 July 2008
Background
SNP genotyping microarrays may be used to detect regions of loss-of-heterozygosity (LOH). Genotype array data are collected for tumor tissue and germline tissue samples from each subject. For each subject, an initial call of LOH or non-LOH is generated for each marker via straightforward comparison of the genotype calls across each tissue sample pair. Because the genotype calls are generated with some error, statistical models are used to analyze the pattern of LOH calls and infer regions of LOH for each subject.
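The initial per-marker call can be illustrated with a minimal sketch. It assumes the common convention that a marker is called LOH when the germline genotype is heterozygous and the tumor genotype is homozygous. The genotype codes (`"AA"`, `"AB"`, `"BB"`) and the function name `initial_loh_calls` are hypothetical, and every other combination (including no-calls) is treated here as non-LOH for simplicity.

```python
from typing import List

HET = "AB"              # assumed code for a heterozygous genotype call
HOM = ("AA", "BB")      # assumed codes for homozygous genotype calls


def initial_loh_calls(germline: List[str], tumor: List[str]) -> List[bool]:
    """Return True at each marker that is heterozygous in germline tissue
    but homozygous in tumor tissue (an initial LOH call), else False."""
    if len(germline) != len(tumor):
        raise ValueError("germline and tumor must cover the same markers")
    return [g == HET and t in HOM for g, t in zip(germline, tumor)]


# Hypothetical paired genotype calls for five markers of one subject.
germline = ["AB", "AA", "AB", "BB", "AB"]
tumor = ["AA", "AA", "AB", "BB", "BB"]
calls = initial_loh_calls(germline, tumor)
```

Because a single genotyping error can flip one of these calls, the calls are only a starting point for the region-level inference.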
Materials and methods
We propose call-based segmentation analysis (CBSA) as a permutation-based method to infer regions of LOH from this type of data. Chromosome endpoints and the positions of markers with initial LOH calls are used to divide the genomes of study subjects into a series of distinct segments that are indexed by subject and location. The size of each segment is measured by the number of non-LOH calls it contains.
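The segmentation step can be sketched as follows for one subject and one chromosome. This is an illustration under stated assumptions, not the authors' implementation: calls are given as booleans in marker order, the chromosome endpoints are represented by the hypothetical boundary indices `-1` and `len(calls)`, and the function name `segments` is invented for this example.

```python
from typing import List, Tuple


def segments(calls: List[bool]) -> List[Tuple[int, int, int]]:
    """Split one chromosome into segments bounded by LOH calls.

    Each segment is (left_boundary, right_boundary, size), where the
    boundaries are marker indices of LOH calls (or -1 / len(calls) for
    the chromosome endpoints) and size is the number of non-LOH calls
    strictly between the two boundaries.
    """
    out = []
    left = -1   # chromosome start acts as the first boundary
    run = 0     # non-LOH calls seen since the last boundary
    for i, is_loh in enumerate(calls):
        if is_loh:
            out.append((left, i, run))
            left, run = i, 0
        else:
            run += 1
    out.append((left, len(calls), run))  # close at the chromosome end
    return out


# Hypothetical calls for eight markers: LOH at indices 1, 2 and 6.
example = [False, True, True, False, False, False, True, False]
segs = segments(example)
# segs == [(-1, 1, 1), (1, 2, 0), (2, 6, 3), (6, 8, 1)]
```

A chromosome with k LOH calls yields k + 1 segments, and small sizes correspond to LOH calls that sit close together.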
CBSA performs a permutation test to determine whether a segment has significantly fewer non-LOH calls than expected by chance. Permuting the assignment of initial LOH calls to subject and genomic position generates an empirical null distribution of segment size for computing p-values. In practice, p-values may be computed with a very accurate analytical approximation of the permutation distribution.
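A Monte Carlo version of this permutation test might look like the sketch below. It is a simplified illustration, not the published procedure: each subject is assumed to have a single chromosome with the same markers, the analytical approximation mentioned above is not used, and the names `segment_sizes`, `permutation_pvalues`, `n_perm` and `seed` are hypothetical.

```python
import random
from bisect import bisect_right
from typing import List


def segment_sizes(calls: List[bool]) -> List[int]:
    """Number of non-LOH calls in each segment bounded by LOH calls
    and by the two chromosome endpoints."""
    sizes, run = [], 0
    for is_loh in calls:
        if is_loh:
            sizes.append(run)
            run = 0
        else:
            run += 1
    sizes.append(run)
    return sizes


def permutation_pvalues(call_matrix: List[List[bool]],
                        n_perm: int = 200,
                        seed: int = 0) -> List[List[float]]:
    """Lower-tail p-value for every observed segment size.

    Calls are shuffled across all subjects and marker positions; the
    pooled segment sizes from the shuffled data form the null sample.
    """
    rng = random.Random(seed)
    n_mark = len(call_matrix[0])
    observed = [segment_sizes(row) for row in call_matrix]
    flat = [c for row in call_matrix for c in row]
    null = []
    for _ in range(n_perm):
        rng.shuffle(flat)  # reassign calls to subject and position
        for start in range(0, len(flat), n_mark):
            null.extend(segment_sizes(flat[start:start + n_mark]))
    null.sort()
    # p = (1 + #{null sizes <= observed size}) / (1 + #null sizes)
    return [[(1 + bisect_right(null, s)) / (1 + len(null)) for s in row]
            for row in observed]


# Hypothetical data: three subjects, twenty markers, one LOH cluster.
calls = [
    [False] * 20,
    [False] * 5 + [True] * 6 + [False] * 9,
    [False] * 20,
]
pvals = permutation_pvalues(calls, n_perm=500, seed=1)
```

The add-one correction keeps every p-value strictly positive, which matters when the p-values are passed on to an FDR procedure.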
Next, the false discovery rate (FDR) is estimated with a robust method. Finally, each segment defined by the observed positions of LOH calls has a size, p-value, and FDR estimate associated with it. Each segment with an FDR estimate below a selected threshold is inferred to be a segment of LOH. Mathematical proofs establish that the FDR estimate is conservative, i.e., the estimated FDR is expected to be greater than the actual FDR.
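The thresholding step can be sketched with a generic step-up FDR estimate. This is not necessarily the cited robust estimator: the sketch uses a Benjamini-Hochberg-style estimate with a plug-in null proportion of min(1, 2 × mean p-value), and that plug-in, the function name `fdr_estimates`, and the example p-values are all assumptions made for illustration.

```python
from typing import List, Optional


def fdr_estimates(pvalues: List[float],
                  pi0: Optional[float] = None) -> List[float]:
    """Monotone step-up FDR estimate for each p-value.

    pi0 is the assumed proportion of true nulls; when omitted it is
    set to min(1, 2 * mean(p)), an assumption of this sketch.
    """
    m = len(pvalues)
    if m == 0:
        return []
    if pi0 is None:
        pi0 = min(1.0, 2.0 * sum(pvalues) / m)
    order = sorted(range(m), key=lambda i: pvalues[i])
    fdr = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so estimates never increase
    # as the p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * pvalues[i] * m / rank)
        fdr[i] = running_min
    return fdr


# Hypothetical segment p-values and a 10% FDR threshold.
segment_p = [0.001, 0.002, 0.5, 0.9, 0.04]
fdr = fdr_estimates(segment_p)
loh_segments = [i for i, f in enumerate(fdr) if f < 0.10]
```

In this example the segments at indices 0, 1 and 4 fall below the threshold and would be inferred to be segments of LOH.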
Results
In our study of LOH in secondary leukemia, we applied CBSA at an estimated FDR threshold of 10%. CBSA showed sensitivity similar to or greater than that of dChip SNP in detecting LOH on each chromosome with one-copy loss according to cytogenetics. Additionally, CBSA was robust against poor-quality data: after exclusion of two subjects with poor-quality data, CBSA inferences for the remaining eleven subjects were concordant with the original CBSA inferences at 99.6% of all markers.
Conclusion
CBSA is a practically useful method for detecting LOH. CBSA is conceptually simple, computationally efficient, statistically sound, and robust. Furthermore, CBSA may be a more powerful method than dChip SNP for some studies.
- Lin M, Wei L-J, Sellers WR, Lieberfarb M, Wong WH, Li C: dChipSNP: significance curve and clustering of SNP-array based loss-of-heterozygosity data. Bioinformatics 2004, 20: 1233–1240.
- Pyke R: Spacings. J Roy Stat Soc B 1965, 27: 395–449.
- Pounds S, Cheng C: Robust estimation of the false discovery rate. Bioinformatics 2006, 22: 1979–1987.
- Hartford C, Yang W, Cheng C, Fan Y, Liu W, Trevino L, Pounds S, Neale G, Raimondi SC, Bogni A, Dolan ME, Pui C-H, Relling MV: Genome scan implicates adhesion biological pathways in secondary leukemia. Leukemia 2007, 21: 2128–2136.
- Raimondi SC, Mathew S, Pui C-H: Cytogenetics as a diagnostic aid for childhood hematologic disorders: conventional cytogenetic techniques, fluorescence in situ hybridization, and comparative genomic hybridization. In Tumor Marker Protocols. Methods in Molecular Medicine. Edited by: Hanausek M, Walaszek Z. Totowa, NJ: Humana Press; 1998:209–227.