Practicality of identifying mitochondria variants from exome and RNAseq data
BMC Bioinformatics volume 16, Article number: P6 (2015)
The rapid progress in high-throughput sequencing technology has significantly enriched our capability to study mitochondria genomes. Besides mitochondria-targeted sequencing, an increasingly popular alternative is to use the off-target reads from exome sequencing to infer mitochondria genomic variants, including SNPs and heteroplasmy[1–9]. However, the effectiveness and practicality of such an approach have not been tested. Recently, RNAseq data has also been suggested as a good source for alternative data mining[10, 11], but whether mitochondria variants can be mined from it has not been studied.
Materials and methods
We designed a study that uses targeted mitochondria sequencing data as a gold standard to evaluate the practicality of SNP and heteroplasmy detection from exome sequencing and RNAseq data. Five breast cancer cell lines underwent mitochondria-targeted sequencing, exome sequencing, and RNAseq. Furthermore, we examined three mitochondria alignment strategies: 1) align all reads directly to the mitochondria genome; 2) align all reads to the nuclear genome and mitochondria genome simultaneously; 3) align all reads to the nuclear genome first, then align the remaining unaligned reads to the mitochondria genome.
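The three alignment strategies can be sketched as command sequences. The abstract does not name an aligner or any file names, so the use of BWA-MEM and samtools here, the single-end FASTQ input, and every file name are assumptions made purely for illustration; the reference FASTA files would also need to be indexed with `bwa index` beforehand.

```python
def alignment_commands(strategy, reads="reads.fq"):
    """Return shell command strings for one of the three strategies.

    All file names are hypothetical placeholders. BWA-MEM and samtools
    are assumed tools; the abstract does not say which were used.
    """
    mito = "chrM.fa"                    # mitochondria reference only
    nuclear = "nuclear.fa"              # nuclear reference only
    combined = "nuclear_plus_chrM.fa"   # both references in one FASTA

    if strategy == 1:
        # 1) Align all reads directly to the mitochondria genome.
        return [f"bwa mem {mito} {reads} > mito.sam"]
    if strategy == 2:
        # 2) Align all reads to nuclear and mitochondria genomes at once.
        return [f"bwa mem {combined} {reads} > combined.sam"]
    if strategy == 3:
        # 3) Align to the nuclear genome, then realign the leftovers.
        return [
            f"bwa mem {nuclear} {reads} > nuclear.sam",
            # SAM flag 4 marks unmapped reads; -f 4 keeps only those.
            "samtools fastq -f 4 nuclear.sam > unaligned.fq",
            f"bwa mem {mito} unaligned.fq > mito.sam",
        ]
    raise ValueError("strategy must be 1, 2, or 3")


for s in (1, 2, 3):
    print(f"Strategy {s}:")
    for cmd in alignment_commands(s):
        print("  " + cmd)
```

The sketch only prints the commands rather than running them, so it can be read without the tools installed.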
Our analyses found that exome sequencing can accurately detect mitochondria SNPs and can detect a portion of the true heteroplasmies with a reasonable false discovery rate. RNAseq data, on the other hand, had a lower detection rate for SNPs but a higher detection rate for heteroplasmy. However, its higher false discovery rate makes RNAseq a less ideal source than exome sequencing data for studying mitochondria. Furthermore, we found that aligning all reads directly to the mitochondria genome reference, or aligning all reads to the nuclear genome and mitochondria genome references simultaneously, produced the best results.
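To make the two reported quantities concrete, here is a minimal sketch of how a detection rate and a false discovery rate could be computed against a gold-standard call set. The abstract does not give its exact definitions, so the usual ones are assumed (detection rate = true positives / gold-standard variants; false discovery rate = false positives / all calls). The function name and the toy variants are hypothetical and are not data from the study.

```python
def evaluate_calls(called, truth):
    """Compare called variants with a gold-standard set.

    Variants are (position, ref, alt) tuples. Returns a pair
    (detection_rate, false_discovery_rate) under the assumed
    standard definitions.
    """
    called, truth = set(called), set(truth)
    true_pos = called & truth    # calls confirmed by the gold standard
    false_pos = called - truth   # calls absent from the gold standard
    detection_rate = len(true_pos) / len(truth) if truth else float("nan")
    fdr = len(false_pos) / len(called) if called else 0.0
    return detection_rate, fdr


# Toy data, invented for illustration only.
gold = [(73, "A", "G"), (263, "A", "G"), (750, "A", "G"), (1438, "A", "G")]
exome_calls = [(73, "A", "G"), (263, "A", "G"), (750, "A", "G"), (9999, "C", "T")]

rate, fdr = evaluate_calls(exome_calls, gold)
print(f"detection rate = {rate:.2f}, FDR = {fdr:.2f}")
# prints: detection rate = 0.75, FDR = 0.25
```

Treating each variant as a hashable tuple keeps the comparison to plain set operations; matching heteroplasmy calls would also need a tolerance on the allele fraction, which is left out here.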
Exome sequencing and RNAseq data can potentially be mined for mitochondria variants. Overall, exome sequencing yields fewer false discoveries than RNAseq for mitochondria variant detection, making it the more desirable choice. In conclusion, our study provides important guidelines for future studies that intend to use exome sequencing or RNAseq data to infer mitochondria SNPs and heteroplasmy.
1. Samuels DC, Han L, Li J, Quanghu S, Clark TA, Shyr Y, Guo Y: Finding the lost treasures in exome sequencing data. Trends Genet. 2013, 29 (10): 593-599. 10.1016/j.tig.2013.07.006.
2. Ye F, Samuels DC, Clark T, Guo Y: High-throughput sequencing in mitochondrial DNA research. Mitochondrion. 2014, 17: 157-163.
3. Picardi E, Pesole G: Mitochondrial genomes gleaned from human whole-exome sequencing. Nature Methods. 2012, 9 (6): 523-524. 10.1038/nmeth.2029.
4. Guo Y, Li J, Li CI, Shyr Y, Samuels DC: MitoSeek: extracting mitochondria information and performing high-throughput mitochondria sequencing analysis. Bioinformatics. 2013, 29 (9): 1210-1211. 10.1093/bioinformatics/btt118.
5. Dinwiddie DL, Smith LD, Miller NA, Atherton AM, Farrow EG, Strenk ME, Soden SE, Saunders CJ, Kingsmore SF: Diagnosis of mitochondrial disorders by concomitant next-generation sequencing of the exome and mitochondrial genome. Genomics. 2013, 102 (3): 148-156. 10.1016/j.ygeno.2013.04.013.
6. Falk MJ, Pierce EA, Consugar M, Xie MH, Guadalupe M, Hardy O, Rappaport EF, Wallace DC, LeProust E, Gai XW: Mitochondrial Disease Genetic Diagnostics: Optimized Whole-Exome Analysis for All MitoCarta Nuclear Genes and the Mitochondrial Genome. Discov Med. 2012, 79: 389-U140.
7. Nemeth AH, Kwasniewska AC, Lise S, Schnekenberg RP, Becker EBE, Bera KD, Shanks ME, Gregory L, Buck D, Cader MZ, Talbot K, De Silva R, Fletcher N, Hastings R, Jayawant S, Morrison PJ, Worth P, Taylor M, Tolmie J, O'Regan M, Consortium UA, Valentine R, Packham E, Evans J, Seller A, Ragoussis J: Next generation sequencing for molecular diagnosis of neurological disorders using ataxias as a model. Brain. 2013, 136: 3106-3118. 10.1093/brain/awt236.
8. Sevini F, Giuliani C, Vianello D, Giampieri E, Santoro A, Biondi F, Garagnani P, Passarino G, Luiselli D, Capri M, Franceschi C, Salvioli S: mtDNA mutations in human aging and longevity: controversies and new perspectives opened by high-throughput technologies. Exp Gerontol. 2014, 56: 234-244.
9. McMahon S, LaFramboise T: Mutational patterns in the breast cancer mitochondrial genome, with clinical correlates. Carcinogenesis. 2014, 35 (5): 1046-1054. 10.1093/carcin/bgu012.
10. Han L, Vickers KC, Samuels DC, Guo Y: Alternative applications for distinct RNA sequencing strategies. Brief Bioinform. 2014, 16 (4): 629-639.
11. Vickers KC, Roteta LA, Hucheson-Dilks H, Han L, Guo Y: Mining diverse small RNA species in the deep transcriptome. Trends Biochem Sci. 2015, 40 (1): 4-7. 10.1016/j.tibs.2014.10.009.
Cite this article
Zhang, P., Samuels, D.C., Lehmann, B. et al. Practicality of identifying mitochondria variants from exome and RNAseq data. BMC Bioinformatics 16 (Suppl 15), P6 (2015). https://doi.org/10.1186/1471-2105-16-S15-P6
- False Discovery Rate
- Breast Cancer Cell Line
- RNAseq Data
- Nuclear Genome
- Exome Sequencing