A short tutorial in analyzing NGS data of cancer genomes for somatic mutation calling


Somatic mutations are a key element of tumorigenesis: these changes in the nucleotide sequence of somatic cells, acquired throughout life, can alter proteins, damage cells, and thus cause cancer. The advent of next-generation sequencing (NGS) has significantly improved our ability to identify somatic mutations in cancer genomes, paving the way for the Catalogue of Somatic Mutations in Cancer (COSMIC) [1], a comprehensive online catalogue that contains more than 820,000 mutations so far. Nevertheless, many challenges remain in detecting somatic mutations in cancer, especially mutations present at low allele frequency because of tumor heterogeneity or contamination with normal cells. In this short tutorial, I will present MuTect [2], a recent somatic mutation caller developed by the Broad Institute as part of the GATK (Genome Analysis Toolkit) [3]. I will use my own NGS dataset to demonstrate the tool and to address some issues in troubleshooting input data and interpreting output.
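
To make the low-frequency problem concrete, here is a minimal sketch of how tumor purity and subclonality dilute the expected variant allele fraction. It is not part of MuTect; the function names and the simple model (one mutated copy per mutated tumor cell, diploid normal cells, no sequencing error) are my own assumptions for illustration.

```python
def expected_vaf(purity, clonal_fraction=1.0, mutant_copies=1,
                 tumor_copy_number=2, normal_copy_number=2):
    """Expected variant allele fraction under a simple mixing model.

    purity            fraction of cells in the sample that are tumor cells
    clonal_fraction   fraction of tumor cells that carry the mutation
    mutant_copies     mutated copies per mutated tumor cell (assumed 1)
    *_copy_number     total copies of the locus per cell (assumed 2)
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0.0 <= clonal_fraction <= 1.0:
        raise ValueError("clonal_fraction must be between 0 and 1")
    # Mutated alleles come only from tumor cells that carry the mutation.
    mutant_alleles = purity * clonal_fraction * mutant_copies
    # All cells, tumor and normal, contribute alleles to the read pool.
    total_alleles = (purity * tumor_copy_number
                     + (1.0 - purity) * normal_copy_number)
    return mutant_alleles / total_alleles


def expected_supporting_reads(depth, purity, clonal_fraction=1.0):
    """Expected number of reads showing the mutation at a given depth."""
    return depth * expected_vaf(purity, clonal_fraction)


if __name__ == "__main__":
    # Hypothetical sample: 40% tumor cells, mutation in half of them, 100x depth.
    vaf = expected_vaf(0.4, clonal_fraction=0.5)
    reads = expected_supporting_reads(100, 0.4, clonal_fraction=0.5)
    print(f"expected allele fraction: {vaf:.2f}, supporting reads: {reads:.0f}")
```

Under these assumptions, a heterozygous mutation carried by half of the tumor cells in a sample that is 40% tumor is expected at an allele fraction of about 0.10, or roughly 10 supporting reads out of 100, which is why such mutations are easy to confuse with sequencing noise.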

Materials and methods

1. MuTect can be downloaded from the Broad Institute; you must register an account with the Broad Institute to download it.

2. Instructions for running MuTect:

3. User support forum:

4. Because this workshop uses a real-world NGS dataset, participants are not required to follow the walkthrough. However, participants are encouraged to run MuTect on their own data later (whole genome sequencing, whole exome sequencing, RNA sequencing data, etc.).
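
For participants who want to try MuTect on their own tumor/normal pair, the sketch below assembles a typical command line without running it. The flag names follow the MuTect 1 command line as I recall it from the Broad documentation and should be checked against the version you download; the helper name `build_mutect_command` and all file names are hypothetical placeholders.

```python
import shlex


def build_mutect_command(jar, reference, normal_bam, tumor_bam, out,
                         dbsnp=None, cosmic=None, intervals=None,
                         coverage_file=None, java_heap="2g"):
    """Return the MuTect invocation as an argument list (nothing is executed).

    Flag names are assumed from the MuTect 1 documentation; verify them
    against the release you are using.
    """
    cmd = [
        "java", f"-Xmx{java_heap}", "-jar", jar,
        "--analysis_type", "MuTect",
        "--reference_sequence", reference,
        "--input_file:normal", normal_bam,
        "--input_file:tumor", tumor_bam,
        "--out", out,
    ]
    # Optional resources are appended only when a path is supplied.
    optional = [
        ("--dbsnp", dbsnp),                 # known germline variants
        ("--cosmic", cosmic),               # known somatic mutations
        ("--intervals", intervals),         # restrict calling to regions
        ("--coverage_file", coverage_file), # per-base coverage output
    ]
    for flag, value in optional:
        if value is not None:
            cmd.extend([flag, value])
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders, not files shipped with MuTect.
    command = build_mutect_command(
        jar="muTect.jar",
        reference="reference.fasta",
        normal_bam="normal.bam",
        tumor_bam="tumor.bam",
        out="call_stats.txt",
        dbsnp="dbsnp.vcf",
        cosmic="cosmic.vcf",
    )
    # Print a shell-safe string to inspect before running it yourself.
    print(shlex.join(command))
```

Building the command as a list rather than a single string keeps paths with spaces intact if the list is later passed to `subprocess.run`.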


  1. Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, Jia M, Shepherd R, Leung K, Menzies A, Teague JW, Campbell PJ, Stratton MR, Futreal PA: COSMIC: Mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 2009, 39 (Database issue): D945-

  2. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, Gabriel S, Meyerson M, Lander ES, Getz G: Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnology. 2013, 31 (3): 213-219. 10.1038/nbt.2514.

  3. DePristo M, Banks E, Poplin R, Garimella K, Maguire J, Hartl C, Philippakis A, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell T, Kernytsky A, Sivachenko A, Cibulskis K, Gabriel S, Altshuler D, Daly M: A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics. 2011, 43 (5): 491-498. 10.1038/ng.806.


Author information

Correspondence to Zhongming Zhao.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Vuong, H., Zhao, Z. A short tutorial in analyzing NGS data of cancer genomes for somatic mutation calling. BMC Bioinformatics 14 (Suppl 17), A15 (2013).
