  • Oral presentation
  • Open access

High-throughput sequencing of the DBA/2J mouse genome


The DBA/2J mouse is not only the oldest inbred strain but also one of the most widely used. DBA/2J exhibits many unique anatomical, physiological, and behavioral traits. In addition, DBA/2J is one parent of the large BXD family of recombinant inbred strains [1]. The genome of the other parent of this BXD family, C57BL/6J, has been sequenced and serves as the mouse reference genome [2]. We sequenced the genome of DBA/2J using SOLiD and Illumina high-throughput short-read protocols to generate a comprehensive set of ~5 million sequence variants that segregate in the BXD family and ultimately cause the developmental, anatomical, functional, and behavioral differences among these 80+ strains.


We generated approximately 13.2× and 38.9× whole-genome short-read coverage of DBA/2J females using the Illumina GA2 and ABI SOLiD massively parallel DNA sequencing platforms. Compared with the C57BL/6J reference genome sequence, we identified over 4.5 million single nucleotide polymorphisms (SNPs), including 84 nonsense and ~11,000 missense mutations, 78% of which are novel. We also detected ~568,000 insertions and deletions (indels) within single short reads and ~9,400 between mate-paired reads. Approximately 300 inversions were detected by SOLiD mate-pair reads, 46 of which span at least one exon. In addition, we identified ~22,000 copy number variants (CNVs) in the range of 1 Kb to 100 Kb (Figure 1).
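For readers unfamiliar with fold-coverage figures such as those quoted above, the following minimal sketch shows the conventional estimate: total sequenced bases divided by genome size. The function name, read count, read length, and genome size are illustrative assumptions, not values from this study.

```python
def fold_coverage(num_reads, read_length, genome_size):
    """Estimate average fold coverage as total bases / genome size.

    All arguments are hypothetical inputs chosen by the caller;
    none of them are taken from the abstract.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


# Illustrative values only: 700 million reads of 50 bases each,
# against an assumed genome size of 2.7 billion bases.
example = fold_coverage(700_000_000, 50, 2_700_000_000)
print(f"{example:.1f}x")
```

This estimate ignores unmapped or duplicate reads, so coverage reported from aligned data will generally be lower than the raw figure.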

Figure 1

Concentric circles represent the sequence and structural variation across mouse chromosomes. Moving inward from the outer circle: circle 1 denotes each chromosome; circle 2, read depth in 100 Kb windows; circle 3, SNP density in 100 Kb windows (black is lowest density and orange is highest); circle 4, indel density in 100 Kb windows; circle 5, inversions; circle 6, CNVs, where blue (outward) denotes copy number losses and green (inward) denotes copy number gains.
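The windowed densities described in the caption can be illustrated with a minimal sketch that counts variants in fixed 100 Kb windows along one chromosome. This is not the authors' pipeline; the function name, the 1-based coordinate convention, and the example positions are assumptions made for illustration.

```python
from collections import Counter

WINDOW_SIZE = 100_000  # 100 Kb, matching the window size in the caption


def window_density(positions, window_size=WINDOW_SIZE):
    """Count variants per fixed-size window on a single chromosome.

    positions: iterable of 1-based variant coordinates (hypothetical input).
    Returns a dict mapping each window's 0-based start to its variant count.
    Windows with no variants are omitted.
    """
    counts = Counter((pos - 1) // window_size for pos in positions)
    return {idx * window_size: n for idx, n in sorted(counts.items())}


# Hypothetical SNP positions, for illustration only.
example_positions = [1, 50_000, 100_000, 100_001, 250_000]
density = window_density(example_positions)
print(density)
```

Non-overlapping windows keep the counts independent of one another; a sliding window would give a smoother track at the cost of correlated neighboring values.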


Our study generates the first consensus sequence for DBA/2J and creates a compendium of sequence and structural variations that will be used by the community of researchers who study complex traits in mouse models. The sequence data provide a novel resource with which to initiate reverse genetic analysis of complex traits, particularly by exploiting strong alleles (premature stop codons, frame-shift mutations, and deletions) that differentially affect members of the BXD strain family. The DBA/2J genome is also an essential prerequisite for unbiased alignment of RNA-seq and ChIP-seq data generated using BXD strains and any other cross involving these two common parental strains.


  1. Peirce JL, Lu L, Gu J, Silver LM, Williams RW: A new set of BXD recombinant inbred lines from advanced intercross populations in mice. BMC Genetics 2004, 5:7. 10.1186/1471-2156-5-7


  2. Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P: Initial sequencing and comparative analysis of the mouse genome. Nature 2002, 420(6915):520–562. 10.1038/nature01262



Author information


Corresponding author

Correspondence to Robert W Williams.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Wang, X., Agarwala, R., Capra, J.A. et al. High-throughput sequencing of the DBA/2J mouse genome. BMC Bioinformatics 11 (Suppl 4), O7 (2010).
