
Reconstructing the virulome of the human pathogen Streptococcus pyogenes using NMPDR subsystems-based annotation


The increasing number of published complete microbial genomes has revolutionized the biological sciences and is driving a paradigm shift in microbiology. While this genomic revolution has made it feasible to reconstruct an organism's metabolism from genomic data, predicting the pathogenic potential of host-associated microbes is still in its early stages. Developing innovative bioinformatics tools that integrate microbiologists' expertise and experimental laboratory data with sequence data therefore remains a necessity. For this purpose, the NIH-funded National Microbial Pathogen Data Resource (NMPDR) was established as a bioinformatics resource center for specific bacterial pathogens, including staphylococci, streptococci, and sexually transmitted bacteria [1]. Genomes in NMPDR are annotated by the recently developed subsystems annotation technology [2, 3], available from the SEED environment. This technology relies on analyzing genes in their chromosomal context and combines the accuracy of human curation with the speed of automated propagation [2].

Methods and results

In this study, we apply the subsystems annotation technology to reconstruct the virulome of the human pathogen Streptococcus pyogenes, which claims 500,000 lives every year [4] and causes a wide range of diseases that affect adults and children [5]. In particular, we use NMPDR tools for pathogenomic comparison of the fully sequenced streptococcal serotypes, and we highlight the impact of prophages and highly recombinatorial genomic segments, including the newly discovered pilus locus [6], on streptococcal strain emergence and diversification [7]. Our analysis defines the core and dispensable elements of the streptococcal virulome, which includes, in addition to the ancestral, species-specific virulence proteins, the phage-encoded toxins and their pseudogenes. Finally, we use comparative analysis of streptococcal subsystems in the context of actual transcriptome data to gain insight into the complex gene regulatory networks that control virulence.
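As an illustration only, the distinction between core and dispensable virulome elements can be sketched as a set partition over per-strain gene content: core genes occur in every compared genome, dispensable genes in only some. The function name `partition_virulome`, the strain labels, and the gene labels below are hypothetical placeholders, not NMPDR identifiers or results from this study, and the sketch does not represent the NMPDR tools themselves.

```python
from typing import Dict, Set, Tuple


def partition_virulome(presence: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (present in every strain) and dispensable (present in some).

    `presence` maps a strain label to the set of virulence-gene labels
    annotated in that strain's genome (hypothetical input format).
    """
    if not presence:
        return set(), set()
    gene_sets = list(presence.values())
    core = set.intersection(*gene_sets)   # genes shared by all strains
    pan = set.union(*gene_sets)           # genes seen in at least one strain
    return core, pan - core


# Toy input with placeholder names; not real annotation data.
toy = {
    "strain_A": {"gene_1", "gene_2", "phage_toxin_x"},
    "strain_B": {"gene_1", "gene_2", "phage_toxin_y"},
    "strain_C": {"gene_1", "gene_2", "phage_toxin_x", "phage_toxin_y"},
}

core, dispensable = partition_virulome(toy)
print("core:", sorted(core))
print("dispensable:", sorted(dispensable))
```

In this toy input the two phage-associated labels fall into the dispensable set because no single one of them is present in all three strains, which mirrors the way prophage content is described above as varying between strains.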


  1. McNeil LK, Reich C, Aziz RK, Bartels D, Cohoon M, Disz T, Edwards RA, Gerdes S, Hwang K, Kubal M, et al.: The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation. Nucleic Acids Res 2007, 35(Database issue):D347–353. 10.1093/nar/gkl947

  2. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, et al.: The RAST Server: rapid annotations using subsystems technology. BMC Genomics 2008, 9: 75. 10.1186/1471-2164-9-75

  3. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang HY, Cohoon M, de Crecy-Lagard V, Diaz N, Disz T, Edwards R, et al.: The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res 2005, 33(17):5691–5702. 10.1093/nar/gki866

  4. Carapetis JR, Steer AC, Mulholland EK, Weber M: The global burden of group A streptococcal diseases. Lancet Infect Dis 2005, 5(11):685–694. 10.1016/S1473-3099(05)70267-X

  5. Cunningham MW: Pathogenesis of group A streptococcal infections. Clin Microbiol Rev 2000, 13(3):470–511. 10.1128/CMR.13.3.470-511.2000

  6. Bessen DE, Kalia A: Genomic localization of a T serotype locus to a recombinatorial zone encoding extracellular matrix-binding proteins in Streptococcus pyogenes. Infect Immun 2002, 70(3):1159–1167. 10.1128/IAI.70.3.1159-1167.2002

  7. Aziz RK, Kotb M: Rise and persistence of global M1T1 clone of Streptococcus pyogenes. Emerg Infect Dis 2008, 14(10):1511–1517. 10.3201/eid1410.071660

Author information

Correspondence to Ramy K Aziz.

About this article


  • Gene Regulatory Network
  • Strain Emergence
  • Bioinformatics Resource
  • Genomic Revolution
  • Streptococcal Strain