Bioinformatics determination of ETEC signature genes as potential targets for molecular diagnosis and reverse vaccinology


Genomes of the model bacterium, Escherichia coli, exhibit high plasticity caused by gene gain and loss via pathoadaptive mutations, genetic rearrangement, and horizontal gene transfer [1, 2]. This genetic variability translates into remarkable phenotypic and pathotypic diversity: while some E. coli strains normally inhabit the mammalian colon, other pathotypes cause a wide range of intestinal and extraintestinal diseases, from mild intestinal disturbance to severe urinary tract infections and outbreaks of shigellosis-like dysentery or cholera-like watery diarrhea [1]. In this study, we focus on enterotoxigenic E. coli (ETEC), one of the world's deadliest infectious agents, which also represents a serious public health problem in Egypt's rural areas. Our aim is to integrate multiple bioinformatics tools to determine horizontally transferred, pathotype-specific signature genes as targets for specific, high-throughput molecular diagnostic tools and reverse vaccinology screens.

Methods and results

To estimate the extent of horizontal gene transfer in ETEC, we used a combination of bioinformatics tools, including GC content (GC%) analysis, comparative genometrics analysis [3], and web-based prediction of pathogenicity islands via IslandPath [4]. Because E. coli strains are typically polylysogenic [5], we used the ACLAME Prophinder tool [6] to predict complete or rudimentary prophages scattered within the ETEC genome. To determine ETEC pathotype-specific genes, or signature genes, we used comparative genomic tools available in the National Microbial Pathogen Data Resource (NMPDR) platform, including the Signature Genes Tool and the Homolog Spreadsheet Tool [7]. Based on bidirectional-best-hit signature analysis, we identified 128 genes that differentiate this pathotype from other E. coli strains. We also identified 94 genes that are characteristic of two closely related strains (24377A and 2348/69). Many of the ETEC-specific genes mapped to prophages, prophage-like elements, and other pathogenicity islands; however, some of these signature genes, e.g., ORFs 21–39 in strain 24377A, appear instead to have been lost from other E. coli strains, because they are conserved among other enterobacteria such as Shigella and Salmonella. Our ongoing studies are testing some of these ETEC-specific genes as targets for multiplex PCR amplification to develop a rapid diagnostic typing method. Future studies will analyze the surface association and antigenicity of these signature gene products as a first step in a reverse vaccinology strategy to develop novel ETEC vaccines.
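The bidirectional-best-hit signature analysis mentioned above can be illustrated with a minimal sketch. This is not the implementation of the NMPDR Signature Genes Tool; it only shows the general set logic, assuming that best-hit tables between genomes have already been computed (for example from protein similarity searches). The function names, the `best_hits` data layout, and the toy genome and gene identifiers are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed layout (hypothetical): best_hits[(a, b)] maps each gene of genome a
# to its top-scoring hit in genome b.
BestHits = Dict[Tuple[str, str], Dict[str, str]]


def bidirectional_best_hits(best_hits: BestHits, a: str, b: str) -> Dict[str, str]:
    """Return gene pairs (gene in a -> gene in b) that are each other's best hit."""
    forward = best_hits.get((a, b), {})
    reverse = best_hits.get((b, a), {})
    return {gene: hit for gene, hit in forward.items() if reverse.get(hit) == gene}


def signature_genes(
    reference: str,
    reference_genes: Iterable[str],
    in_group: Iterable[str],
    out_group: Iterable[str],
    best_hits: BestHits,
) -> Set[str]:
    """Genes of the reference genome with a bidirectional best hit in every
    in-group genome and in none of the out-group genomes."""
    signature = set(reference_genes)
    for genome in in_group:
        signature &= set(bidirectional_best_hits(best_hits, reference, genome))
    for genome in out_group:
        signature -= set(bidirectional_best_hits(best_hits, reference, genome))
    return signature


# Toy data with made-up identifiers, for illustration only.
toy_hits: BestHits = {
    ("ETEC_A", "ETEC_B"): {"g1": "b1", "g2": "b2", "g3": "b3"},
    ("ETEC_B", "ETEC_A"): {"b1": "g1", "b2": "g2", "b3": "g9"},  # g3 is not reciprocal
    ("ETEC_A", "K12"): {"g1": "k1", "g2": "k2"},
    ("K12", "ETEC_A"): {"k1": "g1", "k2": "g7"},  # only g1 is reciprocal
}

toy_signature = signature_genes(
    reference="ETEC_A",
    reference_genes=["g1", "g2", "g3"],
    in_group=["ETEC_B"],
    out_group=["K12"],
    best_hits=toy_hits,
)
print(sorted(toy_signature))
```

In this toy case, g1 is excluded because it has a reciprocal best hit in the out-group genome, and g3 is excluded because its hit in the in-group genome is not reciprocal. A real analysis would also need similarity thresholds for calling a hit, which this sketch leaves out.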


  1. Dobrindt U: (Patho-)Genomics of Escherichia coli. Int J Med Microbiol 2005, 295(6–7):357–371. 10.1016/j.ijmm.2005.07.009

  2. Morschhauser J, Kohler G, Ziebuhr W, Blum-Oehler G, Dobrindt U, Hacker J: Evolution of microbial pathogens. Philos Trans R Soc Lond B Biol Sci 2000, 355(1397):695–704. 10.1098/rstb.2000.0609

  3. Roten CA, Gamba P, Barblan JL, Karamata D: Comparative Genometrics (CG): a database dedicated to biometric comparisons of whole genomes. Nucleic Acids Res 2002, 30(1):142–144. 10.1093/nar/30.1.142

  4. Hsiao W, Wan I, Jones SJ, Brinkman FS: IslandPath: aiding detection of genomic islands in prokaryotes. Bioinformatics 2003, 19(3):418–420. 10.1093/bioinformatics/btg004

  5. Canchaya C, Proux C, Fournous G, Bruttin A, Brussow H: Prophage genomics. Microbiol Mol Biol Rev 2003, 67(2):238–276. 10.1128/MMBR.67.2.238-276.2003

  6. Lima-Mendez G, Van Helden J, Toussaint A, Leplae R: Prophinder: a computational tool for prophage prediction in prokaryotic genomes. Bioinformatics 2008, 24(6):863–865. 10.1093/bioinformatics/btn043

  7. McNeil LK, Reich C, Aziz RK, Bartels D, Cohoon M, Disz T, Edwards RA, Gerdes S, Hwang K, Kubal M, et al.: The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation. Nucleic Acids Res 2007, 35(Database issue):D347–D353. 10.1093/nar/gkl947

Author information

Corresponding author

Correspondence to Ramy K Aziz.

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Amin, H.M., Hashem, AG.M. & Aziz, R.K. Bioinformatics determination of ETEC signature genes as potential targets for molecular diagnosis and reverse vaccinology. BMC Bioinformatics 10 (Suppl 7), A8 (2009).
