  • Poster presentation
  • Open access

Phylogenomic inference of functional divergence

Background

The divergence of protein function following gene duplication – or the colonization of new ecological niches – is of central importance in the evolution of novelty. Changes in protein structure and function are reflected at the level of amino acid sequence. This principle suggests that lineage-specific functional divergence in proteins can be identified by the analysis of primary sequence data. However, many amino acid substitutions have a negligible effect on protein function. This means that a simple comparison of the sequence differences between two clusters of proteins will not reveal the subset of changes responsible for functional divergence. While several methods to identify these biologically important substitutions exist [1], they are not optimized for analyses of large numbers of protein sequences. Here, we present a fast new method for identifying these substitutions across a large phylogenetic tree.

Materials and methods

Our method requires a bifurcating phylogenetic tree and a protein sequence alignment. Each node on the tree is defined by two downstream clades and one or more outgroup sequences. Using BLOSUM [2] scores to quantify how radical or conservative substitutions in each clade are relative to the outgroup, we assign a score to each column of the alignment at each tree node, which is then tested for significance [3]. Here, we apply our method to a tree of the GroEL genes from 622 bacterial genomes.
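
The column-scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the exact scoring function and the significance test are those of reference [3] and are not reproduced here. The sketch assumes that a clade's score at a column is simply the mean BLOSUM62 score of its residues against the outgroup residues, so lower values indicate more radical change. The function names (`blosum`, `clade_score`, `score_node`) and the four-residue subset of the matrix are hypothetical choices made to keep the example self-contained.

```python
from itertools import product
from statistics import mean

# Hand-copied subset of BLOSUM62 for four residues (L, I, D, E).
# Assumption for brevity: a real analysis would load the full matrix.
BLOSUM62_SUBSET = {
    ("L", "L"): 4, ("I", "I"): 4, ("D", "D"): 6, ("E", "E"): 5,
    ("L", "I"): 2, ("D", "E"): 2,
    ("L", "D"): -4, ("L", "E"): -3, ("I", "D"): -3, ("I", "E"): -3,
}


def blosum(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    key = (a, b) if (a, b) in BLOSUM62_SUBSET else (b, a)
    return BLOSUM62_SUBSET[key]


def clade_score(clade_column, outgroup_column):
    """Mean score of every clade residue against every outgroup residue.

    Illustrative simplification, not the statistic used in reference [3].
    """
    return mean(blosum(c, o) for c, o in product(clade_column, outgroup_column))


def score_node(clade_a, clade_b, outgroup):
    """Return one (score_a, score_b) pair per alignment column at a tree node.

    Each argument is a list of equal-length aligned sequences (no gaps here).
    """
    columns = zip(zip(*clade_a), zip(*clade_b), zip(*outgroup))
    return [
        (clade_score(col_a, col_out), clade_score(col_b, col_out))
        for col_a, col_b, col_out in columns
    ]


if __name__ == "__main__":
    # Toy two-column alignment: clade B replaces the outgroup's D with L.
    print(score_node(["LD", "LD"], ["LL", "IL"], ["LD"]))
```

In this toy alignment the second column is conserved in clade A (D against D) but radically changed in clade B (L against D), which is the kind of lineage-specific contrast the method tests for significance.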

Results

GroEL is an important molecular chaperone that helps at least 250 client proteins fold in Escherichia coli [4]. Interestingly, we found that four of the five bacterial lineages most enriched for functional divergence are intracellular pathogens (see Figure 1). Radical change in GroEL has previously been implicated in the adaptation of endosymbiotic bacteria to intracellular life [5], and our results suggest that this may be a more general response to the population-genetic conditions of an intracellular lifestyle.

Figure 1

Bacterial lineages enriched for functional divergence in GroEL. The thermosome-related sequences are found in certain extremophilic bacteria, perhaps as a result of horizontal gene transfer from archaea. The other highlighted lineages are intracellular pathogens, with the exception of Chloroflexi. The tree was produced by RAxML [6].

References

  1. Gu X, Vander Velden K: DIVERGE: Phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics 2002, 18: 500–501. 10.1093/bioinformatics/18.3.500

  2. Henikoff S, Henikoff JG: Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA 1992, 89: 10915–10919. 10.1073/pnas.89.22.10915

  3. Toft C, Williams TA, Fares MA: Genome-wide functional divergence after the symbiosis of Proteobacteria with Insects unraveled through a novel computational approach. PLoS Comput Biol 2009, 5: e1000344. 10.1371/journal.pcbi.1000344

  4. Kerner MJ, Naylor DJ, Ishihama Y, Maier T, Chang H-C, Stines AP, Georgopoulos C, Frishman D, Hayer-Hartl M, Mann M, Hartl FU: Proteome-wide analysis of chaperonin-dependent protein folding in Escherichia coli. Cell 2005, 122: 209–220. 10.1016/j.cell.2005.05.028

  5. Fares MA, Moya A, Barrio E: GroEL and the maintenance of bacterial endosymbiosis. Trends Genet 2004, 20: 413–416. 10.1016/j.tig.2004.07.001

  6. Stamatakis A: RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 2006, 22: 2688–2690. 10.1093/bioinformatics/btl446



Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Williams, T.A., Caffrey, B.E., Jiang, X. et al. Phylogenomic inference of functional divergence. BMC Bioinformatics 10 (Suppl 13), P4 (2009).
