Phylogenomic inference of functional divergence
BMC Bioinformatics volume 10, Article number: P4 (2009)
The divergence of protein function following gene duplication, or following the colonization of new ecological niches, is of central importance in the evolution of novelty. Changes in protein structure and function are reflected at the level of amino acid sequence, which suggests that lineage-specific functional divergence in proteins can be identified by analysing primary sequence data. However, many amino acid substitutions have a negligible effect on protein function, so a simple comparison of the sequence differences between two clusters of proteins will not reveal the subset of changes responsible for functional divergence. While several methods exist to identify these biologically important substitutions, they are not optimized for analyses of large numbers of protein sequences. Here, we present a fast new method for identifying these substitutions across a large phylogenetic tree.
Materials and methods
Our method requires a bifurcating phylogenetic tree and a protein sequence alignment. Each node on the tree is defined by two downstream clades and one or more outgroup sequences. Using BLOSUM scores to quantify how radical or conservative substitutions in each clade are relative to the outgroup, we assign a score to each column of the alignment at each tree node, which is then tested for significance. Here, we apply our method to a tree of the GroEL genes from 622 bacterial genomes.
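The per-node column scoring described above can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so the choice made here (the mean BLOSUM score of every clade residue against every outgroup residue, computed separately for each clade) is an assumption, as are the function names `blosum`, `clade_vs_outgroup` and `column_scores`. The lookup table holds only a five-residue subset of BLOSUM62 to keep the example self-contained, and gaps and the significance test are left out.

```python
from itertools import product
from statistics import mean

# Five-residue subset of BLOSUM62 (L, I, V, D, E), keyed by sorted residue pair.
# A real analysis would load the full matrix; this subset is for illustration only.
_BLOSUM62_SUBSET = {
    ("L", "L"): 4, ("I", "I"): 4, ("V", "V"): 4, ("D", "D"): 6, ("E", "E"): 5,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1, ("D", "E"): 2,
    ("D", "I"): -3, ("D", "L"): -4, ("D", "V"): -3,
    ("E", "I"): -3, ("E", "L"): -3, ("E", "V"): -2,
}


def blosum(a, b):
    """Symmetric lookup of the substitution score for residues a and b."""
    return _BLOSUM62_SUBSET[tuple(sorted((a, b)))]


def clade_vs_outgroup(clade_residues, outgroup_residues):
    """Mean score of all clade/outgroup residue pairs in one alignment column.

    Low values indicate radical change relative to the outgroup (assumed measure).
    """
    return mean(blosum(c, o) for c, o in product(clade_residues, outgroup_residues))


def column_scores(clade_a, clade_b, outgroup):
    """Return one (score_a, score_b) pair per alignment column for a single node.

    Each argument is a list of aligned sequences of equal length.
    """
    n_columns = len(outgroup[0])
    scores = []
    for col in range(n_columns):
        out = [seq[col] for seq in outgroup]
        score_a = clade_vs_outgroup([seq[col] for seq in clade_a], out)
        score_b = clade_vs_outgroup([seq[col] for seq in clade_b], out)
        scores.append((score_a, score_b))
    return scores


# Toy node: clade A has a radical L -> D change in the second column,
# while clade B keeps hydrophobic residues similar to the outgroup.
example = column_scores(["LD", "LD"], ["LL", "LI"], ["LL"])
print(example)
```

In this toy case the second column scores far lower in clade A than in clade B, which is the kind of clade-specific contrast that would then be passed to a significance test.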
GroEL is an important molecular chaperone that helps at least 250 client proteins fold in Escherichia coli. Interestingly, we found that four of the five bacterial lineages most enriched for functional divergence are intracellular pathogens (see Figure 1). Radical change in GroEL has previously been implicated in the adaptation of endosymbiotic bacteria to intracellular life, and these results suggest this may be a more general response to the population-genetic conditions of an intracellular lifestyle.
Gu X, Velden K: DIVERGE: Phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics 2002, 18: 500–501. 10.1093/bioinformatics/18.3.500
Henikoff S, Henikoff JG: Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA 1992, 89: 10915–10919. 10.1073/pnas.89.22.10915
Toft C, Williams TA, Fares MA: Genome-wide functional divergence after the symbiosis of Proteobacteria with Insects unraveled through a novel computational approach. PLoS Comput Biol 2009, 5: e1000344. 10.1371/journal.pcbi.1000344
Kerner MJ, Naylor DJ, Ishihama Y, Maier T, Chang H-C, Stines AP, Georgopoulos C, Frishman D, Hayer-Hartl M, Mann M, Hartl FU: Proteome-wide analysis of chaperonin-dependent protein folding in Escherichia coli. Cell 2005, 122: 209–220. 10.1016/j.cell.2005.05.028
Fares MA, Moya A, Barrio E: GroEL and the maintenance of bacterial endosymbiosis. Trends Genet 2004, 20: 413–416. 10.1016/j.tig.2004.07.001
Stamatakis A: RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 2006, 22: 2688–2690. 10.1093/bioinformatics/btl446
Cite this article
Williams, T.A., Caffrey, B.E., Jiang, X. et al. Phylogenomic inference of functional divergence. BMC Bioinformatics 10, P4 (2009). https://doi.org/10.1186/1471-2105-10-S13-P4
- Functional Divergence
- Molecular Chaperone
- Bacterial Genome
- Tree Node
- Intracellular Pathogen