Figure 3
From: PhyloPattern: regular expressions to identify complex patterns in phylogenetic trees

Two phylogenetic trees are shown. Both were built from the same protein sequence alignment, but they correspond to two different domains from the Pfam database [21]: TBC and UCH. This example has already been described in the literature [24] and is used here as a benchmark for PhyloPattern. The results of each step in PhyloPattern (based on the strategy described in Pattern Matching) are shown. Step 1, Annotation: the full domain architecture is given for each sequence. Domain architectures for internal nodes are computed with the Dollo parsimony algorithm [18]. Step 2, Pattern Matching: the pattern shown above each tree is used to detect "parent" nodes of a shuffling event resulting in the architecture TBC-UCH (indicated in yellow). Step 3, Annotation: a purple circle is placed on the derived branch to locate the event. Step 4, Pattern Matching: a simple pattern is applied to extract the leaves from each derived subtree. The human sequence ENSP305473 is common to the two subtrees and can be used as a reference for a subsequent genomic comparison with sequences having the "parent" architecture. Labeled trees for the TBC and UCH domains are provided [see Additional file 4], [see Additional file 5].
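
Steps 2 to 4 of the caption can be illustrated with a minimal Python sketch. It is not PhyloPattern's own pattern syntax or implementation. It assumes that every node already carries a domain architecture (Step 1), treats an architecture as an unordered set of domain names, and reads a "shuffling event" as a branch whose child has both TBC and UCH while its parent does not. The `Node` class, the function names, and the toy tree with its node and sequence names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    domains: frozenset  # annotated architecture, simplified to an unordered set
    children: list = field(default_factory=list)


def leaves(node):
    """Return the names of all leaves below (or at) the given node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def find_shuffling_events(root, derived=frozenset({"TBC", "UCH"})):
    """Yield (parent, child) pairs where the child has the derived
    architecture and the parent does not (assumed reading of Step 2)."""
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            if derived <= child.domains and not derived <= node.domains:
                yield node, child
            stack.append(child)


# Hypothetical toy tree for the TBC domain; names are illustrative only.
tbc = frozenset({"TBC"})
both = frozenset({"TBC", "UCH"})
root = Node("root", tbc, [
    Node("seqA", tbc),
    Node("n1", tbc, [
        Node("seqB", tbc),
        Node("n2", both, [Node("seqC", both), Node("seqD", both)]),
    ]),
])

events = list(find_shuffling_events(root))
for parent, child in events:
    # Step 3 would mark the branch parent -> child; Step 4 lists its leaves.
    print(parent.name, "->", child.name, leaves(child))
```

Running the same search on the tree for the second domain and intersecting the two leaf lists would give the sequences shared by both derived subtrees, which is the role the caption assigns to the human reference sequence.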