Figure 2 | BMC Bioinformatics

Figure 2

From: Agalma: an automated phylogenomics workflow

The number of gene sequences under consideration at each stage of supermatrix construction. Most gene sequences are eliminated in the first step of homology evaluation (the parse_edges stage of the homologize pipeline) because of low similarity to other sequences. Of the remaining sequences, many are eliminated during multiple sequence alignment (the multalign pipeline), in clusters that fail the taxon sampling and cluster size criteria, reflecting uncertainty in homology across species, and during orthology evaluation (the treeprune pipeline), mainly reflecting poor sampling of some ortholog groups. See Additional file 2 for further diagnostics regarding matrix construction.
