Benchmarking six methods for pairwise comparison of phylogenetic profiles. We compare six methods for ranking pairs of phylogenetic profiles. The first (black) uses the unweighted hypergeometric distribution to compute the probability of the observed or a greater number of matches between two profiles. The second (red) ranks by mutual information: the entropy of the first profile plus the entropy of the second profile minus the entropy of the joint profile, viewed one genome at a time. The third (orange) uses a weighted hypergeometric distribution that takes into account the occupancy of each genome across all genes. The fourth (yellow) is the same as the third but uses a reduced set of organisms. The fifth (green) combines the weighted hypergeometric p-value with a p-value for the observed or a smaller number of runs in the observed matches. The sixth (blue), a full tree-based method, appears only in the inset. Methods are benchmarked against the GO cellular localization and biological process ontologies. The GO p-value for each pair of proteins is the probability that the genes of that pair share a GO term at least as specific as their most specific shared term, and we compute the cumulative average log10 GO p-value over the top-ranked pairs for each metric. Incorporating runs into the calculation tends to yield more significant GO p-values. The inset compares the fifth method (green) to the full tree-based method (blue). Because Pagel's method is computationally difficult to evaluate, we compared it to our novel method only on a random subset of 100,000 benchmarkable pairs. Each sampled pair represents approximately 35 pairs in a full all-versus-all run. The average log10 GO p-value over all benchmarkable pairs is approximately -0.40; it is shown in the inset but lies above the top of the main plot.
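As an illustration of the two simplest scores described in the caption, the following minimal sketch computes an unweighted hypergeometric upper-tail p-value and the mutual information for a pair of binary profiles, together with the cumulative average used for the benchmark curves. It is not the authors' implementation. It assumes that a profile is a list of 0/1 values with one entry per genome, and that a "match" is a genome in which both genes are present; the function names are hypothetical. The weighted hypergeometric and runs-based scores are not covered.

```python
from collections import Counter
from itertools import accumulate
from math import comb, log2


def hypergeom_upper_tail(profile_a, profile_b):
    """P(X >= k) under the unweighted hypergeometric distribution.

    Assumption: k counts genomes where both profiles are 1.
    """
    n = len(profile_a)                      # number of genomes
    a = sum(profile_a)                      # genomes containing gene A
    b = sum(profile_b)                      # genomes containing gene B
    k = sum(x & y for x, y in zip(profile_a, profile_b))  # co-occurrences
    total = comb(n, b)
    tail = sum(comb(a, i) * comb(n - a, b - i) for i in range(k, min(a, b) + 1))
    return tail / total


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(profile_a, profile_b):
    """H(A) + H(B) - H(A, B), treating each genome as one observation."""
    joint = list(zip(profile_a, profile_b))
    return entropy(profile_a) + entropy(profile_b) - entropy(joint)


def cumulative_average(values):
    """Running mean of `values`, e.g. log10 GO p-values in ranked order."""
    return [s / (i + 1) for i, s in enumerate(accumulate(values))]


if __name__ == "__main__":
    # Toy profiles over eight genomes (made-up data for illustration only).
    gene_a = [1, 1, 1, 0, 0, 0, 0, 0]
    gene_b = [1, 1, 1, 0, 0, 0, 0, 0]
    print(hypergeom_upper_tail(gene_a, gene_b))
    print(mutual_information(gene_a, gene_b))
    print(cumulative_average([-2.0, -4.0, -6.0]))
```

Pairs would be ranked by ascending p-value or descending mutual information before the running mean of their log10 GO p-values is taken; lower curves then indicate better agreement with GO.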