PCOGR allows detecting group-specific proteins by both ranking all COGs and graphically showing their distribution over their specificity indexes. The graphical representations can be interpreted as follows: If the two predefined groups are rather related, one expects a single peak in the middle of the graph, i. e. there are little or no proteins specific to one of the groups resulting in a specificity value of around 0.5 for most COGs. In contrast, if the two groups are rather distant, further maxima, either on the left, the right or on both sides become visible, i. e. there are group-specific proteins with S-values around 1 and/or S-values around 0. Even two single organisms can be compared by assigning the first to groupA, the second to groupB and ignoring all other organisms. For instance comparing the closely related Escherichia coli strains O157:H7 EDL933 and O157:H7 results in a prominent single peak in the middle of the graph whereas two further peaks on the edges become visible if two more distant organisms e. g. Aquifex aeolicus and Saccharomyces cerevisiae are compared.
Distance and relationship may be interpreted either in phylogenetic or in physiologic terms. To demonstrate that physiologic relevant differences in protein distributions indeed can be detected by PCOGR, two parameter-presets are selectable: (i) a specificity ranking of hyperthermophile-specific versus non-thermophile-specific proteins as published by Makarova et al.  and of thermophile-specific versus non-thermophile-specific proteins as described by Klinger et al. . For the ranking according to Makarova et al., optimum growth temperatures of corresponding organisms belonging to groupA are all above 80°C and all other organisms are assigned to groupB. For the specificity ranking according to Klinger et al., the optimum growth temperature needed for an organism to be assigned to groupA is above 55°C instead of 80°C. The user will notice that for the two presets, there are two additional peaks, the first corresponding to COGs containing (hyper)thermophile-specific proteins, and the second peak corresponding to COGs containing mesophile-specific proteins.
A further attractive potential of PCOGR lies in the easy way to detect novel protein-protein interactions since physically interacting proteins should phylogenetically similarly be distributed . Thus, if the phylogenetic pattern for a putative interacting protein target is known, a ranking with this pattern as the input will result in a ranking of potentially interacting candidates. To simplify such a procedure, the phylogenetic pattern of a certain COG defined by the user can automatically be assigned as the preset of a subsequent ranking. As an example, we performed a ranking choosing the phylogenetic pattern of COG2025 (electron transfer flavoprotein, alpha subunit). This ranking resulted in only two high-scoring outputs (specificity value S = 1): COG2025 (the target) and COG2086 (electron transfer flavoprotein, beta subunit) which is shown by x-ray crystallography to build a complex with the alpha subunit . All following proteins have specificity values below 0.9 indicating the suitability of such a search for protein-protein interactions.
Not only protein-protein interactions can be detected but also enzymes involved in the same biochemical pathway as a certain target enzyme . This possibility may be useful to find the biochemical function of yet uncharacterized proteins given that one or more catalysts of the same pathway are already characterized. For example, a search performed with the phylogenetic pattern of COG0135 (phosphoribosylanthranilate isomerase), an enzyme involved in the biosynthesis of L-tryptophan, results in four (COG0135, COG0159, COG0547, and COG0134) of the five enzymes involved in tryptophan biosynthesis at the top four places of the ranking. The beta subunit of tryptophan synthase is the only missing enzyme also involved in this pathway. A closer look reveals that this protein is assigned to two instead of one COGs (COG0133: rank 29 and COG1350: rank 1770). The latter COG is annotated as "predicted alternative tryptophan synthase beta-subunit (paralog of TrpB)". This double assignment may explain the absence of the beta subunit of tryptophan synthase from high-scoring proteins of the ranking.
Another attractive use of PCOGR can be to look for an alternative enzyme form catalyzing the same reaction but originated by non orthologous gene displacement (NOGD). Occurrence of NOGD in essential functions can be explored systematically by detecting complementary, rather than identical or similar, phylogenetic patterns . A ranking performed with COG0588 (phosphoglycerate mutase 1) indeed resulted in COG3635 (predicted phosphoglycerate mutase, AP superfamily) at the seventh last rank (rank 4867 out of 4873) demonstrating that PCOGR is also well suited for such a purpose.