Table 2. Nineteen metrics evaluated for their ability to identify functionally related genes from knock-down phenotypes. See Methods for the mathematical definition of each metric.

From: Information-based methods for predicting gene function from systematic gene knock-downs

| Metric | Type (a) | Description |
| --- | --- | --- |
| MatchPresent | P | Counts the number of matching present phenotypes. |
| MatchAbsent | A | Counts the number of matching absent phenotypes. |
| Match | P, A | Counts the number of matching present and absent phenotypes. |
| Pearson Correlation Coefficient (PCC) | P, A | Vector correlation coefficient. |
| Uncentered Pearson Correlation (UPC) | P, A | Same as PCC, with vector means set to 0. Used for network construction in [9]. |
| UPC 2+ | P, A | Same as UPC, restricted to gene pairs sharing two or more phenotypes. |
| Mutual Information (MI) | P, A | Measures the degree to which knowledge about one gene's phenotypes reduces the entropy of another's. |
| Euclidean Distance | P, A | The "straight line" distance between two vectors. |
| Jaccard Index | P, A | The number of matching present phenotypes divided by the number of phenotypes present in either gene. |
| Frequency Dot Product (FDP) | P, F | Scales the number of matching present phenotypes by the frequency of each phenotype. |
| Normalized FDP (nFDP) | P, F | Same as FDP, but normalized by the lengths of the phenotypic signature vectors. |
| Residual FDP (rFDP) | P, F | Same as FDP, scaled by a score obtained by drawing random phenotypes from a Poisson distribution. |
| Symmetric FDP (sFDP) | P, A, F | Same as FDP, but rewards matching absent phenotypes as well as matching present phenotypes. |
| The PhenoBlast Metric | P, A, F | Ranking system used by PhenoBlast (Gunsalus et al. 2004). First ranks gene pairs by MatchPresent, then by MatchAbsent, then by a metric similar to FDP. |
| Agreement Score (AGREE) | P, A, F | Scales the number of matching present and absent phenotypes by their frequencies across all genes [8]. |
| Weighted MatchPresent (wMatchPresent) | P, C | Same as MatchPresent, but incorporates weights. |
| Pairwise FDP (pFDP) | P, F, C | Weights by present-phenotype background pairwise co-occurrences. |
| Weighted FDP (wFDP) | P, F, C | Same as FDP, but incorporates weights. |
| Weighted AGREE (wAGREE) | P, A, F, C | Same as AGREE, but incorporates weights. |

a. The metric type indicates whether the metric rewards shared present phenotypes (P), rewards shared absent phenotypes (A), factors in the frequencies of phenotypes across all genes (F), and/or factors in the pairwise co-occurrence of phenotypes across all genes (C).
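
To make the simpler descriptions in the table concrete, the following is a minimal sketch of the unweighted, frequency-free metrics (MatchPresent, MatchAbsent, Match, Jaccard Index, Euclidean Distance, PCC and UPC). It assumes that each gene's phenotypic signature is a binary vector (1 = phenotype present, 0 = absent) and that both vectors cover the same phenotypes in the same order. That encoding, the function names and the handling of degenerate vectors are illustrative assumptions, not the definitions given in the paper's Methods. The frequency-based and weighted metrics are left out because the table alone does not specify their formulas.

```python
import math

# Assumption: a phenotypic signature is a list of 0/1 values
# (1 = phenotype present, 0 = phenotype absent), aligned across genes.

def match_present(a, b):
    """Number of phenotypes present in both genes."""
    return sum(1 for x, y in zip(a, b) if x == 1 and y == 1)

def match_absent(a, b):
    """Number of phenotypes absent in both genes."""
    return sum(1 for x, y in zip(a, b) if x == 0 and y == 0)

def match(a, b):
    """Number of phenotypes that agree, whether present or absent."""
    return match_present(a, b) + match_absent(a, b)

def jaccard(a, b):
    """Shared present phenotypes divided by phenotypes present in either gene."""
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumption: return 0.0 when neither gene shows any phenotype.
    return match_present(a, b) / either if either else 0.0

def euclidean(a, b):
    """Straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def upc(a, b):
    """Uncentered Pearson correlation: PCC with both vector means set to 0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Assumption: return 0.0 when either vector is all zeros.
    return dot / norm if norm else 0.0

def pcc(a, b):
    """Pearson correlation: UPC applied to mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return upc([x - mean_a for x in a], [y - mean_b for y in b])

# Two hypothetical genes scored over six phenotypes.
gene1 = [1, 1, 0, 0, 1, 0]
gene2 = [1, 0, 0, 0, 1, 1]

print(match_present(gene1, gene2))  # 2
print(match_absent(gene1, gene2))   # 2
print(jaccard(gene1, gene2))        # 0.5
```

In this example the two genes share two present and two absent phenotypes, so Match is 4, while the Jaccard Index ignores the shared absences and gives 2/4 = 0.5. This illustrates the difference between type P and type P, A scoring described in the table footnote.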