Table 4 Benchmarking FOSTA against the refined Hulsen et al. dataset

From: Automatically extracting functionally equivalent proteins from SwissProt

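| Protein family | Refined | (TO) | Basic statistics | | | | Evaluation statistics | |
|---|---|---|---|---|---|---|---|---|
| | | | TP | FP | TN | FN | PPV | MCC |
| HBB | 2 | (9) | 2 | 0 | 17 | 0 | 100.00 | 1.00 |
| HOX | 30 | (41) | 30 | 0 | 3853 | 0 | 100.00 | 1.00 |
| SMm | 12 | (17) | 12 | 0 | 22 | 0 | 100.00 | 1.00 |
| SMc | 6 | (6) | 6 | 0 | 5 | 0 | 100.00 | 1.00 |
| NR | 4 | (29) | 1 | 1 | 327 | 3 | 50.00 | 0.35 |
| All | 54 | (102) | 51 | 1 | 4224 | 3 | 98.08 | 0.96 |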

  1. Protein family: the protein family being examined; TO pairings: the number of TO pairs in the Hulsen dataset (including many-to-many orthologous pairings and non-UniProtKB/Swiss-Prot proteins); Refined pairings: the number of one-to-one TO pairings tested after refinement of the Hulsen TO dataset; Basic statistics: the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN); Evaluation statistics: the PPV (positive predictive value, TP/(TP + FP), expressed as a percentage) and the MCC (Matthews Correlation Coefficient), both rounded to two decimal places.
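
The evaluation statistics can be recomputed from the basic counts. The following is a minimal sketch rather than the authors' own code: the PPV follows the formula given in the footnote, while the MCC uses the standard textbook definition, (TP·TN − FP·FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN)), which is assumed here because the table does not spell it out. The function names `ppv` and `mcc` are illustrative.

```python
import math


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value as a percentage: 100 * TP / (TP + FP)."""
    return 100.0 * tp / (tp + fp)


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient (standard definition, assumed here)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is taken as 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Basic counts (TP, FP, TN, FN) copied from the table rows.
rows = {
    "HBB": (2, 0, 17, 0),
    "HOX": (30, 0, 3853, 0),
    "SMm": (12, 0, 22, 0),
    "SMc": (6, 0, 5, 0),
    "NR": (1, 1, 327, 3),
    "All": (51, 1, 4224, 3),
}

for family, (tp, fp, tn, fn) in rows.items():
    print(f"{family}: PPV={ppv(tp, fp):.2f} MCC={mcc(tp, fp, tn, fn):.2f}")
```

Under these assumptions, the counts in the NR and All rows give values that round to the tabulated 50.00 / 0.35 and 98.08 / 0.96 respectively.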