Table 4 Benchmarking FOSTA against the refined Hulsen et al. dataset

From: Automatically extracting functionally equivalent proteins from SwissProt

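| Protein family | Refined (TO) pairings | TP | FP | TN | FN | PPV (%) | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HBB | 2 (9) | 2 | 0 | 17 | 0 | 100.00 | 1.00 |
| HOX | 30 (41) | 30 | 0 | 3853 | 0 | 100.00 | 1.00 |
| SMm | 12 (17) | 12 | 0 | 22 | 0 | 100.00 | 1.00 |
| SMc | 6 (6) | 6 | 0 | 5 | 0 | 100.00 | 1.00 |
| NR | 4 (29) | 1 | 1 | 327 | 3 | 50.00 | 0.35 |
| All | 54 (102) | 51 | 1 | 4224 | 3 | 98.08 | 0.96 |
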
  1. Protein family: the protein family being examined. TO pairings (in parentheses): the number of TO pairs in the Hulsen dataset, including many-to-many orthologous pairings and non-UniProtKB/Swiss-Prot proteins. Refined pairings: the number of one-to-one TO pairings tested after refinement of the Hulsen TO dataset. Basic statistics: the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Evaluation statistics: the PPV (positive predictive value, TP/(TP + FP), expressed as a percentage) and the MCC (Matthews correlation coefficient), both rounded to two decimal places.
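
The evaluation statistics can be recomputed from the basic counts. The sketch below is illustrative only and is not taken from the article: it uses the PPV formula given in the footnote and assumes the standard definition of the Matthews correlation coefficient, MCC = (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)). The function names `ppv` and `mcc` and the `counts` dictionary are hypothetical; the counts are copied from the table.

```python
import math


def ppv(tp, fp):
    """Positive predictive value as a percentage: 100 * TP / (TP + FP)."""
    return 100.0 * tp / (tp + fp)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (standard definition, assumed here)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a marginal total is zero; not from the article.
        return 0.0
    return (tp * tn - fp * fn) / denom


# (TP, FP, TN, FN) per protein family, copied from the table.
counts = {
    "HBB": (2, 0, 17, 0),
    "HOX": (30, 0, 3853, 0),
    "SMm": (12, 0, 22, 0),
    "SMc": (6, 0, 5, 0),
    "NR": (1, 1, 327, 3),
    "All": (51, 1, 4224, 3),
}

for family, (tp, fp, tn, fn) in counts.items():
    print(f"{family}: PPV = {ppv(tp, fp):.2f}, MCC = {mcc(tp, fp, tn, fn):.2f}")
```

Under these assumptions the printed values should agree with the PPV and MCC columns to two decimal places. The NR row shows why both statistics are reported: a single false positive against a single true positive halves the PPV, and the three false negatives pull the MCC down further.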