Table 3 Number of classification matches at various rates of false positives in the 'borderline' DC1.1993 dataset

From: Improving classification in protein structure databases using text mining

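Errors: false positive rate and number of errors. CATH superfamily classification matches (TP): coverage and cutoff for the TEXT, SSAP and SSAP + TEXT classifiers.

| False positive rate | Number of errors | TEXT coverage (n; %) | TEXT cutoff | SSAP coverage (n; %) | SSAP cutoff | SSAP + TEXT coverage (n; %) | SSAP + TEXT cutoff |
|---|---|---|---|---|---|---|---|
| 10⁻⁵ | 31 | 8; 0.04 | 77.70 | 16; 0.09 | 79.94 | 98; 0.58 | 0.9808 |
| 10⁻⁴ | 306 | 96; 0.57 | 48.86 | 229; 1.36 | 79.40 | 585; 3.48 | 0.6792 |
| 10⁻³ | 3060 | 707; 4.21 | 20.75 | 1677; 10.00 | 76.66 | 2571; 15.33 | 0.2982 |
| 10⁻² | 30598 | 3036; 18.10 | 7.83 | 5808; 34.64 | 71.24 | 6901; 41.16 | 0.0706 |
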
  1. Coverage is the fraction of true classification matches recovered, shown as the actual number followed by the percentage of total TP (number; %). Scores range from 1 to 100, 30 to 80, and 0 to 1 for the TEXT, SSAP and SSAP + TEXT classifiers, respectively. Total comparisons: 3076606; positive matches: 16765.
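
The arithmetic behind the "Number of errors" and coverage percentage columns can be checked with a short sketch. It assumes, since the table does not state it, that the number of errors is the false positive rate multiplied by the number of negative comparisons (total comparisons minus positive matches), and that the coverage percentage is the number of true matches divided by the total positive matches. The function names `allowed_errors` and `coverage_percent` are hypothetical helpers introduced only for illustration.

```python
# Totals taken from the table footnote.
TOTAL_COMPARISONS = 3076606
POSITIVE_MATCHES = 16765

# Assumption: false positives are counted against the negative comparisons only.
NEGATIVES = TOTAL_COMPARISONS - POSITIVE_MATCHES


def allowed_errors(false_positive_rate: float) -> int:
    """Number of false positives tolerated at a given false positive rate."""
    return round(false_positive_rate * NEGATIVES)


def coverage_percent(true_positives: int) -> float:
    """True matches recovered, as a percentage of all positive matches."""
    return 100.0 * true_positives / POSITIVE_MATCHES


if __name__ == "__main__":
    for rate in (1e-5, 1e-4, 1e-3, 1e-2):
        print(f"FPR {rate:g}: {allowed_errors(rate)} errors")
    # SSAP + TEXT at a false positive rate of 10^-2 recovers 6901 matches.
    print(f"SSAP + TEXT coverage: {coverage_percent(6901):.4f}%")
```

Under these assumptions the computed error counts agree with the table. The listed percentages appear to be truncated rather than rounded to two decimals (for example, 8 of 16765 is about 0.048% but is listed as 0.04), so small differences in the last digit are expected.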