Figure 2 | BMC Bioinformatics

Figure 2

From: Improving classification in protein structure databases using text mining

Coverage versus error graphs. (i) Coverage (sensitivity) versus error graph. For each classifier, the scores of the comparisons between the query DC1.1993 'borderline' set and the reference textCATH set were sorted in decreasing order. The comparisons include both CATH superfamily classification matches (true positives, TP) and non-matches (false positives, FP). Descending from the top classifier score, the numbers of true and false positives are counted at each possible cutoff. Green: TEXT; black: SSAP; blue: SSAP + TEXT logistic regression model. (ii) Log of the fraction of true positives versus log of the false positive rate (FPR). The FPR at a given score cutoff is the fraction of the total number of false positives counted at that cutoff. The fraction of TP is, in the same way, the proportion of the total number of TPs (see text).
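
The counting procedure described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `(score, is_match)` input format, and the choice of base-10 logarithms are assumptions made here (the caption does not state the log base). Tied scores are treated as separate cutoffs, which is also an assumption.

```python
import math


def coverage_error_curve(scored_pairs):
    """Cumulative TP and FP counts at each score cutoff.

    scored_pairs: iterable of (score, is_match) tuples (hypothetical format),
    where is_match is True when query and reference share a superfamily.
    """
    # Sort comparisons by classifier score, highest first.
    ranked = sorted(scored_pairs, key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    curve = []
    for score, is_match in ranked:
        if is_match:
            tp += 1
        else:
            fp += 1
        # Counts of matches and non-matches scoring at or above this cutoff.
        curve.append((score, tp, fp))
    return curve


def log_fraction_curve(curve):
    """(log10 FPR, log10 TP fraction) points for panel (ii).

    Cutoffs where either count is zero are skipped, since the
    logarithm is undefined there.
    """
    if not curve:
        return []
    total_tp, total_fp = curve[-1][1], curve[-1][2]
    points = []
    for _, tp, fp in curve:
        if tp == 0 or fp == 0:
            continue
        points.append((math.log10(fp / total_fp), math.log10(tp / total_tp)))
    return points


# Toy data for illustration only; not taken from the paper.
toy_pairs = [(0.9, True), (0.8, False), (0.7, True), (0.4, False)]
toy_curve = coverage_error_curve(toy_pairs)
toy_points = log_fraction_curve(toy_curve)
```

Plotting the TP count against the FP count from `toy_curve` gives a panel (i) style graph; plotting the pairs in `toy_points` gives a panel (ii) style graph.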
