Figure 2 | BMC Bioinformatics

Figure 2

From: Using context to improve protein domain identification

dPUC predicts more domains over a range of FDRs. A. Illustration of the FDR estimation procedure. For each original protein sequence, we make predictions on it and on twenty shuffled sequences concatenated to the original sequence, so that, when context is used, "real" domains (Y, Z) can boost false predictions on the shuffled sequence (domains V, W, X). The estimated FDR is the ratio of the number of false predictions per protein to the total number of predictions per protein. In this illustration, three false predictions across twenty shuffled sequences and two predictions on the original sequence give FDR ≈ (3/20)/2 = 7.5%. B. The y-axis is the number of predicted domains per protein ("signal"), while the x-axis is the FDR ("noise"), so better-performing methods have higher curves (more signal for a given noise threshold). dPUC (green circles) outperforms all non-context Pfam variations tested and the context method CODD.
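
The arithmetic in panel A can be reproduced with a minimal sketch. It assumes only what the caption states: false predictions are counted on the shuffled sequences, averaged over the number of shuffles, and divided by the number of predictions on the original sequence. The function name `estimate_fdr` and its parameters are hypothetical and are not taken from the dPUC software.

```python
def estimate_fdr(false_predictions: int, n_shuffles: int,
                 predictions_per_protein: float) -> float:
    """Estimate the FDR as (false predictions per protein) / (predictions per protein).

    false_predictions: domains predicted on the shuffled sequences (V, W, X in panel A)
    n_shuffles: number of shuffled sequences per original protein (twenty in panel A)
    predictions_per_protein: domains predicted on the original sequence (Y, Z in panel A)
    """
    if n_shuffles <= 0:
        raise ValueError("n_shuffles must be positive")
    if predictions_per_protein <= 0:
        raise ValueError("predictions_per_protein must be positive")
    # Average the false predictions over the shuffles to get a per-protein rate.
    false_per_protein = false_predictions / n_shuffles
    return false_per_protein / predictions_per_protein


# Values from the illustration: 3 false domains over 20 shuffles, 2 real domains.
fdr = estimate_fdr(false_predictions=3, n_shuffles=20, predictions_per_protein=2)
print(f"{fdr:.1%}")  # prints 7.5%
```

Dividing by the number of shuffles first keeps the numerator on the same per-protein scale as the denominator, which is why twenty shuffles with three false hits count as only 0.15 false predictions per protein.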
