Table 1 Compilation of training and test datasets.

From: Data mining of enzymes using specific peptides

| Dataset | Selection Criteria from Swiss-Prot | Number of Proteins (and SPs) | Precision | Recall |
| --- | --- | --- | --- | --- |
| Training set #1 | Single EC annotation and Date-Integrated before 7/1/2006 | 89,854 (#SPs = 87,017) | 100% | 85% |
| "Enzyme Test Set" | EC annotation and Date-Integrated between 7/1/2006 and 7/1/2008 | 24,443 | 98% | 70% |
| "Ten Organism Test-Set" | EC annotation and Date-Integrated between 7/1/2006 and 7/1/2008 and all non-enzymes before 7/1/2008 | 4,509 | 98% | 76% |
| Training set #2 | Single EC annotation and Date-Integrated before 7/27/2009 | 201,169 (#SPs = 312,465) | 100% | 94% |