Table 1 Compilation of training and test datasets.

From: Data mining of enzymes using specific peptides

| Dataset | Selection Criteria from Swiss-Prot | Number of Proteins (and SPs) | Precision | Recall |
|---|---|---|---|---|
| Training set #1 | Single EC annotation and Date-Integrated before 7/1/2006 | 89,854 (#SPs = 87,017) | 100% | 85% |
| "Enzyme Test Set" | EC annotation and Date-Integrated between 7/1/2006 and 7/1/2008 | 24,443 | 98% | 70% |
| "Ten Organism Test-Set" | EC annotation and Date-Integrated between 7/1/2006 and 7/1/2008 and all non-enzymes before 7/1/2008 | 4,509 | 98% | 76% |
| Training set #2 | Single EC annotation and Date-Integrated before 7/27/2009 | 201,169 (#SPs = 312,465) | 100% | 94% |