Table 4 REGS352 and PDB145 datasets

From: Improving classification in protein structure databases using text mining

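| Superfamily | Sequences | REGS352 families | REGS352 references | PDB145 families | PDB145 references |
|---|---|---|---|---|---|
| Amidohydrolase | 87 | 26 | 73 | 11 | 41 |
| Crotonase | 50 | 16 | 36 | 7 | 14 |
| Enolase | 85 | 9 | 66 | 8 | 39 |
| Haloacid Dehalogenase | 104 | 19 | 93 | 10 | 30 |
| VOC | 95 | 17 | 84 | 7 | 21 |
| TOTAL | 421 | 87 | 352 | 43 | 145 |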
Distribution of the gold dataset sequences and the derived datasets REGS352 and PDB145 among the five superfamilies of the gold dataset (Brown et al., 2006).
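
The TOTAL row can be checked against the per-superfamily counts. The following is a minimal sketch, not part of the original article: the names `TABLE4`, `REPORTED_TOTAL`, and `column_totals` are hypothetical, and the numbers are transcribed by hand from the table above.

```python
# Counts transcribed from Table 4. Column order:
# (sequences, REGS352 families, REGS352 references,
#  PDB145 families, PDB145 references)
TABLE4 = {
    "Amidohydrolase": (87, 26, 73, 11, 41),
    "Crotonase": (50, 16, 36, 7, 14),
    "Enolase": (85, 9, 66, 8, 39),
    "Haloacid Dehalogenase": (104, 19, 93, 10, 30),
    "VOC": (95, 17, 84, 7, 21),
}

# TOTAL row as reported in the table.
REPORTED_TOTAL = (421, 87, 352, 43, 145)


def column_totals(rows):
    """Sum each column across all superfamilies."""
    return tuple(sum(column) for column in zip(*rows.values()))


totals = column_totals(TABLE4)
# True when the summed columns match the reported TOTAL row.
print(totals == REPORTED_TOTAL)
```

The reference totals in the third and fifth columns (352 and 145) are the counts that give the REGS352 and PDB145 datasets their names.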