Table 4 REGS352 and PDB145 datasets

From: Improving classification in protein structure databases using text mining

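| Superfamily | Sequences | REGS352 Families | REGS352 References | PDB145 Families | PDB145 References |
|---|---|---|---|---|---|
| Amidohydrolase | 87 | 26 | 73 | 11 | 41 |
| Crotonase | 50 | 16 | 36 | 7 | 14 |
| Enolase | 85 | 9 | 66 | 8 | 39 |
| Haloacid Dehalogenase | 104 | 19 | 93 | 10 | 30 |
| VOC | 95 | 17 | 84 | 7 | 21 |
| TOTAL | 421 | 87 | 352 | 43 | 145 |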

  1. Distribution of the gold dataset sequences and the derived datasets REGS352 and PDB145 among the five superfamilies of the gold dataset (Brown et al., 2006).
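
As a quick consistency check on the table, the per-superfamily counts can be summed column by column and compared against the TOTAL row. The sketch below is illustrative only: the names `TABLE4`, `REPORTED_TOTAL`, and `column_totals` are assumptions introduced here, not part of the original article, and the values are simply transcribed from the table.

```python
# Rows transcribed from Table 4. Each tuple holds, in order:
# (Sequences, REGS352 Families, REGS352 References,
#  PDB145 Families, PDB145 References)
# The names below are hypothetical and used only for this illustration.
TABLE4 = {
    "Amidohydrolase": (87, 26, 73, 11, 41),
    "Crotonase": (50, 16, 36, 7, 14),
    "Enolase": (85, 9, 66, 8, 39),
    "Haloacid Dehalogenase": (104, 19, 93, 10, 30),
    "VOC": (95, 17, 84, 7, 21),
}

# TOTAL row as reported in the table.
REPORTED_TOTAL = (421, 87, 352, 43, 145)


def column_totals(rows):
    """Sum each column across all superfamily rows."""
    return tuple(sum(column) for column in zip(*rows.values()))


totals = column_totals(TABLE4)
print(totals)
print(totals == REPORTED_TOTAL)
```

If a transcription error were present in any cell, the comparison in the last line would print `False`, which makes this a cheap check to run after copying the table into another format.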