Table 1 The 24 protein targets included in this study

From: Enhancing fragment-based protein structure prediction by customising fragment cardinality according to local secondary structure

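| # | PDB ID | Length | % of helices | % of strands | % of coils | Relative size of the new 9-mer file | Relative size of the new 3-mer file |
|---|--------|--------|--------------|--------------|------------|-------------------------------------|-------------------------------------|
| 1 | 2CI2_I | 65 | 16 | 21 | 63 | 85.8% | 51.7% |
| 2 | 1CTF | 68 | 51 | 24 | 25 | 85.2% | 49.7% |
| 3 | 1DI2 | 69 | 46 | 33 | 21 | 89.5% | 47.8% |
| 4 | 1SCJ_B | 71 | 23 | 39 | 38 | 86.5% | 49.5% |
| 5 | 1HZ5 | 72 | 30 | 38 | 32 | 79.2% | 37.0% |
| 6 | 1CC8 | 72 | 28 | 35 | 37 | 88.2% | 55.3% |
| 7 | 3NZL | 73 | 59 | 0 | 41 | 79.5% | 41.4% |
| 8 | 1DTJ | 74 | 38 | 26 | 36 | 88.4% | 51.8% |
| 9 | 1IG5 | 75 | 61 | 5 | 34 | 83.5% | 47.6% |
| 10 | 1OGW | 76 | 25 | 34 | 41 | 81.4% | 47.9% |
| 11 | 1DCJ | 81 | 28 | 24 | 48 | 77.8% | 40.5% |
| 12 | 1TIG | 88 | 32 | 32 | 36 | 50.0% | 29.7% |
| 13 | 1A19 | 89 | 43 | 17 | 40 | 92.9% | 61.5% |
| 14 | 1BM8 | 99 | 37 | 27 | 36 | 93.9% | 58.4% |
| 15 | 4UBP | 100 | 54 | 17 | 29 | 77.2% | 46.5% |
| 16 | 1IIB | 103 | 55 | 19 | 26 | 94.9% | 73.4% |
| 17 | 1M6T | 106 | 77 | 0 | 23 | 84.8% | 50.1% |
| 18 | 1ACF | 125 | 34 | 32 | 34 | 77.8% | 45.1% |
| 19 | 3CHY | 128 | 45 | 17 | 38 | 73.9% | 45.3% |
| 20 | 2KDL* | 56 | 62 | 0 | 38 | 87.7% | 50.2% |
| 21 | 2LR8* | 70 | 57 | 0 | 43 | 74.0% | 44.0% |
| 22 | 4HLB* | 95 | 28 | 24 | 48 | 78.2% | 49.6% |
| 23 | 2K4V* | 125 | 28 | 32 | 40 | 72.1% | 48.4% |
| 24 | 2KY4* | 149 | 59 | 1 | 40 | 87.9% | 54.1% |
| Average | | 88.7 | 42.3 | 20.7 | 37.0 | 82.1% | 49.0% |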

  1. * Not only did these targets prove particularly difficult to predict during CASP, but the Global Distance Test (GDT) of their best decoys (out of 20,000) in standard predictions is also significantly lower (~ 50) than the GDT of the other targets of similar length. As confirmed by Table ST1, the 5 added CASP targets create a more challenging set: the average GDT of their best decoy out of 20,000 (regardless of the energy score) is 47, which is below 50, the threshold usually used to qualify an alignment as ‘good’ [34]. Moreover, as the table shows, their associated first, best and average-of-the-best-5 models are particularly poor for all targets except 2LR8.
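
The Average row of Table 1 can be reproduced as the plain per-column mean over the 24 targets, rounded to one decimal place. The following is a minimal sketch, assuming the values are transcribed directly from the table; the names `TARGETS`, `COLUMNS` and `column_means` are illustrative and do not come from the paper.

```python
# Rows transcribed from Table 1:
# (PDB ID, length, % helices, % strands, % coils, 9-mer file %, 3-mer file %)
TARGETS = [
    ("2CI2_I", 65, 16, 21, 63, 85.8, 51.7),
    ("1CTF", 68, 51, 24, 25, 85.2, 49.7),
    ("1DI2", 69, 46, 33, 21, 89.5, 47.8),
    ("1SCJ_B", 71, 23, 39, 38, 86.5, 49.5),
    ("1HZ5", 72, 30, 38, 32, 79.2, 37.0),
    ("1CC8", 72, 28, 35, 37, 88.2, 55.3),
    ("3NZL", 73, 59, 0, 41, 79.5, 41.4),
    ("1DTJ", 74, 38, 26, 36, 88.4, 51.8),
    ("1IG5", 75, 61, 5, 34, 83.5, 47.6),
    ("1OGW", 76, 25, 34, 41, 81.4, 47.9),
    ("1DCJ", 81, 28, 24, 48, 77.8, 40.5),
    ("1TIG", 88, 32, 32, 36, 50.0, 29.7),
    ("1A19", 89, 43, 17, 40, 92.9, 61.5),
    ("1BM8", 99, 37, 27, 36, 93.9, 58.4),
    ("4UBP", 100, 54, 17, 29, 77.2, 46.5),
    ("1IIB", 103, 55, 19, 26, 94.9, 73.4),
    ("1M6T", 106, 77, 0, 23, 84.8, 50.1),
    ("1ACF", 125, 34, 32, 34, 77.8, 45.1),
    ("3CHY", 128, 45, 17, 38, 73.9, 45.3),
    ("2KDL", 56, 62, 0, 38, 87.7, 50.2),   # CASP target (*)
    ("2LR8", 70, 57, 0, 43, 74.0, 44.0),   # CASP target (*)
    ("4HLB", 95, 28, 24, 48, 78.2, 49.6),  # CASP target (*)
    ("2K4V", 125, 28, 32, 40, 72.1, 48.4), # CASP target (*)
    ("2KY4", 149, 59, 1, 40, 87.9, 54.1),  # CASP target (*)
]

COLUMNS = ["Length", "% helices", "% strands", "% coils",
           "9-mer file %", "3-mer file %"]


def column_means(rows):
    """Return the mean of each numeric column, rounded to one decimal."""
    n = len(rows)
    # Column 0 is the PDB ID, so numeric columns are indices 1..6.
    return [round(sum(row[i] for row in rows) / n, 1) for i in range(1, 7)]


averages = column_means(TARGETS)
for name, value in zip(COLUMNS, averages):
    print(f"{name}: {value}")
```

Under these assumptions the computed means agree with the Average row as printed (88.7, 42.3, 20.7, 37.0, 82.1%, 49.0%).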