Table 2 SS algorithm assignment success

From: Identifying structural domains of proteins using clustering

Linkage    Metric    m    s    1-domain      2-domain      3-domain      4-domain
Average    midpt     19   5    71% (0.52)    60% (0.35)    47% (0.28)    34% (0.19)
Average    midpt     22   5    75% (0.55)    60% (0.38)    46% (0.31)    34% (0.21)
Average    midpt     25   5    80% (0.56)    58% (0.40)    43% (0.32)    28% (0.20)
Average    midpt     22   3    68% (0.53)    51% (0.29)    41% (0.20)    35% (0.15)
Average    midpt     22   5    75% (0.55)    60% (0.38)    46% (0.31)    34% (0.21)
Average    midpt     22   7    84% (0.53)    55% (0.40)    38% (0.33)    18% (0.18)
Average    closest   22   3    79% (0.55)    51% (0.51)    40% (0.41)    27% (0.27)
Average    closest   22   4    81% (0.54)    52% (0.38)    39% (0.28)    28% (0.19)
Complete   midpt     40   5    71% (0.51)    45% (0.25)    38% (0.20)    22% (0.10)
Complete   midpt     40   7    73% (0.51)    48% (0.28)    38% (0.23)    23% (0.13)
Complete   midpt     40   9    77% (0.50)    50% (0.32)    38% (0.26)    18% (0.12)
Complete   midpt     36   7    67% (0.49)    47% (0.24)    37% (0.19)    24% (0.11)
Complete   midpt     38   7    70% (0.50)    47% (0.26)    38% (0.21)    23% (0.11)
Complete   midpt     42   7    76% (0.51)    49% (0.31)    38% (0.25)    23% (0.14)

  1. Assignment success is given on the ASTRAL30 data set as a function of m and s. Linkage refers to the clustering technique used to determine the domains. Metric is either midpt, meaning distances between secondary structure elements were measured between their midpoints, or closest, meaning the closest-approach distance was used. The optimal combination of m and s is shown in bold for each section of the table. The Matthews correlation coefficient is given in parentheses.
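
For readers unfamiliar with the parenthesized values, the following is a minimal sketch of the standard Matthews correlation coefficient computed from binary confusion-matrix counts. The function name `mcc` and the example counts are hypothetical, and the table note does not say how the true/false positive and negative counts were tallied for domain assignment, so this shows only the general formula, not the authors' exact procedure.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp, tn, fp, fn: numbers of true positives, true negatives,
    false positives and false negatives (hypothetical inputs; how
    these are counted for domain assignment is an assumption left
    to the caller).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the
        # confusion matrix is empty and the ratio is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Illustrative counts only; not taken from the table.
    print(mcc(tp=40, tn=45, fp=5, fn=10))
```

The coefficient ranges from -1 (complete disagreement) through 0 (no better than chance) to +1 (perfect agreement), which is why it can stay modest even where the percentage of successful assignments is high.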