
Table 3 Statistical assessment of designed sequences.

From: Evaluating the accuracy of protein design using native secondary sub-structures

| PDB ID_Chain | E pot (a): Designed | E pot (a): Reference | E pot (a): Protein-like | E pot (a): Bunched | χ² statistic (b): Designed | χ² statistic (b): Reference | χ² statistic (b): Uniprot-distributed |
|---|---|---|---|---|---|---|---|
| 1ZZK_A | 21.90 | 33.03 | 24.24 | 295.21 | 11.06 | 37.00 | 27.08 |
| 1XTE_A | 41.57 | 42.78 | 32.98 | 429.21 | 23.28 | 26.14 | 15.31 |
| 1T3Y_A | 46.15 | 40.46 | 43.67 | 652.03 | 22.07 | 21.70 | 8.88 |
| 1VQS_A | 40.54 | 44.20 | 29.42 | 356.21 | 39.51 | 29.21 | 18.77 |
| 1OH0_A | 45.88 | 29.85 | 34.79 | 438.79 | 37.73 | 25.73 | 12.06 |
| 1A2P_A | 30.72 | 27.89 | 29.88 | 309.62 | 29.50 | 20.66 | 9.93 |
| 1EW4_A | 28.98 | 35.20 | 37.23 | 347.37 | 17.05 | 36.39 | 10.05 |
| 1HZT_A | 38.60 | 41.60 | 37.11 | 604.35 | 22.24 | 19.61 | 12.79 |
| 1IDP_A | 46.38 | 37.74 | 43.54 | 634.46 | 50.10 | 36.94 | 17.01 |
| 1IUJ_A | 38.96 | 31.30 | 34.49 | 402.85 | 32.30 | 30.23 | 21.15 |
| 1MG4_A | 28.03 | 38.06 | 36.36 | 300.93 | 38.60 | 13.89 | 9.59 |
| 1NZ0_A | 35.95 | 51.58 | 48.39 | 850.27 | 30.91 | 55.66 | 8.16 |
| 1URR_A | 30.18 | 28.02 | 23.55 | 192.76 | 21.58 | 18.81 | 18.08 |
| 1VH5_A | 43.58 | 33.43 | 37.03 | 595.33 | 27.32 | 15.73 | 30.65 |
| 1VKK_A | 40.07 | 42.42 | 38.21 | 612.99 | 24.51 | 22.78 | 13.62 |
| 1WLU_A | 38.57 | 40.23 | 36.69 | 613.45 | 40.15 | 25.25 | 13.47 |
| 1X6Z_A | 32.58 | 44.50 | 35.98 | 642.98 | 27.93 | 28.63 | 29.13 |
| 1ZHV_A | 38.18 | 67.38 | 47.61 | 722.31 | 22.84 | 23.84 | 17.15 |
| 2BWF_A | 23.42 | 16.12 | 21.48 | 155.17 | 27.59 | 18.33 | 11.07 |
| 2FTR_A | 49.27 | 21.01 | 36.12 | 290.87 | 77.06 | 34.49 | 15.43 |
| 2GPI_A | 21.97 | 26.50 | 30.76 | 327.01 | 17.54 | 39.97 | 16.05 |
| 2PV2_A | 30.48 | 32.50 | 32.56 | 318.45 | 24.16 | 13.83 | 13.01 |
| 3EBT_A | 63.24 | 41.44 | 52.13 | 643.93 | 31.17 | 29.77 | 22.43 |
| 3EF8_A | 56.73 | 46.87 | 43.06 | 704.12 | 36.44 | 23.44 | 19.55 |
| 3FEA_A | 18.96 | 26.99 | 25.63 | 221.59 | 21.73 | 28.18 | 15.01 |
| 1GBS_A | 72.44 | 56.16 | 46.31 | 1135.2 | 44.03 | 31.58 | 10.53 |
| 1R26_A | 34.94 | 23.04 | 29.40 | 248.86 | 17.64 | 13.40 | 16.07 |
| 1Y25_A | 48.29 | 53.31 | 52.55 | 1134.5 | 26.18 | 26.00 | 19.10 |
| 2PTH_A | 66.33 | 77.51 | 58.86 | 1671.6 | 33.55 | 23.49 | 20.73 |
| 1ABA_A | 22.67 | 18.05 | 19.76 | 166.69 | 15.93 | 16.75 | 11.94 |
| 1DBW_A | 35.58 | 41.21 | 36.24 | 526.48 | 16.34 | 22.71 | 16.45 |
| 1I2T_A | 34.03 | 34.00 | 23.55 | 182.00 | 20.22 | 20.07 | 10.84 |
| 1JF8_A | 50.42 | 41.48 | 31.59 | 538.69 | 28.52 | 22.40 | 19.64 |
| 1KNG_A | 47.69 | 47.33 | 39.62 | 777.74 | 20.34 | 24.70 | 13.16 |
| 2CAR_A | 78.57 | 71.70 | 70.30 | 1603.3 | 31.39 | 26.90 | 13.87 |
| 1MF7_A | 64.18 | 63.88 | 71.41 | 1550.0 | 29.86 | 20.98 | 16.92 |
| 1SHU_X | 66.39 | 52.18 | 55.33 | 1426.0 | 16.93 | 19.34 | 15.08 |
| 1BKR_A | 25.63 | 30.22 | 25.95 | 308.33 | 14.10 | 29.46 | 19.09 |
| 2GMY_A | 37.04 | 34.64 | 35.70 | 604.25 | 29.84 | 16.53 | 8.61 |
| 1OAI_A | 21.36 | 14.58 | 14.37 | 76.785 | 15.64 | 23.19 | 12.30 |
| 1UTG_A | 23.59 | 22.73 | 20.83 | 142.38 | 33.41 | 18.75 | 16.26 |
| 1TQG_A | 39.34 | 35.46 | 44.32 | 372.97 | 25.16 | 22.52 | 29.04 |
| 1TUK_A | 15.96 | 21.39 | 25.78 | 190.00 | 8.20 | 73.94 | 26.47 |
| 1ZKE_A | 52.57 | 26.50 | 39.81 | 282.41 | 56.29 | 24.02 | 16.89 |
| 2J5Y_A | 19.41 | 18.76 | 23.69 | 156.59 | 23.08 | 19.26 | 22.90 |
| 2P5K_A | 15.82 | 20.15 | 15.64 | 87.718 | 12.72 | 13.78 | 31.03 |
| 1GUT_A | 20.80 | 34.68 | 33.24 | 276.11 | 24.02 | 18.78 | 11.73 |
| 2O1Q_A | 42.20 | 35.61 | 42.15 | 634.86 | 33.39 | 35.21 | 14.52 |
| 3I4O_A | 16.56 | 17.85 | 21.53 | 151.22 | 20.91 | 18.08 | 22.34 |
| 1EAQ_A | 35.55 | 36.74 | 38.15 | 525.38 | 53.59 | 18.46 | 17.31 |
| 1JB3_A | 35.90 | 38.39 | 33.32 | 444.65 | 25.70 | 21.64 | 14.12 |
| 1KMT_A | 40.62 | 50.03 | 36.15 | 598.09 | 33.67 | 15.22 | 24.35 |
| 1KQ1_A | 14.84 | 13.58 | 18.51 | 71.494 | 21.90 | 15.49 | 22.93 |
| 1NXM_A | 61.59 | 60.02 | 60.89 | 1582.6 | 42.49 | 29.20 | 22.84 |
| 1O7I_A | 35.83 | 38.47 | 43.29 | 571.87 | 20.76 | 21.11 | 19.77 |
| 1OK0_A | 21.76 | 20.51 | 23.92 | 151.96 | 36.62 | 29.67 | 22.49 |
| 1QHQ_A | 41.56 | 66.12 | 48.61 | 1052.2 | 19.43 | 59.80 | 15.67 |
| 1R6J_A | 13.76 | 22.35 | 21.93 | 234.14 | 7.06 | 23.50 | 27.16 |
| 1UCS_A | 12.26 | 20.15 | 19.60 | 150.73 | 21.99 | 29.24 | 16.83 |
| 2C9Q_A | 34.82 | 31.49 | 33.78 | 416.12 | 22.62 | 33.69 | 16.02 |
| 2F01_A | 33.91 | 32.49 | 41.94 | 632.93 | 44.30 | 69.50 | 17.83 |
| 2J2J_A | 56.97 | 53.80 | 54.79 | 1232.2 | 29.27 | 44.44 | 14.14 |
| 2VMH_A | 52.37 | 48.25 | 65.53 | 1022.4 | 48.85 | 31.09 | 23.27 |
| 3VUB_A | 44.39 | 27.49 | 24.28 | 283.47 | 31.45 | 19.60 | 18.99 |
| 1M9Z_A | 24.13 | 32.38 | 39.90 | 431.24 | 34.22 | 100.74 | 19.58 |
| 2J8B_A | 22.71 | 21.00 | 24.56 | 264.71 | 28.96 | 105.67 | 14.13 |
| 2VOU_A | 36.64 | 45.43 | 54.36 | 937.67 | 35.35 | 28.20 | 22.98 |
| 1V5I_B | 28.93 | 20.44 | 28.45 | 145.74 | 31.51 | 11.70 | 15.60 |
| 2WLV_A | 40.35 | 36.52 | 41.68 | 586.89 | 52.72 | 31.57 | 10.40 |
| 1F46_A | 58.98 | 43.65 | 38.99 | 541.85 | 36.50 | 14.21 | 17.10 |
| 1VZI_A | 33.33 | 37.45 | 42.36 | 578.66 | 63.84 | 39.32 | 18.86 |
| 2ANX_A | 49.41 | 53.04 | 52.31 | 744.32 | 20.30 | 10.30 | 14.17 |
| 2CMP_A | 25.96 | 25.58 | 31.02 | 154.14 | 19.60 | 18.92 | 10.19 |
| 2CVI_A | 27.21 | 47.82 | 37.24 | 298.96 | 39.62 | 39.01 | 23.82 |
| 2D3D_A | 27.45 | 28.78 | 35.66 | 306.37 | 18.22 | 22.52 | 15.46 |
| 2ERB_A | 36.41 | 36.63 | 36.23 | 504.87 | 17.94 | 52.38 | 16.72 |
| 2O9S_A | 16.01 | 13.43 | 15.44 | 100.61 | 59.57 | 14.03 | 9.57 |
| 2PR7_A | 40.92 | 48.59 | 39.60 | 857.73 | 28.04 | 30.25 | 24.34 |
| 2QCP_X | 19.48 | 19.43 | 22.94 | 150.64 | 25.94 | 18.24 | 15.44 |
| 2V1Q_A | 14.77 | 19.23 | 19.72 | 98.525 | 20.10 | 18.11 | 16.45 |
| 2VPB_A | 23.28 | 17.56 | 22.49 | 121.38 | 16.62 | 65.12 | 9.10 |
| 2VZC_A | 40.47 | 40.82 | 51.30 | 660.83 | 16.20 | 33.51 | 20.36 |
| 2ZXY_A | 31.88 | 25.72 | 31.14 | 312.36 | 24.89 | 22.00 | 21.61 |
| 3CTG_A | 32.67 | 31.39 | 40.23 | 407.36 | 27.96 | 11.53 | 16.02 |
| 3E9T_A | 29.23 | 38.88 | 39.61 | 488.97 | 21.08 | 28.89 | 21.48 |
| 3FIL_A | 18.22 | 24.39 | 24.34 | 126.78 | 21.55 | 14.81 | 17.17 |
| 3G21_A | 25.86 | 19.68 | 18.65 | 156.69 | 30.02 | 17.41 | 17.60 |
| 3G36_A | 13.81 | 14.90 | 11.31 | 108.60 | 21.10 | 11.57 | 7.88 |
| 3IV4_A | 35.41 | 25.83 | 31.88 | 388.37 | 31.85 | 26.41 | 9.82 |
| Mean | 35.64 | 35.30 | 35.58 | 498.34 | 28.71 | 28.16 | 17.08 |
| Standard deviation | 14.56 | 14.00 | 12.58 | 374.65 | 12.36 | 16.88 | 5.43 |
| Quartile 1 | 23.59 | 24.39 | 24.56 | 221.59 | 20.76 | 18.75 | 13.47 |
| Median | 35.41 | 34.64 | 35.98 | 407.36 | 26.18 | 23.49 | 16.45 |
| Quartile 3 | 42.20 | 42.78 | 42.15 | 634.46 | 33.55 | 30.25 | 20.36 |

(a) The E pot statistic penalizes short-range bunching of amino acids. The E pot values of the reference and protein-like sequences represent minimal bunching, whereas the bunched sequences give the maximal bunching. The E pot values of the designed sequences confirm that their bunching is typical of native sequences. (b) The chi-square test determines whether there is a significant difference between two sets of categorical data. The χ² values indicate that the difference between the distribution of the designed sequences and that of the Uniprot database is as significant as it is for the reference sequences.
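
To illustrate the kind of comparison described in (b), here is a minimal Python sketch of a Pearson chi-square statistic that compares the amino acid composition of one sequence against a set of background frequencies. The exact procedure, categories, and background frequencies used for this table are not given here, so the function name `chi_square_statistic`, the 20-letter alphabet, and the uniform background are all assumptions chosen for illustration only.

```python
from collections import Counter

# Assumption: the 20 standard amino acids are the categories being compared.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def chi_square_statistic(sequence, background):
    """Pearson chi-square of a sequence's composition vs. background frequencies.

    `background` maps each amino acid to its expected relative frequency
    (values should sum to 1). Residues outside AMINO_ACIDS are ignored.
    """
    counts = Counter(sequence)
    n = sum(counts[aa] for aa in AMINO_ACIDS)
    statistic = 0.0
    for aa in AMINO_ACIDS:
        expected = n * background[aa]   # expected count under the background
        observed = counts[aa]           # observed count in the sequence
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical uniform background; a real analysis would use database frequencies.
uniform = {aa: 1 / 20 for aa in AMINO_ACIDS}

if __name__ == "__main__":
    print(chi_square_statistic("ACDEFGHIKLMNPQRSTVWY", uniform))
    print(chi_square_statistic("A" * 20, uniform))
```

A sequence whose composition matches the background gives a statistic near zero, and a more skewed composition gives a larger value. In this sketch, smaller values mean a composition closer to the background.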