Table 3 Statistical assessment of designed sequences.

From: Evaluating the accuracy of protein design using native secondary sub-structures

PDB ID_Chain; E_pot (a): Designed, Reference, Protein-like, Bunched; χ² statistic (b): Designed, Reference, Uniprot-distributed
1ZZK_A 21.90 33.03 24.24 295.21 11.06 37.00 27.08
1XTE_A 41.57 42.78 32.98 429.21 23.28 26.14 15.31
1T3Y_A 46.15 40.46 43.67 652.03 22.07 21.70 8.88
1VQS_A 40.54 44.20 29.42 356.21 39.51 29.21 18.77
1OH0_A 45.88 29.85 34.79 438.79 37.73 25.73 12.06
1A2P_A 30.72 27.89 29.88 309.62 29.50 20.66 9.93
1EW4_A 28.98 35.20 37.23 347.37 17.05 36.39 10.05
1HZT_A 38.60 41.60 37.11 604.35 22.24 19.61 12.79
1IDP_A 46.38 37.74 43.54 634.46 50.10 36.94 17.01
1IUJ_A 38.96 31.30 34.49 402.85 32.30 30.23 21.15
1MG4_A 28.03 38.06 36.36 300.93 38.60 13.89 9.59
1NZ0_A 35.95 51.58 48.39 850.27 30.91 55.66 8.16
1URR_A 30.18 28.02 23.55 192.76 21.58 18.81 18.08
1VH5_A 43.58 33.43 37.03 595.33 27.32 15.73 30.65
1VKK_A 40.07 42.42 38.21 612.99 24.51 22.78 13.62
1WLU_A 38.57 40.23 36.69 613.45 40.15 25.25 13.47
1X6Z_A 32.58 44.50 35.98 642.98 27.93 28.63 29.13
1ZHV_A 38.18 67.38 47.61 722.31 22.84 23.84 17.15
2BWF_A 23.42 16.12 21.48 155.17 27.59 18.33 11.07
2FTR_A 49.27 21.01 36.12 290.87 77.06 34.49 15.43
2GPI_A 21.97 26.50 30.76 327.01 17.54 39.97 16.05
2PV2_A 30.48 32.50 32.56 318.45 24.16 13.83 13.01
3EBT_A 63.24 41.44 52.13 643.93 31.17 29.77 22.43
3EF8_A 56.73 46.87 43.06 704.12 36.44 23.44 19.55
3FEA_A 18.96 26.99 25.63 221.59 21.73 28.18 15.01
1GBS_A 72.44 56.16 46.31 1135.2 44.03 31.58 10.53
1R26_A 34.94 23.04 29.40 248.86 17.64 13.40 16.07
1Y25_A 48.29 53.31 52.55 1134.5 26.18 26.00 19.10
2PTH_A 66.33 77.51 58.86 1671.6 33.55 23.49 20.73
1ABA_A 22.67 18.05 19.76 166.69 15.93 16.75 11.94
1DBW_A 35.58 41.21 36.24 526.48 16.34 22.71 16.45
1I2T_A 34.03 34.00 23.55 182.00 20.22 20.07 10.84
1JF8_A 50.42 41.48 31.59 538.69 28.52 22.40 19.64
1KNG_A 47.69 47.33 39.62 777.74 20.34 24.70 13.16
2CAR_A 78.57 71.70 70.30 1603.3 31.39 26.90 13.87
1MF7_A 64.18 63.88 71.41 1550.0 29.86 20.98 16.92
1SHU_X 66.39 52.18 55.33 1426.0 16.93 19.34 15.08
1BKR_A 25.63 30.22 25.95 308.33 14.10 29.46 19.09
2GMY_A 37.04 34.64 35.70 604.25 29.84 16.53 8.61
1OAI_A 21.36 14.58 14.37 76.785 15.64 23.19 12.30
1UTG_A 23.59 22.73 20.83 142.38 33.41 18.75 16.26
1TQG_A 39.34 35.46 44.32 372.97 25.16 22.52 29.04
1TUK_A 15.96 21.39 25.78 190.00 8.20 73.94 26.47
1ZKE_A 52.57 26.50 39.81 282.41 56.29 24.02 16.89
2J5Y_A 19.41 18.76 23.69 156.59 23.08 19.26 22.90
2P5K_A 15.82 20.15 15.64 87.718 12.72 13.78 31.03
1GUT_A 20.80 34.68 33.24 276.11 24.02 18.78 11.73
2O1Q_A 42.20 35.61 42.15 634.86 33.39 35.21 14.52
3I4O_A 16.56 17.85 21.53 151.22 20.91 18.08 22.34
1EAQ_A 35.55 36.74 38.15 525.38 53.59 18.46 17.31
1JB3_A 35.90 38.39 33.32 444.65 25.70 21.64 14.12
1KMT_A 40.62 50.03 36.15 598.09 33.67 15.22 24.35
1KQ1_A 14.84 13.58 18.51 71.494 21.90 15.49 22.93
1NXM_A 61.59 60.02 60.89 1582.6 42.49 29.20 22.84
1O7I_A 35.83 38.47 43.29 571.87 20.76 21.11 19.77
1OK0_A 21.76 20.51 23.92 151.96 36.62 29.67 22.49
1QHQ_A 41.56 66.12 48.61 1052.2 19.43 59.80 15.67
1R6J_A 13.76 22.35 21.93 234.14 7.06 23.50 27.16
1UCS_A 12.26 20.15 19.60 150.73 21.99 29.24 16.83
2C9Q_A 34.82 31.49 33.78 416.12 22.62 33.69 16.02
2F01_A 33.91 32.49 41.94 632.93 44.30 69.50 17.83
2J2J_A 56.97 53.80 54.79 1232.2 29.27 44.44 14.14
2VMH_A 52.37 48.25 65.53 1022.4 48.85 31.09 23.27
3VUB_A 44.39 27.49 24.28 283.47 31.45 19.60 18.99
1M9Z_A 24.13 32.38 39.90 431.24 34.22 100.74 19.58
2J8B_A 22.71 21.00 24.56 264.71 28.96 105.67 14.13
2VOU_A 36.64 45.43 54.36 937.67 35.35 28.20 22.98
1V5I_B 28.93 20.44 28.45 145.74 31.51 11.70 15.60
2WLV_A 40.35 36.52 41.68 586.89 52.72 31.57 10.40
1F46_A 58.98 43.65 38.99 541.85 36.50 14.21 17.10
1VZI_A 33.33 37.45 42.36 578.66 63.84 39.32 18.86
2ANX_A 49.41 53.04 52.31 744.32 20.30 10.30 14.17
2CMP_A 25.96 25.58 31.02 154.14 19.60 18.92 10.19
2CVI_A 27.21 47.82 37.24 298.96 39.62 39.01 23.82
2D3D_A 27.45 28.78 35.66 306.37 18.22 22.52 15.46
2ERB_A 36.41 36.63 36.23 504.87 17.94 52.38 16.72
2O9S_A 16.01 13.43 15.44 100.61 59.57 14.03 9.57
2PR7_A 40.92 48.59 39.60 857.73 28.04 30.25 24.34
2QCP_X 19.48 19.43 22.94 150.64 25.94 18.24 15.44
2V1Q_A 14.77 19.23 19.72 98.525 20.10 18.11 16.45
2VPB_A 23.28 17.56 22.49 121.38 16.62 65.12 9.10
2VZC_A 40.47 40.82 51.30 660.83 16.20 33.51 20.36
2ZXY_A 31.88 25.72 31.14 312.36 24.89 22.00 21.61
3CTG_A 32.67 31.39 40.23 407.36 27.96 11.53 16.02
3E9T_A 29.23 38.88 39.61 488.97 21.08 28.89 21.48
3FIL_A 18.22 24.39 24.34 126.78 21.55 14.81 17.17
3G21_A 25.86 19.68 18.65 156.69 30.02 17.41 17.60
3G36_A 13.81 14.90 11.31 108.60 21.10 11.57 7.88
3IV4_A 35.41 25.83 31.88 388.37 31.85 26.41 9.82
Mean 35.64 35.30 35.58 498.34 28.71 28.16 17.08
Standard deviation 14.56 14.00 12.58 374.65 12.36 16.88 5.43
Quartile 1 23.59 24.39 24.56 221.59 20.76 18.75 13.47
Median 35.41 34.64 35.98 407.36 26.18 23.49 16.45
Quartile 3 42.20 42.78 42.15 634.46 33.55 30.25 20.36
(a) The Pot statistic test penalizes short-range bunching of amino acids. The E_pot values of the reference and protein-like sequences represent minimal bunching, whereas the bunched sequences represent maximal bunching. The E_pot values of the designed sequences confirm that their bunching is typical of native sequences.
(b) The chi-square test determines whether two sets of categorical data differ significantly. The χ² values indicate that the distribution of the designed sequences differs from the Uniprot database to a degree comparable to that of the reference sequences.
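As an illustration of note (b), the following is a minimal sketch of a Pearson chi-square statistic computed from the residue counts of one sequence against a background composition. It is not the authors' procedure: the function name `chi_square_statistic`, the choice of the 20 standard amino acids as categories, and the uniform background used in the usage lines are all assumptions made for the example. Reproducing the table would require the actual Uniprot frequencies, which are not given here.

```python
from collections import Counter

# Assumption: the categories are the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def chi_square_statistic(sequence, background):
    """Pearson chi-square of observed residue counts vs. expected counts.

    `background` maps each standard amino acid to its expected frequency
    (the frequencies should sum to 1). Non-standard characters in
    `sequence` are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    n = sum(counts.values())
    stat = 0.0
    for aa in AMINO_ACIDS:
        expected = background[aa] * n
        observed = counts.get(aa, 0)
        stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical placeholder background: uniform over the 20 residues.
# This is NOT the Uniprot composition used for the table.
uniform = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

balanced = chi_square_statistic("ACDEFGHIKLMNPQRSTVWY", uniform)
skewed = chi_square_statistic("A" * 20, uniform)
```

A sequence whose composition matches the background gives a statistic near zero, and a strongly skewed composition gives a large one, which is the sense in which smaller χ² values in the table indicate a composition closer to the comparison distribution.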