Prediction of hot spot residues at protein-protein interfaces by combining machine learning and energy-based methods

Table 6 Distribution of amino acid types in data set

Mutated amino acid	In data set (Number)	In data set (%)	Hot spots, ΔΔG ≥ 2 kcal/mol (Number)	Hot spots (%)^(a)	Hot spots (%)^(b)	Enrichment in hot spots
Arg	33	9.46	7	21.21	8.64	0.91
Asn	22	6.30	6	27.27	7.41	1.18
Asp	29	8.31	9	31.03	11.11	1.34
Cys	1	0.29	0	0.00	0.00	0.00
Gln	21	6.02	2	9.52	2.47	0.41
Glu	31	8.88	5	16.13	6.17	0.69
His	13	3.72	1	7.69	1.23	0.33
Ile	15	4.30	4	26.67	4.94	1.15
Leu	10	2.87	1	10.00	1.23	0.43
Lys	32	9.17	11	34.38	13.58	1.48
Met	2	0.57	0	0.00	0.00	0.00
Phe	11	3.15	2	18.18	2.47	0.78
Ser	28	8.02	1	3.57	1.23	0.15
Thr	24	6.88	1	4.17	1.23	0.18
Trp	23	6.59	9	39.13	11.11	1.69
Tyr	44	12.61	20	45.45	24.69	1.96
Val	10	2.87	2	20.00	2.47	0.86

The number and percentage of each amino acid type in our data set are shown, together with the number and percentage of hot spots for each type. For a given amino acid type, (%)^(a) is the percentage of residues of that type in the data set that are hot spots (i.e. the entry in column 4 divided by the corresponding entry in column 2); (%)^(b) is the percentage of all hot spots in the data set that are of that type (i.e. the entry in column 4 divided by the sum of all entries in column 4). Enrichment in hot spots is the ratio of the frequency of a given residue type among hot spots (column 6) to the frequency of the same amino acid type in the whole data set (column 3). This table should be compared with Table 2 in [19]. Note that proline and glycine are not included in our data set.
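
The derived columns can be reproduced from the two count columns alone. The following is a minimal sketch, not the authors' code: the names `counts` and `enrichment_table` are illustrative assumptions, and the counts are copied from columns 2 and 4 of the table. Because the ratios are computed from unrounded values, the last digit may occasionally differ from a ratio of the rounded percentages.

```python
# Counts copied from Table 6: residue type -> (number in data set, number of hot spots).
# The dictionary and function names are hypothetical, chosen for illustration only.
counts = {
    "Arg": (33, 7), "Asn": (22, 6), "Asp": (29, 9), "Cys": (1, 0),
    "Gln": (21, 2), "Glu": (31, 5), "His": (13, 1), "Ile": (15, 4),
    "Leu": (10, 1), "Lys": (32, 11), "Met": (2, 0), "Phe": (11, 2),
    "Ser": (28, 1), "Thr": (24, 1), "Trp": (23, 9), "Tyr": (44, 20),
    "Val": (10, 2),
}


def enrichment_table(counts):
    """Return, per residue type, (% in data set, %(a), %(b), enrichment)."""
    total = sum(n for n, _ in counts.values())        # all mutated residues
    total_hot = sum(h for _, h in counts.values())    # all hot spots
    rows = {}
    for aa, (n, h) in counts.items():
        pct_dataset = 100.0 * n / total   # column 3
        pct_a = 100.0 * h / n             # column 5: hot spots within this type
        pct_b = 100.0 * h / total_hot     # column 6: share of all hot spots
        rows[aa] = (pct_dataset, pct_a, pct_b, pct_b / pct_dataset)
    return rows


table = enrichment_table(counts)
for aa, (pct, a, b, enrichment) in table.items():
    print(f"{aa}\t{pct:.2f}\t{a:.2f}\t{b:.2f}\t{enrichment:.2f}")
```

An enrichment above 1 means the residue type is over-represented among hot spots relative to its frequency in the data set; a value below 1 means it is under-represented.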

ISSN: 1471-2105