Understanding the molecular basis of EGFR kinase domain/MIG-6 peptide recognition complex using computational analyses

Background Epidermal growth factor receptor (EGFR) signalling plays a major role in biological processes, including cell proliferation, differentiation and survival. Since the over-expression of EGFR causes human cancers, EGFR is an attractive drug target. A tumor suppressor endogenous protein, MIG-6, is known to suppress EGFR over-expression by binding to the C-lobe of EGFR kinase. Thus, this C-lobe of the EGFR kinase is a potential new target for EGFR kinase activity inhibition. In this study, molecular dynamics (MD) simulations and binding free energy calculations were used to investigate the protein-peptide interactions between EGFR kinase and a 27-residue peptide derived from MIG-6_s1 segment (residues 336–362). Results These 27 residues of MIG-6_s1 were modeled from the published MIG-6 X-ray structure. The binding dynamics were detailed by applying the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) method to predict the binding free energy. Both van der Waals interactions and non-polar solvation were favorable driving forces for binding process. Six residues of EGFR kinase and eight residues of MIG-6_s1 residues were shown to be responsible for interface binding in which we investigated per residue free energy decomposition and the results from the computational alanine scanning approach. These residues also had higher hydrogen bond occupancies than other residues at the binding interface. The results from the aforementioned calculations reasonably agreed with the previous experimental mutagenesis studies. Conclusions Molecular dynamics simulations were used to investigate the interactions of MIG-6_s1 to EGFR kinase domain. Our study provides an insight into such interactions that is useful in guiding the design of novel anticancer therapeutics. The information on our modelled peptide interface with EGFR kinase could be a possible candidate for an EGFR dimerization inhibitor.

Background EGFR, also known as ErbB1 or HER, is a receptor tyrosine kinase that mediates in biological process of normal physiology [1]. An over-expression of EGFR has often been associated with a variety of human cancers that can provide the basic properties required for cancer growth, including cell proliferation, anti-apoptosis, metastasis and angiogenesis [2]. These properties render EGFR an attractive target for cancer therapy. Nonetheless, several reports show that the secondary mutations in EGFR cause anticancer drug resistances [3][4][5], instigating a requirement of efficient treatments to overcome these resistances. Zhang et al. [6] showed that a new target therapy could be done by the suppression of EGFR activation through an allosteric mechanism. This approach was supported by an association of the endogenous negativefeedback-inhibitor protein, called MIG-6, that regulates the level of EGFR activation [7,8]. The crystal structure of the EGFR kinase/MIG-6 complex showed that a segment of MIG-6 (residues 337-361), called MIG-6_s1 interacts with the C-lobe of EGFR kinase domain ( Figure 1) [6]. The binding of MIG-6_s1 to EGFR kinase prevents an asymmetric dimer formation, leading to the inhibition of EGFR activation at micro molar binding affinity level [6]. The detailed analysis of the protein-protein interactions (PPI) at an atomic level of EGFR kinase and MIG-6_s1 is yet limited. Molecular dynamics (MD) simulations and free energy calculation can be used to obtain the knowledge on the dynamics of the structures and the binding energy profiles of the protein complexes in solution [9].
In this study, MD simulations were performed on the EGFR kinase/MIG-6_s1 complex to obtain the knowledge on their binding dynamics and structural changes. The binding free energy calculation by the MM-PBSA method was applied to gain a better understanding on the binding energetics of the complex. The per residue free energy decomposition and alanine scanning methods were used to elucidate the residues important for interface binding and to obtain their energetic contributions. Our results provided a better understanding of the binding interactions between EGFR kinase and MIG-6_s1 peptide. Moreover, such information could be used to improve or develop more efficient anti-cancer drugs.

Model setup and molecular dynamics simulations
The starting structures of EGFR kinase/MIG-6_s1 and the asymmetric kinase dimer complexes were based on the X-ray crystal structure from the protein data bank (PDB). The monomer structures of the EGFR kinase activators (residues 678-959) and MIG-6_s1 (residues 337-362) were used (PDB code: 2RFE) [6], while PDB code: 2GS6 represents the EGFR kinase receiver model. The N-and C-termini of EGFR kinase and MIG-6_s1 were capped with an acetyl group (ACE) and an Nmethyl group (NME), respectively [10]. All simulations were performed using AMBER 12 and the ff03 force field [11]. Each system was solvated using the atomistic TIP3P water model [12] in an octahedron truncated box with a buffer distance of 7 Å. Hydrogen atoms were added, and the counter-ions of Na + or Cl − were used to neutralize the system. To remove bad contacts in the crystal structure, we applied three-step minimization to each system before performing the simulations. Positional restraints were applied to the whole system in the first and the second steps with a force constant of 10 kcal/(mol Å 2 ) and 2 kcal/(mol Å 2 ), respectively. In the third step, all atoms were allowed to move freely without restraints. The energy minimization in each step was carried out using 2,000 cycles of the steepest descent algorithm, followed by 300 cycles of the conjugate gradient algorithm. Each system was heated from 0 to 300K during the 100 ps in an NVT ensemble and was further equilibrated for 500 ps in an NPT ensemble. Then it was simulated for 22 ns in an NPT ensemble at 1 atm and 300K, using the 2 fs integration time step. During the simulations, the particle mesh Ewald (PME) method [13] was used to treat the long-range electrostatics. Moreover, the SHAKE algorithm [14] was applied to Figure 1 Cartoon representation of the crystal structure of the EGFR kinase/MIG-6_s1 complex. The positions of MIG-6_s1 residues were spanned on the EGFR kinase interface. eliminate bond-stretching freedom for all bonds involving hydrogen, thereby allowing the use of a 2 fs time step.
Calculation of binding free energy using the MM-PBSA approach The MM-PBSA approach [15] was used to calculate protein-protein binding free energies. 300 snapshots were extracted from the last 3 ns of the MD trajectories to perform MM-PBSA. The binding free energy of each snapshot was calculated as follows: The free energy (G) for each molecule can be computed from the following equations: ΔE (gas) is the gas-phase energy, which is the sum of the internal (E int ), van der Waals (E vdw ) and Coulomb energies (E ele ) (Eq. 3) . Internal energy (E int ) comprises the bond (E bond ), angle (E angle ) and torsion (E torsion ) energies, respectively (Eq.4). The solvation free energy (G sol, PB ) comes from the combination of polar (G PB ) and non-polar contributions (G nonpol, sol ) (Eq.5). Note that G (PB) is the polar solvation calculated by the Poisson-Boltzmann (PB) equation with the dielectric constants of 1.0 for solute and 80.0 for solvent. The non-polar component (G nonpol, sol ) is defined in Eq. 6, where the constant γ was set to 0.0072 kcal/mol, while SASA stands for the solvent-accessible-surface area calculated by the linear combination of pairwise overlap [16] model. Finally, −TΔS in Eq. 2 is an entropy term calculated from the sum of translational, rotational and vibrational components using normal mode analysis. To gain more insights into the residue contributions, the residues of MIG-6_s1 responsible for the binding to the EGFR kinase surface were computed by using the MM-PBSA per residue free energy decomposition method (mm_pbsa module) of AMBER12. These calculations were based on the same snapshots as those used in the binding free energy calculations.

Computational alanine scanning mutagenesis
To investigate the contributions of an individual residue to the binding interface and to identify the "hot-spot" residues that are very important for interface binding, we used computational alanine scanning to identify these key residues of the EGFR kinase/MIG-6_s1 interface as well as the asymmetric dimer interface. Alanine scanning replaces an original residue with alanine; this substitution eliminates the side-chain but does not introduce large conformational changes to alter the mode of binding [17]. The binding free energy of the wild type and mutant were calculated using a post-processing treatment of the mm_pbsa module and the difference in their binding free energies was computed using the following equation [10,18]: Based on the post-processing energy calculation [19,20], the alanine mutant structure was generated by eliminating the residue side-chain with a hydrogen atom of alanine prior to the calculation. Note that the smallest amino acid, glycine, was not included since it introduces conformational flexibility into the protein backbone. Proline was also not included because its backbone conformation differs from that of alanine [21][22][23]. The 300 snapshots extracted from the last 3 ns of the MD runs were used in these calculations. If the difference in binding free energy between the wild type and the mutant of a residue (ΔΔG binding ) becomes positive, such an alanine substitution is favorable, i.e., this residue is probably not important in the interface binding. However, the negative value of ΔΔG binding indicates the unfavorable alanine substitution, i.e., this residue plays significant role in the interface binding [18]. These key residues were clarified as the residues that significantly decrease the binding free energy for at least 2.0 kcal/ mol. From literatures, residues with strong binding should reduce the binding free energy greater than or equal to 4 kcal/mol [21,24].

Results and discussion
Structural stabilities and flexibilities of the systems To assess the dynamic stabilities of the systems, we performed MD simulations of EGFR kinase and MIG-6_s1 in both unbound and bound states with explicit water for 22 ns at 300K under 1 atm pressure. The equilibration of MD simulations can be observed from the convergence of the root-mean-square deviation (RMSD) plot, based on the backbone (Cα, N and O) atoms of each snapshot as compared to the initial energy minimized structure.
The RMSD plots of three systems: 1) unbound EGFR kinase 2) unbound MIG-6_s1 and 3) EGFR kinase/MIG-6_s1 complex are shown in Figure 2. For the bound system, beside the plot for overall complex (black line), we also plotted the RMSD values for only EGFR kinase (red line) and for only MIG-6_s1 (green line). The RMSD plots of "unbound EGFR kinase", "EGFR kinase/MIG-6_s1 complex" and "only EGFR kinase in complex" reached their plateaus after about 2 ns. The RMSD plots of only MIG-6_s1 in the EGFR complex reached its plateau after about 7 ns. These three systems were equilibrated during the 22 ns simulations.
The conformational changes in the bound and unbound simulations were compared ( Figure 2). Both bound and unbound EGFR kinase RMSD plots revealed similar patterns with average RMSD value of~3.0 Å for unbound EGFR and~2.9 Å for bound. For bound vs. unbound MIG-6_s1, the RMSD plot of bound MIG-6_s1 is more stable than that of unbound state with average RMSD of~4.2 Å for bound MIG-6_s1 and~6.4 Å for the unbound MIG-6_s1. It can be seen that the bound MIG-6_s1 was significantly more stable than the unbound state as seen in the lower RMSD fluctuation. Note that the unbound MIG-6_s1 has generally higher RMSD values because MIG-6_s1 is rather short peptide (27 amino acids) and becomes very flexible in solution. This flexible short peptide phenomenon were also presented in [17,25] where the unbound short peptide p53 could not be stabilized in solution but could be bound stably to the binding cleft of MDM2 protein. Thus, from the RMSD results, our MD simulations are reliable enough for further investigation.
The root-mean-square fluctuations (RMSF) of αC atom from the MD simulated structures against the starting structure were calculated to identify the most flexible regions of the protein-peptide structure. The larger RMSF value conveys more flexible region while the lower RMSF value entails the more constrained region.
The comparison of both unbound EGFR system and EGFR kinase/MIG-6_s1 complex system based on their RMSF profiles ( Figure 3A). The RMSF profile of unbound EGFR kinase system (green) showed similar behavior to that of EGFR complex system (red). On the contrary, the RMSF profile of MIG-6_s1 in the unbound state showed much higher fluctuation than that of its complex state. When comparing the RMSF EGFR kinase profile with the dimer complex profile (purple), we observed distinguishable patterns in several regions of the EGFR interface to MIG-6_s1. Within the EGFR N-lobe region, similar RMSF profiles from all three systems were observed; however the corresponding RMSF profiles were lower than the other regions reflecting lower flexibility.
RMSF patterns and their values in the EGFR C-lobe region differ greatly from that of N-lobe region. Particularly, residues from both EGFR kinase/MIG-6_s1 complex and unbound EGFR kinase, especially in the junction αD-αE, and along the three αG, αH and αI helices revealed high RMSF values (reaching 5-6 Ǻ); while the EGFR kinase in dimer complex shows stable RMSF value~3 Ǻ. We hypothesize that during EGFR dimerization these helices must interact with the juxtamembrane B portion of the EGFR receiver; hence losing the dimerization causes the EGFR activator to have higher RMSF (see unbound EGFR kinase). Furthermore, we observed uncertain fluctuations in the part of Ala840-Gly850, called A-loop, which correlates with the missing residues in the EGFR X-ray crystal structure ( Figure 3B) [26].
Toward to the end of the RMSF plots, the magnitude of unbound MIG-6 plot (green) differed greatly from that of complex of EGFR/MIG-6_s1 plot (red). In particular the RMSF of MIG-6_s1 in EGFR complex (red) was about 5-6 Ǻ, while the RMSF of unbound MIG-6_s1 (green) swung much higher. These evidences suggest that the binding of EGFR kinase could help MIG-6_s1 structure to stabilize in complex.

The conformational changes of the EGFR kinase/MIG-6_s1 complex
Twenty-seven residues of MIG-6_s1 (residue 336-362) can bind to the distal surface of the C-lobe in the EGFR kinase domain to prevent the activation of EGFR kinase by asymmetric dimer (Figure 1). To gain a better understanding on the structural differences between the initial structure and the average simulated structure, the superimposition of the minimized initial structure of the EGFR kinase/MIG-6_s1 complex and its average structure from the 300 snapshots of the last 3-ns simulations are shown in Figure 4. The conformations of the average structure and the minimized initial structure were mostly similar, except for the shift in the loop region of MIG-6_s1. Specifically, Asn343 of MIG-6_s1 at the loop region of the average structure shifted its side-chain up closer to the EGFR kinase surface region than that of the minimized initial structure (Figure 4). This contact was not observed previously in the initial structure possibly due to crystal packing contacts in the flexible loop segment [27].  In terms of hydrophobic interactions, the hydrophobic pocket of the EGFR kinase interface consists of Leu343, Trp881, Thr885 and Pro913, and the side-chain of Met346 of MIG-6_s1 resided stably in this pocket, as shown in Figure 5 (plotted by LIGPLOT program [28]). Since prolines can affect the rigidity of peptide structures, the conformations of prolines were also analyzed in this study. Five prolines of MIG-6_s1 (Pro347, Pro348, Pro354, Pro356 and Pro339) were in cis-configurations and located on the turn. Four of these residues (Pro347, Pro348, Pro354 and Pro356) remained in bent conformation at the interface ( Figure 6) [6]. Prolines on the turn might also form interactions between the proline ring and nearby aromatic residues (CH-π) [29] and help stabilize the MIG-6_s1 complex. Similar to prolines in the SH3 domain, the positions of two prolines were relatively fixed to constrain the structural movements [30].

The calculations of binding free energies
To compute the binding free energies of MIG-6_s1 peptides to EGFR kinase and to gain insights into the peptide-protein binding interactions, the MM-PBSA approach was applied to the 300 snapshots taken from the last 3 ns of the production runs. The calculated results are presented in Table 1. The predicted binding free energy of EGFR kinase/MIG-6_s1 complex was-142.7 kcal/ mol. The major favorable components of the MIG-6_s1 binding were the van der Waals term in gas-phase (ΔE vdw =-120.8 kcal/mol) and the non-polar part of the solvation free energy term (ΔG nonpol, sol =-84.25 kcal/ mol). These two terms resulted in the total favorable non-polar interactions of-205.1 kcal/mol. The highly favorable non-polar part of the free energy might come from the hydrophobic interactions between MIG-6_s1 and EGFR kinase as well as the desolvation of the nonpolar groups of MIG-6_s1 from water that aligned them with the binding interface [10]. Such a phenomenon could be seen in several protein-protein interactions including the interaction between MDM2 and p53, where van der Waals interaction was the major contributor to the inhibitor binding originated from the effects of solutesolvent attractive forces on hydrophobic and the dispersion [10,31].
On the other hand, the favorable columbic term in gas phase (ΔE ele =-130.4 kcal/mol) was completely cancelled by the unfavorable contribution of the polar part of the solvation free energy (G PB = 151.6 kcal/mol), resulting in the total unfavorable electrostatic interactions of 21.2 kcal/mol. This compensation phenomenon was discussed in several studies of protein-protein interactions in solution, including the binding processes of IGF-II/ IGF2R and hGH/GHR, giving the polar interactions (ΔE ele + G PB ) of 31.57 and 134.2 kcal/mol, respectively [10,32]. Moreover, the magnitude of TΔS is in the range of +30 to-40 kcal/mol, being consistent with the previous results [32]. Figure 5 The interactions of Met346 with its nearby residues in the average structure (plotted by LIGPLOT). Met346 is shown in the ball-and-stick representation and its neighbours are shown in brown. Carbon, nitrogen and oxygen atom are shown in black, blue and red, respectively. Dark red "eyelashes" represent hydrophobic contacts between Met346 and its neighbours. Black eyelashes correspond to atoms involved in hydrophobic contacts.

Free energy decompositions of the interfacial residues
The technique of per-residue binding free energy decomposition can reveal the contributions of the key residues responsible for the protein-protein interactions at the interface. The total of 300 snapshots extracted from the last 3 ns of the MD trajectories were decomposed by the MM-PBSA method. The important binding residues of the EGFR kinase/MIG-6_s1 interface were extracted using the residue cut-off at ΔG tot,PB ≤-1.0 kcal/mol (favorable binding) (Figure 7). There were13 important binding residues from MIG-6_s1 and 9 residues from EGFR kinase with binding free energies less than-1 kcal/mol (Figure 8). The 13 residues of MIG-6_s1 comprise Ser337, Leu338, Pro339, Tyr341, Met346, Pro348, Thr349, Gln350, Phe352, Lys357, Tyr358, Val359 and Ser361 whereas the 9 residues of EGFR kinase were Thr885, Glu904, Gly906, Arg908, Pro910, Gln911, Pro913, Met928 and Ile929. Some of these residues, such as Pro910, Thr349, Gln350 and Tyr358, had the values of the binding free energies less than or equal to -4 kcal/mol. Five residues of MIG-6_s1, namely Leu342, Asn343, Ser351, Asp355, and Ser360 and seven residues of EGFR kinase, namely Glu907, Leu909, Pro912, Tyr920, Val924, Trp927 and Gly959, had their binding free energies approximately -1.0 kcal/mol. We further analyzed the binding free energy component of each residue based on the van der Waals energy and the sum of electrostatic contribution in the gas-phase and the polar part of the solvation energy ( Figure 8). The results suggested that the main favorable contribution to the binding free energy was essentially the van der Waals interactions. Normally, the favorable electrostatic energies are opposed by the unfavorable polar    parts of the solvation free energies (desolvation of the polar groups), resulting in unfavorable contributions. However, some residues had slightly favorable electrostatic contributions (the sum of the electrostatic energies and the polar parts of the solvation free energies) including Thr885, Glu904, Pro910, Gln911, Pro912 and Trp927 of EGFR kinase as well as Ser337, Leu338, Ser340, Tyr341, Val345, Thr349, Pro354, Asp355, Lys357, Val359 and Ser361 of MIG-6_s1. Table 2 shows the information on hydrogen bonds between MIG-6_s1 peptide and EGFR kinase during the last 3 ns of the simulations. These are the hydrogen bonds formed between residues Ser337, Tyr341, Asn350, Phe352, Lys357, Tyr358 and Ser360 of MIG-6_s1 and the residues Glu904, Gly906, Arg908, Gln911 and Ile929 of EGFR kinase. Strong, medium and weak hydrogenbond interactions were defined as having simulated hydrogen bond occupancy of > 75%, 50-75%, and < 50%, respectively. The hydrogen bond distance profiles of some of the strong hydrogen bond interactions during the 22 ns of the simulations are shown in Figure 9. Most of these hydrogen bonds were stable throughout the entire simulations ( Figures 9A, B, C, D, E, F, and G), except for hydrogen bonds shown in Figures 9B, I and J. In particular, the hydrogen bond between Tyr341-OH and Gln911-O was briefly disrupted during 13.5-16 ns of the simulations ( Figure 9B). The hydrogen bonds between Asn343-ND2-HD21 and Thr885-O ( Figure 9I) and between Asn343-NH and Gly959-O ( Figure 9J) were not stably formed until around 13 ns and 18 ns, respectively.

Computational alanine scanning technique
Computational alanine scanning is a powerful technique that not only monitors the key residues required for the interactions between the two protein partners at the interface but also computes the energetic contribution to the binding free energy of the individual side-chain based on each alanine mutation [33]. In order to identify the key residues from the 29 residues along the binding interface of the EGFR kinase/MIG-6_s1 complex, we substituted alanine in amino acids having the free energy contribution less than or equal to-1.0 kcal/mol in order to classify them as either "hot spot" or "warm spot" residues. Note that the higher negative value of ΔΔG of an alanine point substitution represents the better binding state of the residue. A residue is classified to be very important for binding to the protein partner (hot spot) [21,24,34], if its ΔΔG ≤-4 kcal/mol. To a lesser extent, the "warm spot" will be classified if-2 kcal/mol ≥ ΔΔG >-4 kcal/mol. In this analysis, proline and glycine residues were not included since their backbone conformations differ from alanine.
The results from the alanine scanning method ( Figure 10A and B) and from the per residue free energy decomposition technique (Figure 8) are largely consistent. As determined by the free energy decomposition technique, most of the important residues were also classified as hot spot residues. We showed that there were six "hot spot" residues on EGFR kinase, namely Glu904, Glu907, Arg908, Gln911, Met928 and Ile929; and eight "hot spot" residues on MIG-6_s1, namely Leu338, Try341, Asn343, Met346, Thr349, Gln350, Phe352 and Try358. For the warm spot classification four residues, Thr885, Ile94, Tyr920 and Val924, were on EGFR kinase ( Figure 10A) while Leu342, Val359, and Ser361 were on MIG-6_s1 ( Figure 10B). However, due to the limitation of alanine scanning, Gly906, Pro910, Pro912, Pro913, Gly959 on EGFR kinase and Pro339, Pro348 on MIG-6_s1 could not be alanine scanned.
Note that by means of per residue free energy decomposition, eight residues (Tyr891, Leu909, Trp927, Ser337, Ser351, Asp355, Lys357 and Ser360) were identified as important residues and not by the alanine scanning approach. Such discrepancies may stem from the differences in the energy cut-offs used in these two methods. Moreover, the alanine scanning method verified only the contributions from the side-chains, while the free energy decomposition analysis considered both the contributions from the side-chains and the backbones. It may be possible that the contributions from the backbones of these residues may be more than those of the side-chains [18]. The common "key" residues identified by both alanine scanning and the free energy decomposition methods, such as Val924, Met346, Phe352 and Try358, are supported by the previous mutagenesis experiment as the alanine replacements of these residues abolished the peptide binding [6]. The distance cut-off was 3.5 Å and angle cut-off was 120°. The interactions of the key residues at the interface of the EGFR kinase/MIG-6_s1 complex The common key-residue results from both the per residue binding free energy decomposition and computational alanine scanning technique predicted the most important residues at the binding interface that include six residues of the EGFR kinase and eight residues of the MIG-6_s1peptide segment ( Figure 10A and B). These residues have their ΔΔG ≤-4 kcal/mol after performing alanine substitution. Both different and consistent interaction profiles between the MD simulations and the starting structures were observed and presented using The key residues of MIG-6_s1. (C) Hydrogen bond formation between interfaces presented in the starting structure and the 3 ns last of simulations. Interfacial hydrogen bonds presented in green line with the occupancies higher than 50% (D). The involving of hydrogen bond and key residues between two interfaces presented in the starting structure and the last 3 ns of simulations. The "hot spot residues", "warm spot", null spot and not studied residues are shown in red, orange, white and gray respectively. Interfacial hydrogen bonds of the starting structure and MD structure presented in dark green and blue bars, respectively.
hydrogen bonding network ( Figure 10C). The resulting relationship does not only exhibit their interfacial residues but also some involvements of bonding in the important residues ( Figure 10D). The MD simulations mostly reveal stable pattern of hydrogen bonds, with the exception for the bonding of Ser351 to Glu907. Interestingly, these hydrogen bonds were also involved in the important residues on the interface. This suggests that the main contributions to the binding interactions were from the hydrogen bonds. Mutagenesis studies of MIG-6_s1 revealed the importance of Met346, Phe352 and Tyr358 for EGFR kinase binding [6]. With the free energy contribution of approximately-2.8 kcal/mol, Met346 is probably the most important residue of MIG-6_s1 that contributes to the hydrophobic interactions at the interface of the EGFR kinase/MIG-6_s1 complex. The side-chain of Met346 was buried in the hydrophobic pocket consisting of Leu343, Trp881, Thr885 (warm spot) and Pro913 of the EGFR kinase (Figures 4 and 5). The mutating Met346 to Ala346 costs approximately-8.1 kcal/mol of the magnitude of the binding free energy, supporting the importance of its side-chain that engages in the pocket of EGFR kinase. These results were consistent with the crystal structure showing Met346 buried in the hydrophobic pocket of the EGFR kinase, and the experimental results that the mutation of Met346 could abolish EGFR binding [6]. Phe352 and Tyr358 of MIG-6_s1 were also identified in this study as the essential residues for the interface binding with the energetic contributions of about-3.4 and-6.2 kcal/mol, respectively. Their backbones formed hydrogen bonds with hot spot residues, Gly906 and Arg908 (hot spot), respectively, and their side-chains were also buried in the hydrophobic cleft ( Figure 11A, B and C). Moreover, the phenyl ring of Phe352 and the phenol ring of Tyr358 formed the paralleled π-π interaction with the distance of 5.0 Ǻ between two centroids (<12.0 Ǻ) ( Figure 11C) [29]. Therefore, the mutations of Phe352Ala and Try358Ala could result in the loss of van der Waals contributions from the hydrophobic interactions of their side-chains [6].
Electrostatic and hydrogen bond interactions were also crucial factors for the binding of MIG-6_s1 and EGFR kinase. Tyr341 of MIG-6_s1 was an important residue for the binding of MIG-6_s1 and EGFR kinase with per residue contribution of-1.88 kcal/mol. During the simulations, Tyr341 formed a hydrogen bond with Gln911 (hot spot) and was also buried in the hydrophobic pocket of Pro339, Pro347, Pro910 and Pro913 ( Figure 11D). Moreover, the value of ΔΔG of the alanine substitution of Tyr341 is about-7.9 kcal/mol, which was influenced by the loss of van der Waals and electrostatic interactions. Another residue important for the binding of this complex is Thr349 of MIG-6_s1. Thr349 formed a hydrogen bond with Glu907 (hot spot) of EGFR kinase ( Figure 11E). The value of ΔΔG of the alanine substitution of Thr349 is about-7.4 kcal/mol, caused mainly by the loss of electrostatic contributions. Moreover, the backbone of Gln350 formed two hydrogen bonds with the backbone of Arg908 (hot spot), while its side-chain formed not only a hydrogen bond with Gln911 (hot spot) but also formed van der Waals interactions with Pro910 and Try920 (warm spot) ( Figure 11F). The value of ΔΔG of the alanine substitution of Gln350 is about-4.3 kcal/mol, where van der Waals and electrostatic contributions were lost upon the side-chain mutation.
During the simulations, not only the new hydrogen bond between Ser361 and Glu904 was formed, but the hydrogen bond involving Asn343 at the loop region of MIG-6_s1 was also observed. The side-chain of MIG-6_s1 moved toward the EGFR kinase surface and formed two hydrogen bonds with Thr885 (warm spot) and Gly959 ( Figure 11G). These hydrogen bonds started to form around 13 ns and 17 ns and became stable later on with the occupancies of 89.6% and 97.7% for Asn343-Thr885 and Asn343-Gly959, respectively ( Figure 9I and J). The ΔΔG values of the alanine substitution of Asn343 was-4.9 kcal/mol, where the main contribution came from the loss of van der Waals interactions due to the increased cavity volume after alanine substitution ( Table 3). The importance of this residue was also supported by the free energy decomposition per residue analysis of the 9-and 18-residue with the values of -0.86 and -1.11 kcal/mol, respectively. This Asn343 residue could play a very important role for a peptide inhibitor to bind to the EGFR C-lobe interface. However, more studies may be needed to further confirm the importance of this residue.
Furthermore, the two MIG-6_s1 prolines (Pro339 and Pro348) and those from EGFR kinase domain (Pro910 and Pro913) could also be essential for the interface binding. They not only gave the free energy contributions of about-2.8,-2.2,-4.55 and-3.40 kcal/mol, respectively (these values were larger than those of the other proline residues), but their free energy contributions were more likely to come from the feature of the side-chains that were-2.34,-2.15,-2.71 and-3.19 kcal/mol more than those of their backbones, respectively. Moreover, the proline rings could also form interaction with the sidechain of Tyr341 and Met346. However, the alanine scanning technique could not be done because the mutation of proline to alanine causes structural perturbations of the backbone [21][22][23].
We also performed alanine scanning on the EGFR kinase binding interface, on ten residues, namely, Thr885, Ile894, Glu904, Glu907, Arg908, Gln911, Tyr920, Val924 Met928 and Ile929. In particular, Thr885 is located within αF of kinase domain and the side-chain pointed to the interface. Thr885 can form hydrogen bonds with Asn343 (warm spot) as well as forming the hydrophobic pocket with side-chain of Met346. However, the loss of ΔΔG (-2.1 kcal/mol) was from the loss of van der Waals and electrostatic interactions but they were not large effects.
The negatively charged residues of EGFR kinase interface, Glu904 (αG) could form hydrogen bonds with the uncharged residues of MIG-6_s1, namely Ser360 and Ser361 (warm spot), while the side-chain of Glu907 (at the loop between αG and -αH) formed hydrogen bond with Thr349 side-chain (hot spot). These Glu904 and Glu907 amino acids were classified as hot spot residues due to the mutation of two glutamic acids resulting with the ΔΔG costs of-4.75 and-4.43 kcal/mol, respectively. The loss of electrostatic contribution was the major reason of such results.
Likewise, the positively charged Arg908 located between αG and αH possibly made strong hydrogen bonds with the backbone of Gln350 (hot spot) and the side-chain of Tyr358 (hot spot). After alanine mutation, Arg908 was identified as a hot spot residue because of its ΔΔG-15.65 kcal/mol that may have been influenced by the increasing cavity volume and the disrupting the conventional hydrogen bonds upon mutating the side-chain. Moreover, the "nitrogen-containing" side chains of Arg908 lining in parallel with the aromatic ring of Tyr358 also formed the cation-π interaction. Although this geometry layout is preferred in protein structures, the mutation of the side-chain could render large favorable electrostatic contribution in the polar solvation [35].
Gln911 (on the loop between αG-αH) was also indicated as a hot spot residue. It disrupted strong hydrogen bonds to Ser337 side-chain upon alanine substitution causing the binding free energy-7.48 kcal/mol, but did not disturb a hydrogen bond of the backbone-side chain between Gln911 and Tyr341. The alignment on αH of Tyr920, Val924 and Met928 showed that they are important to the dimer formation, which were called as "The core of the asymmetric EGFR kinase domain" [36]. These residues not only participate during the hydrophobic interactions that involve residues in the N-lobe of EGFR kinase receiver, but their mutations also abrogated the contribution of the asymmetric dimer [6,36]. These experimental results align with our in silico predictions, these residues were superimposed to those in different structures as shown in the RMSF plots (Figure 3) demonstrating the stability of their backbones. Moreover, we took the EGFR kinase activator interface when binding to MIG-6_s1 and compared with the one when binding to the EGFR kinase receiver. The important of these residues were observed on the above EGFR kinase activator interfaces. After alanine mutations, the loss of van der Waals interaction could originate from the disruption of hydrophobic interaction to their partner proteins. The binding energies of Tyr920, Val924 and Met928 values are-3.65,-2.75, and-5.30 kcal/mol (when binding to MIG-6_s1) and-3.46,-4.98, and-6.07 kcal/mol (when binding to receiver kinase), respectively. The sharing residues between two interfaces of the activator entail that MIG-6_s1 might bind the EGFR kinase activator in the manner similar to a binding of the receiver kinase, even though the function of MIG-6_s1 was opposed to the receiver kinase [6,37].

Conclusions
In this study, the 22 ns MD simulations of EGFR kinase/ MIG-6_s1 complex was performed in order to provide more understanding about the binding of MIG-6_s1 peptide to the C-lobe of EGFR kinase that is an important target in cancer therapy. Using both per residue free energy decomposition and computational alanine scanning techniques, we found that the Asn343 residue situates on a flexible loop of MIG-6_s1. This particular loop was absent in the original X-ray crystal structure. Furthermore, the underlying loop potentially renders Asn343 a key residue to bind to EGFR kinase. This finding requires further experimental validation to better our understanding about Asn343 role in the proteinprotein interaction.
In addition, we used the above energy calculation techniques to identify the hot spot residues (six residues on EGFR kinase and eight residues on MIG-6_s1) that play significant role in the binding between the two proteins. Particularly, the alanine scanning technique showed that the alanine mutations on these hot spot residues would worsen the EGFR kinase/MIG-6_s1 interaction. This technique also revealed the other important residues but to a lesser extent, called warm spot residues that might be worth to further investigate their binding roles. These key amino acids proposed in this work should play vital role for a peptide-based inhibitor to prevent the EGFR asymmetric dimer formation. This in silico study, hence, provides valuable information for designing new peptide-based cancer drugs.