Volume 10 Supplement 1
A computational analysis of SARS cysteine proteinase-octapeptide substrate interaction: implication for structure and active site binding mechanism
© Phakthanakanok et al; licensee BioMed Central Ltd. 2009
Published: 30 January 2009
SARS coronavirus main proteinase (SARS CoVMpro) is an important enzyme for the replication of Severe Acute Respiratory Syndrome virus. The active site region of SARS CoVMpro is divided into 8 subsites. Understanding the binding mode of SARS CoVMpro with a specific substrate is useful and contributes to structural-based drug design. The purpose of this research is to investigate the binding mode between the SARS CoVMpro and two octapeptides, especially in the region of the S3 subsite, through a molecular docking and molecular dynamics (MD) simulation approach.
The one turn α-helix chain (residues 47–54) of the SARS CoVMpro was directly involved in the induced-fit model of the enzyme-substrate complex. The S3 subsite of the enzyme had a negatively charged region due to the presence of Glu47. During MD simulations, Glu47 of the enzyme was shown to play a key role in electrostatic bonding with the P3Lys of the octapeptide.
MD simulations were carried out on the SARS CoVMpro-octapeptide complex. The hypothesis proposed that Glu47 of SARS CoVMpro is an important residue in the S3 subsite and is involved in binding with P3Lys of the octapeptide.
The human coronavirus is a major cause of respiratory syndrome and in particular of a disease called Severe Acute Respiratory Syndrome (SARS). This disease spread rapidly from China to several countries during 2003. The Severe Acute Respiratory Syndrome Coronavirus Main Proteinase (SARS CoVMpro) is an important enzyme in the life cycle of the human coronavirus. The crystal structure of the SARS CoVMpro complexed with an inhibitor has been solved previously [1–5]. This enzyme is a member of the cysteine proteinases and exhibits the typical catalytic dyad, cysteine and histidine, in the active site [6–12]. Previous studies on substrate specificity indicated that the active site, including the binding site, of the SARS CoVMpro could bind specifically to a substrate containing 8 amino acids. The positions in the octapeptide were termed P5-P4-P3-P2-P1-P1'-P2'-P3', and the sequence was Ser-Ala-Val-Leu-Gln-Ser-Gly-Phe. This sequence is optimal for the proteinase from Transmissible Gastroenteritis Virus (TGEV), but it is unsuitable for SARS CoVMpro. Recently, investigations of substrate specificity against SARS CoVMpro proposed that the octapeptides with sequences of Ser-Ala-Val-Leu-Gln-Ala-Gly-Phe and Thr-Val-Lys-Leu-Gln-Ser-Gly-Phe are optimal for cleavage by SARS CoVMpro. It is interesting to note that changing the P3 position of this latter octapeptide from Val to Lys increased the catalytic efficiency (kcat/Km) of the SARS CoVMpro 4.31-fold. However, the interactions between P3Lys of the octapeptide and the S3 subsite of the SARS CoVMpro remain unclear. Therefore, our research studies the interactions between P3Lys and the S3 subsite in order to identify the amino acids in the S3 subsite that are critical for binding P3Lys of the octapeptide. In addition, the two octapeptides Thr-Val-Arg-Leu-Gln-Ala-Gly-Phe and Thr-Val-Ile-Leu-Gln-Ala-Gly-Phe were used to investigate these interactions.
At the P3 position these octapeptides contained either a long-chain positively charged amino acid or an aliphatic hydrophobic amino acid. They were chosen to test whether P3Lys is a significant amino acid for binding in the S3 subsite of the SARS CoVMpro. Molecular modeling techniques can be used to clarify this problem [14, 15]. In the present paper, we present 2 ns conventional MD simulations of the SARS CoVMpro complexed with the octapeptides and compare them with the uncomplexed SARS CoVMpro, with the purpose of investigating the amino acids in each subsite, and especially S3, that are crucial to increasing substrate binding. The results obtained from this research therefore contribute to the understanding of the binding mechanism and catalytic mechanism of the SARS CoVMpro.
Visualization and computational
All steps in this research were performed in silico using molecular modeling software. The three-dimensional structures were prepared with Insight II (version 2001) from Accelrys. Molecular docking and molecular dynamics (MD) simulations used Autodock 3.0.5, AutodockTools and GROMACS 3.3.1 [17–19]. The results of molecular docking and MD simulations were analyzed using Discovery Studio 2.0.1. All of the calculations were performed on a 48 processor Itanium cluster at the National Electronic and Computer Technology Center (NECTEC), Thailand.
Preparation of the structures of the SARS CoVMpro and the octapeptide
The structure of the SARS CoVMpro was taken from the Brookhaven protein data bank using the PDB code 1UK4. The structure was checked and the missing atoms were replaced using the Builder module of Insight II. The polar hydrogen atoms were added, followed by energy minimization in vacuo with the steepest descent method for 1,000 steps. The final structure obtained from energy minimization was used in all further steps. The structures of the octapeptides were prepared based on the research of Fan et al. Three octapeptides were constructed, including hydrogen atoms, by using the Builder module and the following sequences: Thr-Val-Lys-Leu-Gln-Ala-Gly-Phe, Thr-Val-Arg-Leu-Gln-Ala-Gly-Phe and Thr-Val-Ile-Leu-Gln-Ala-Gly-Phe. These structures differed only at the P3 position. All of the structures were subjected to energy minimization in the same manner as that for SARS CoVMpro. The structures obtained from energy minimization were used for the docking studies.
For this research, the structure of the SARS CoVMpro was prepared in the catalytically competent conformation. It is believed that this structure represents the natural form and is ready for substrate catalysis [14, 15]. First, the structure of the SARS CoVMpro obtained from the energy minimization was subjected to a preliminary MD simulation. The GROMOS96 43a1 force field was applied to the enzyme, with explicit hydrogen atoms in the aromatic rings. The simulation cell was a cubic periodic box with a minimum distance of 0.9 Å between the protein and the box walls. The enzyme was solvated with approximately 40,000 water molecules described by the simple point charge (SPC) water model. The Glu and Asp residues of the SARS CoVMpro were assigned a charge of -1, and Lys was assigned a charge of +1. His had an added hydrogen atom at the B position. As the total charge of the system was -6, six Na+ ions were added to neutralize the system. Electrostatic interactions between charged groups at a distance of less than 3 Å were calculated explicitly. Energy minimization was performed using 1,000 steps of the steepest descent method. Long-range electrostatic interactions were calculated using the Particle-Mesh Ewald (PME) method with a grid width of 1.2 Å and fourth-order spline interpolation. A cutoff distance of 9 Å was applied for Lennard-Jones interactions. To maintain the system at a constant temperature of 300 K, a Berendsen thermostat was applied with a coupling time of 0.1 ps. Pressure was held at 1 bar, with a coupling time of 1 ps. The time step was set to 2 fs and the simulation was run for 100 ps. The structure of the SARS CoVMpro obtained at the last time point (100 ps) was selected for the molecular docking studies.
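The effect of the thermostat settings quoted above can be illustrated with the standard Berendsen weak-coupling formula, in which velocities are rescaled at each step by λ = sqrt(1 + (Δt/τ)(T0/T − 1)). The following is only a minimal sketch, not the GROMACS implementation; the function name and the example temperatures of 290 K and 310 K are assumptions made for illustration, while Δt = 2 fs, τ = 0.1 ps and T0 = 300 K are taken from the text.

```python
import math


def berendsen_scale(current_temp_k, target_temp_k=300.0, dt_ps=0.002, tau_ps=0.1):
    """Velocity scaling factor for one step of Berendsen weak coupling.

    Defaults follow the text: 300 K target, 2 fs time step, 0.1 ps coupling time.
    """
    if current_temp_k <= 0:
        raise ValueError("temperature must be positive")
    ratio = target_temp_k / current_temp_k
    return math.sqrt(1.0 + (dt_ps / tau_ps) * (ratio - 1.0))


# Hypothetical instantaneous temperatures, chosen only for illustration.
hot = berendsen_scale(310.0)        # slightly below 1: velocities are damped
cold = berendsen_scale(290.0)       # slightly above 1: velocities are boosted
on_target = berendsen_scale(300.0)  # exactly 1: no rescaling needed

print(f"310 K -> {hot:.6f}, 290 K -> {cold:.6f}, 300 K -> {on_target:.6f}")
```

With Δt/τ = 0.02 the correction applied in any single step is small, which is why the coupling relaxes the temperature gradually rather than resetting it.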
Molecular docking studies
In the molecular docking, the interactions between the SARS CoVMpro and the three octapeptides were investigated. The structure of the enzyme obtained from the final step of structure refinement and the structures of the octapeptides obtained from energy minimization were used in all steps. The calculations employed a Lamarckian Genetic Algorithm (LGA) with partially flexible rotatable bonds. Each octapeptide was defined with 19 rotatable bonds, situated along the backbone, while the positions of the side chains were fixed. The grid for the Autogrid calculations was set to 80 × 80 × 80 Å, with the sulfur atom of SARS CoVMpro Cys145 assigned as the center of the grid box. The docking parameters were set to 1,000 LGA runs, with 1,500,000 energy evaluations and 27,000 generations. The population size was set to 100, and the rates of gene mutation and gene crossover were set to 0.02 and 0.8, respectively. At the end of the calculations, the obtained conformations were summarized, collected and extracted using AutodockTools. The docked conformations were clustered with a Root Mean Square Deviation (RMSD) tolerance of 2.0 Å. The cluster with the largest number of members was selected, and the conformation with the lowest docking energy in that cluster was taken for analysis and further MD simulation.
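The pose-selection rule described above (cluster at 2.0 Å RMSD, take the most populated cluster, then its lowest-energy member) can be sketched as follows. This is a simplified greedy clustering written for illustration and is not the AutodockTools code; the function names and the toy two-atom poses are assumptions, and RMSD is computed without superposition because all docked poses share the receptor frame.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples, no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def cluster_poses(poses, tolerance=2.0):
    """Greedy clustering of (energy, coords) poses.

    Poses are visited from lowest to highest energy; each joins the first
    cluster whose seed (its lowest-energy member) lies within the tolerance.
    """
    clusters = []
    for pose in sorted(poses, key=lambda item: item[0]):
        for cluster in clusters:
            if rmsd(pose[1], cluster[0][1]) <= tolerance:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters


def select_pose(poses, tolerance=2.0):
    """Lowest-energy pose of the most populated cluster."""
    largest = max(cluster_poses(poses, tolerance), key=len)
    return largest[0]


# Toy two-atom poses with made-up energies (kcal/mol), for illustration only.
poses = [
    (-9.0, [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]),  # isolated pose
    (-8.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-7.5, [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]),
    (-7.0, [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]),
]
best_energy, best_coords = select_pose(poses)
print(best_energy)
```

In this toy set the overall lowest-energy pose sits alone in its cluster, so the rule returns the best member of the three-pose cluster instead, which is the behavior the selection criterion in the text implies.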
Molecular dynamics simulation
For the molecular dynamics simulations, the three structures of the SARS CoVMpro complexed with the octapeptides obtained from the docking were prepared. The simulation parameters were the same as in the structure refinement step described above. However, one Na+ ion was also added to each of the systems containing the P3Lys and P3Arg octapeptides in order to neutralize the charge of Lys and Arg. The time step was set to 2 fs and each whole system was simulated for 2,000 ps (2 ns). The structures of the enzyme-substrate complexes were extracted every 1 ps for analysis.
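As a quick arithmetic check of the run lengths quoted in the text (100 ps for refinement, 2,000 ps for production, a 2 fs time step and output every 1 ps), the corresponding step and frame counts can be computed as below. The helper name is hypothetical, and the frame count excludes the starting structure at t = 0.

```python
def md_steps(total_ps, dt_fs):
    """Number of integration steps for total_ps picoseconds at dt_fs femtoseconds per step."""
    total_fs = total_ps * 1000
    if total_fs % dt_fs != 0:
        raise ValueError("run length is not a whole number of time steps")
    return total_fs // dt_fs


refinement_steps = md_steps(100, 2)    # 100 ps refinement run
production_steps = md_steps(2000, 2)   # 2,000 ps production run
steps_per_frame = md_steps(1, 2)       # one frame saved every 1 ps
frames_saved = production_steps // steps_per_frame

print(refinement_steps, production_steps, steps_per_frame, frames_saved)
```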
The results of simulations with each structure were compared against each other. Values of the root mean square deviation (RMSD) and the root mean square fluctuation (RMSF) were monitored during the simulations. These values were the criteria used to describe the motion of the protein chains, including the amino acids, in the active site of the SARS CoVMpro. The atomic distances and number of hydrogen bonds were also investigated to explain the interaction between the SARS CoVMpro and the octapeptide in the enzyme-substrate complex.
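The two quantities monitored here can be written down compactly. The sketch below computes the RMSD of a frame against a reference and the per-atom RMSF over a trajectory; it assumes the frames have already been superposed on the reference, which the analysis tools would normally handle, and the function names and the three-frame toy trajectory are made up for illustration.

```python
import math


def rmsd(frame, reference):
    """RMSD between a frame and a reference, both lists of (x, y, z) tuples."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("frames must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom, ref_atom in zip(frame, reference)
        for p, q in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom RMSF: fluctuation of each atom about its time-averaged position."""
    n_frames = len(trajectory)
    if n_frames == 0:
        raise ValueError("trajectory is empty")
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 is fixed, atom 1 oscillates along x (arbitrary units).
trajectory = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
fluctuations = rmsf(trajectory)
drift = rmsd(trajectory[2], trajectory[0])
print(fluctuations, drift)
```

RMSD tracks how far the whole structure has moved from its starting point at each time, whereas RMSF identifies which residues are mobile, which is why the two are used together.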
Results and discussion
Catalytically competent conformation of the SARS CoVMpro
The active site of the SARS CoVMpro was divided into 8 subsites. The subsites, namely S5-S4-S3-S2-S1-S1'-S2'-S3', accommodated the corresponding peptide residues P5-P4-P3-P2-P1-P1'-P2'-P3', respectively. It is known that the S2 subsite is a hydrophobic pocket located near the catalytic dyad between domains I and II. The S1 subsite is a deep hydrophilic pocket located close to Cys145, in part of domain II.
The docking energy of the three octapeptides was calculated. For each substrate, the conformation with the lowest docking energy from the best cluster (the cluster with the most members) was selected. The docking energy was used as a criterion to judge the specificity of the substrate: a lower docking energy was taken to indicate a higher specificity. For comparison, the energies obtained from docking of each octapeptide are listed in Table 1. The octapeptide P3Lys had an energy of -14.23 kcal/mol, the lowest docking energy, so it was the most specific. It was followed by P3Arg and P3Ile, which had docking energies of -13.26 and -8.11 kcal/mol, respectively. Hence, the octapeptide P3Ile had the lowest specificity.
Table 1. The docking energy. The energies obtained from the interactions between the SARS CoVMpro and the octapeptide substrates.
Octapeptide: Docking energy (kcal/mol)
P3Lys: -14.23
P3Arg: -13.26
P3Ile: -8.11
MD simulations of the interactions between the SARS CoVMpro and the octapeptide in the form of an enzyme-substrate complex
The system reached equilibrium at about 100 ps. The average value of the total energy was determined between 100 and 200 ps. The energies of the complexes with octapeptides P3Lys, P3Arg and P3Ile were -104521, -104482 and -104216 kcal/mol, respectively. Thus the complex with octapeptide P3Lys had the lowest energy, whereas the complex with P3Ile had the highest. In contrast, the total energy of the free enzyme was unstable during the first 1500 ps. The values changed distinctly in the first 100 ps and then remained stable up to 500 ps. After this time, the values changed again, first increasing and then decreasing until 1500 ps, and reached an equilibrium level only at nearly 2000 ps. From these results, we suggest that the variation of the total energy in the free enzyme is caused by movement of the flexible chains, the one turn α-helix and the long loop, in the active site of the SARS CoVMpro. In the complexes, on the other hand, the octapeptide bound in the active site contributed to maintaining the conformation of the enzyme. To examine this further, the motion of the octapeptides was also investigated.
The results suggest that the variability in the first 100 ps was due to the structure of the octapeptides adapting, after which P3Lys fitted closely into the active site cleft of the SARS CoVMpro. After this time, the octapeptides P3Arg and P3Ile were still not in the correct conformation for the active site cleft and they moved about 0.3–0.35 nm away. However, this movement probably involved only some of the residues of the octapeptide, because the RMSD values stabilized after 500 ps. This indicated that some residues of P3Arg and P3Ile remained in the active site cleft.
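The windowed average used for the total energy (the mean over 100 to 200 ps) amounts to the calculation sketched below. The function name is hypothetical, and the time series is invented and far coarser than a real energy file; it only shows how the window is applied.

```python
def window_mean(times_ps, values, start_ps, end_ps):
    """Mean of the values whose time stamps fall within [start_ps, end_ps]."""
    selected = [v for t, v in zip(times_ps, values) if start_ps <= t <= end_ps]
    if not selected:
        raise ValueError("no samples fall inside the requested window")
    return sum(selected) / len(selected)


# Invented time series (time in ps, energy in arbitrary units), for illustration only.
times_ps = [0, 50, 100, 150, 200, 250]
energies = [-100.0, -103.0, -104.0, -105.0, -106.0, -104.5]

mean_energy = window_mean(times_ps, energies, 100, 200)
print(mean_energy)
```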
Movement investigation of the amino acids involved with the interactions between the SARS CoVMpro and the octapeptide
The RMSF of the enzyme and the octapeptide was also investigated. This criterion describes which residues were moving and which were fixed during the simulation. The values were calculated by averaging the movement over the equilibrium period, from 500 to 2000 ps of simulation. The results shown in Figure 5 demonstrate that the fluctuations of the amino acids in the active site (highlighted residue numbers) of the enzyme complexed with P3Lys (Figure 5A) were lower than in the other octapeptide complexes or the free form. The RMSF values in the enzyme complexed with P3Arg were also low, but still greater than those seen for P3Lys (Figure 5B). In contrast, the RMSF values in the complex with P3Ile (Figure 5C) remained high. However, the value for the long loop chain (right highlight) was consistent across all structures. The results also indicated that in the free enzyme (Figure 5D) the one turn α-helix moved during the simulation, whereas in the enzyme complexed with octapeptide P3Lys the one turn α-helix chain maintained its position. The MD simulations showed that the octapeptides in which P3 is either Lys or Arg did not fall out of the active site of SARS CoVMpro during the simulation. In the first period of the MD simulations, the side chains of these P3 residues pointed out to the solvent; after the observed conformational changes they pointed upward to Glu47, and not downward to Glu166. The results also showed that, once the MD simulations had equilibrated, an electrostatic interaction formed between S3Glu47 and P3Lys or P3Arg at a distance of 3.5–5.0 Å.
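A distance of this kind is typically followed frame by frame. The sketch below takes the minimum distance between the two carboxylate oxygens of Glu47 and the side-chain nitrogen of P3Lys in each frame and reports the fraction of frames that fall inside the 3.5–5.0 Å window quoted above. The atom selection, the function names and all coordinates are assumptions made for illustration; they are not taken from the simulations.

```python
import math


def min_distance(group_a, group_b):
    """Smallest pairwise distance between two groups of (x, y, z) points."""
    return min(math.dist(a, b) for a in group_a for b in group_b)


def fraction_in_range(distances, low=3.5, high=5.0):
    """Fraction of frames whose distance lies within [low, high] (in Å)."""
    if not distances:
        raise ValueError("no distances supplied")
    return sum(low <= d <= high for d in distances) / len(distances)


# Made-up coordinates in Å: two Glu carboxylate oxygens held fixed, Lys nitrogen moving.
glu_oxygens = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
lys_nitrogen_by_frame = [
    (0.0, 8.0, 0.0),  # side chain pointing out to solvent
    (0.0, 4.0, 0.0),
    (2.2, 3.6, 0.0),
    (1.1, 4.5, 0.0),
]

distances = [min_distance(glu_oxygens, [nz]) for nz in lys_nitrogen_by_frame]
contact_fraction = fraction_in_range(distances)
print([round(d, 2) for d in distances], contact_fraction)
```

Note that GROMACS reports distances in nm, so values taken from its output would need to be multiplied by 10 before being compared with thresholds given in Å.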
The hydrogen bonds of the enzyme-substrate complex
The MD simulation results and the number of hydrogen bonds formed in the SARS CoVMpro-octapeptide complex have been obtained. These results indicate clearly that the S3 subsite of the SARS CoVMpro has a negative character. The hypothesis proposed that Glu47 is an important residue in the S3 subsite for binding with P3Lys of the octapeptide. The electrostatic interactions between Glu47 and P3Lys play a key role in specific binding. These observations are very important and provide further information for structural-based drug design against SARS virus.
List of abbreviations used
- GROMACS: Groningen Machine for Chemical Simulations
- PDB: Protein Data Bank
- RMSD: Root Mean Square Deviation
- RMSF: Root Mean Square Fluctuation
- SARS CoVMpro: SARS Coronavirus Main Proteinase
- SPC: Simple Point Charges
- TGEV: Transmissible Gastroenteritis Virus
- VMD: Visual Molecular Dynamics
This research was supported by the Thailand Research Fund under a grant of the Royal Golden Jubilee Ph.D. program to K. P. (Grant No. PHD/0172/2547) and a grant from the medicinal chemistry project (Basic research grant number DBG4880009). We would also like to thank Dr. Kiattawee Choowongkomon from Department of Biochemistry, Faculty of Science, Kasetsart University for providing the GROMACS methodology and Mr. Patarapon Juntrapon, Ms. Manussada Ratanasak from Thai Equipment Research Co., Ltd. for providing the Accelrys Discovery Studio 2.0.1 program.
This article has been published as part of BMC Bioinformatics Volume 10 Supplement 1, 2009: Proceedings of The Seventh Asia Pacific Bioinformatics Conference (APBC) 2009. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/10?issue=S1
- Lee N, Hui D, Wu A, Chan P, Cameron P, Joynt GM, Ahuja A: A major outbreak of severe acute respiratory syndrome in Hong Kong. N Engl J Med 2003, 348: 1986–94. 10.1056/NEJMoa030685
- Leng Q, Bentwich Z: A novel coronavirus and SARS. N Engl J Med 2003, 349: 709. 10.1056/NEJMc031427
- Marra MA, Jones SJ, Astell CR, Holt RA, Brooks-Wilson A, Butterfield YS, Khattra J: The Genome sequence of the SARS-associated coronavirus. Science 2003, 300: 1399–1404. 10.1126/science.1085953
- Poutanen SM, Low DE, Henry B, Finkelstein S, Rose D, Green K, Tellier R: Identification of severe acute respiratory syndrome in Canada. N Engl J Med 2003, 348: 1995–2005. 10.1056/NEJMoa030634
- Rota PA, Oberste MS, Monroe SS, Nix WA, Campagnoli R, Icenogle JP, Penaranda S: Characterization of a novel coronavirus associated with severe acute respiratory syndrome. Science 2003, 300: 1394–99. 10.1126/science.1085952
- Thiel V, Ivanov KA, Putics A, Hertzig T, Schelle B, Bayer S, Weissbrich B: Mechanisms and enzymes involved in SARS coronavirus genome expression. J Gen Virol 2003, 84: 2305–15. 10.1099/vir.0.19424-0
- Anand K, Ziebuhr J, Wadhwani P, Mesters JR, Hilgenfeld R: Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs. Science 2003, 300: 1763–67. 10.1126/science.1085658
- Wu CY, Jan JT, Ma SH, Kuo CJ, Juan HF, Cheng YS, Hsu H: Small molecules targeting severe acute respiratory syndrome human coronavirus. Proc Natl Acad Sci USA 2004, 101: 10012–17. 10.1073/pnas.0403596101
- Hegyi A, Ziebuhr J: Conservation of substrate specificities among coronavirus main proteases. J Gen Virol 2002, 83: 595–99.
- Yang H, Yang M, Ding Y, Liu Y, Lou Z, Zhou Z, Sun L: The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor. Proc Natl Acad Sci USA 2003, 100: 13190–95. 10.1073/pnas.1835675100
- Huang C, Wei P, Fan K, Liu Y, Lai L: 3C-like Proteinase from SARS Coronavirus Catalyzed Substrate Hydrolysis by a General Base Mechanism. Biochemistry 2004, 43: 4568–74. 10.1021/bi036022q
- Shan YF, Xu GJ: Study on substrate specificity at subsites for severe acute respiratory syndrome coronavirus 3CL proteinase. Acta Biochim Biophys Sin 2005, 37: 807–13. 10.1111/j.1745-7270.2005.00114.x
- Fan K, Ma L, Han X, Liang H, Wei P, Liu Y, Lai L: The substrate specificity of SARS coronavirus 3C-like proteinase. Biochem Biophys Res Commun 2005, 329: 934–40. 10.1016/j.bbrc.2005.02.061
- Sanghiran Lee V, Wittayanarakul K, Remsungnen T, Parasuk V, Sompornpisut P, Chantratita W, Sangma C: Structure and dynamics of SARS coronavirus proteinase: The primary key to the designing and screening for anti-SARS drugs. Science Asia 2003, 29: 181–88. 10.2306/scienceasia1513-1874.2003.29.181
- Liu HL, Lin JC, Ho Y, Hsieh WC, Chen CW, Su YC: Molecular dynamics simulations of various coronavirus main proteinases. J Biomol Struct Dyn 2004, 22: 65–77.
- Insight II and Discovery studio [http://accelrys.com/products/insight]
- Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, Olson AJ: Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J Comput Chem 1998, 19: 1639–62. 10.1002/(SICI)1096-987X(19981115)19:14<1639::AID-JCC10>3.0.CO;2-B
- Lindahl E, Hess B, Spoel D: GROMACS: a package for molecular simulation and trajectory analysis. J Mol Model 2001, 7: 306–17.
- Chou KC, Wei DQ, Zhong WZ: Binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against SARS. Biochem Biophys Res Commun 2003, 308: 148–151. 10.1016/S0006-291X(03)01342-1
- Du Q, Wang S, Wei D, Sirois S, Chou KC: Molecular modeling and chemical modification for finding peptide inhibitor against severe acute respiratory syndrome coronavirus main proteinase. Anal Biochem 2005, 337: 262–70. 10.1016/j.ab.2004.10.003
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.