Modelling the structures of G protein-coupled receptors aided by three-dimensional validation
BMC Bioinformatics volume 9, Article number: S14 (2008)
G protein-coupled receptors (GPCRs) are abundant, activate complex signalling and represent the targets for up to ~60% of pharmaceuticals but there is a paucity of structural data. Bovine rhodopsin is the first GPCR for which high-resolution structures have been completed but significant variations in structure are likely to exist among the GPCRs. Because of this, considerable effort has been expended on developing in silico tools for refining structures of individual GPCRs. We have developed REPIMPS, a modification of the inverse-folding software Profiles-3D, to assess and predict the rotational orientation and vertical position of helices within the helix bundle of individual GPCRs. We highlight the value of the method by applying it to the Baldwin GPCR template but the method can, in principle, be applied to any low- or high-resolution membrane protein template or structure.
3D models were built for transmembrane helical segments of 493 GPCRs based on the Baldwin template, and the models were then scored using REPIMPS and Profiles-3D. The compatibility scores increased significantly using REPIMPS because it takes into account the physicochemical properties of the (lipid) environment surrounding the helix bundle. The arrangement of helices in the helix bundle of the 493 models was then altered systematically by rotating the individual helices. For most GPCRs in the set, changes in the rotational position of one or more helices resulted in significant improvement in the compatibility scores. In particular, for most GPCRs, a rotation of helix VII by 240–300° resulted in improved scores. Bovine rhodopsin modelled using this method showed 3.31 Å RMSD to its crystal structure for 198 Cα atom pairs, suggesting the utility of the method even when starting with idealised structures such as the Baldwin template.
We have developed an in silico tool which can be used to test the validity of, and refine, models of GPCRs with respect to helix rotation and vertical position based on the physicochemical properties of amino acids and the surrounding environment. The method can be applied to any multi-pass membrane protein and potentially can be used in combination with other high-throughput methodologies to generate and refine models of membrane proteins.
G protein-coupled receptors (GPCRs) are a family of integral-membrane proteins (IMPs) that transduce chemical and optical signals through the cell membrane, leading to the activation of G proteins, which in turn trigger a wide range of biological events. GPCRs share a conserved structure consisting of seven transmembrane α-helices, as determined by a variety of methodologies including electron cryo-microscopy [3, 4] and X-ray diffraction [5, 6]. Detailed knowledge of GPCR structure is of interest in part because these receptors are prime targets for therapeutic agents. A number of web sites provide theoretical models and other information on GPCRs. For example, at GPCRDB (http://www.gpcr.org/7tm/), diverse data on GPCRs, including close to 2000 structural models, have been collected and organized.
Several modelling approaches have been used to construct three-dimensional models of GPCRs and can be classified broadly into two categories: those using structural templates [9–11] and those using de novo approaches [12–15].
Low- and high-resolution structures of bacteriorhodopsin (BR) have been used as templates for modeling the structures of GPCRs because of the seven transmembrane regions and the similarity of mechanism of activation of BR to that of rhodopsin. However, there are many assumptions inherent in the use of BR as a template. For example, although BR and rhodopsin are activated by light, BR functions as a proton pump [16, 17] whilst rhodopsin is coupled to a G protein . The sequence identity between BR and rhodopsin is low (12.8%) and a comparison of the high-resolution structures of BR and rhodopsin reveals different helix bundle arrangements . Modelling of GPCRs based on alignment with the structure of BR may, therefore, be error prone . Nevertheless, many 3D models of GPCRs have been generated based on the structure of BR, such as those of the receptors for dopamine, adrenalin, serotonin, acetylcholine [21–23], vasopressin V2 , opioids [13, 24], guanine nucleotide-binding regulatory protein , human thromboxane A2 , 5-HT2B  and galanin [10, 28].
The crystal structure of bovine rhodopsin, solved to a resolution of 2.8 Å, represents the first high-resolution structure of a GPCR [5, 6]. Since then, this crystal structure has been used as a template for modelling other GPCRs  on the basis that the structure of rhodopsin represents a consensus template.
Lower-resolution templates have also been used to model GPCRs. The template developed by Baldwin et al. based on the electron density map of frog rhodopsin [3, 4] includes the Cα positions of the 7 transmembrane helices as well as their extensions beyond the membrane on both sides. The sequences of 493 GPCRs were then examined using a consensus approach, based on residue conservation and hydrophobicity analysis of amino acids, and projected into the plane of the membrane to postulate several structural features of the family, including the location of the transmembrane segments within a sequence, transmembrane lengths and extensions beyond the membrane, and orientations of the helices with respect to one another. Strahs and Weinstein have modelled opioid receptors using comparative and molecular dynamics studies in which the transmembrane helix bundles were assembled on this Baldwin template . Luteinizing hormone , α1b-adrenergic  and type one thyrotropin-releasing hormone  receptors are among other GPCR models based on the Baldwin template. Rubenstein et al. studied the mechanism of activation for β2-adrenergic receptor using molecular dynamics techniques and a biophysical model based on the Baldwin template . The template continues to be useful as a starting point for modeling GPCRs even with the release of the crystal structure of bovine rhodopsin (see  and references cited therein).
However, no single template appears to be appropriate for modeling the structures of all GPCRs. For example, the use of the high-resolution structure of rhodopsin as a template has recently been questioned: the model of CCK1 receptor built based on the rhodopsin structure was unable to reproduce the experimentally observed interactions between the ligand (CCK) and the receptor model in docking approaches . Similarly, for the Baldwin template, 'conserved' residues for a particular GPCR are not always present and often there is no obvious cluster of hydrophobic residues on one side of the helices to help locate them either vertically with respect to the membrane or with respect to rotation. Thus, the available data suggest that the organization of the transmembrane components of GPCRs is dictated by more considerations than contained in the currently available templates, regardless of the resolution. This is not surprising, given that sequence conservation among GPCRs can be low, they have adapted to bind a large range of ligand types and sizes, and nonidealities in the structure of transmembrane segments such as kinks, unwindings and tightenings are likely to be, in many cases, GPCR-specific.
Several inverse-folding methodologies have been developed to model the three-dimensional structures of proteins. These methods are based on physicochemical, as opposed to sequence homology, considerations and use potential functions frequently involving pairwise amino-acid interaction, solvent exposure, and local secondary structure. Based on these criteria, the probability of finding specific residues in a particular class of environment can be estimated. The string of residues of the protein is thus converted to a string of environment classes from which compatible structures can be generated.
Reverse-environment prediction of IMP structure (REPIMPS)  is a modification of the Profiles-3D application, an inverse-folding methodology appropriate for water-soluble proteins [37, 38]. The modification accounts for the fact that sidechains of many residues in IMPs are in contact with lipid rather than water. The correction ensures that lipid-exposed residues are appropriately classified with respect to their physicochemical environment. As a result, compatibility scores calculated using REPIMPS for IMPs whose structures have been solved improve significantly over those calculated using Profiles-3D, and there is a reduced possibility of rejecting a 3D model of an IMP because the presence of a lipid environment was not included . REPIMPS has been used to locate the transmembrane segment in IMPs with a single transmembrane domain, has the potential to locate transmembrane segments in IMPs with multiple transmembrane domains, and can be used to assess if transmembrane segments are appropriately oriented with respect to the lipid environment and surrounding transmembrane domains .
We highlight the value of the REPIMPS method by applying it to models of GPCRs generated from an idealised template, the Baldwin template, to test the validity of, and refine, the models with respect to helix rotation and vertical position. The method can, in principle, be applied to any low- or high-resolution GPCR template or structure, or to any multi-pass membrane protein, and potentially can be used in combination with other high-throughput methodologies to generate and refine models of IMPs.
Large-scale comparative modelling of GPCRs based on the Baldwin template and calculation of lipid-corrected compatibility scores and CAD values
Three-dimensional models were built for the 493 GPCRs in the database used by Baldwin et al. , which contains the coordinates of the Cα atoms predicted to be part of the transmembrane segments and their helical continuations at both sides of the membrane. Side-chain positions were refined as outlined in Methods.
For the 493 GPCR models, compatibility scores were calculated using Profiles-3D, which assumes an aqueous environment (Figure 1A). The compatibility scores were also calculated using REPIMPS, which assumes that atoms in contact with the membrane are in a hydrophobic environment (Figure 1A). The average lipid-corrected compatibility score using REPIMPS was 94 compared to an average score of 52 calculated using Profiles-3D. The level of improvement was not the same for all models. Figure 1B shows the distribution of the improvement in the compatibility scores for individual GPCRs calculated using REPIMPS versus the value calculated using Profiles-3D. Scores were also compared for individual helices as part of the whole model (Fig. 1C). For each of helices I–VII of the 493 GPCR models, the mean lipid-corrected compatibility scores, as calculated by REPIMPS, were significantly higher (p < 0.001) than the mean scores calculated using Profiles-3D, as determined by a paired t test.
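The paired comparison can be illustrated with a minimal sketch, assuming the per-model scores for one helix are held in two equal-length lists. The function name and the four score pairs below are hypothetical and are not data from this study. Only the t statistic and degrees of freedom are computed; a p-value would additionally require the t distribution.

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for a paired t test."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # The test is carried out on the per-model differences
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical scores for one helix in four models (illustrative only)
repimps_scores = [14.0, 12.5, 15.0, 13.5]
profiles3d_scores = [8.0, 7.5, 8.0, 9.5]
t_stat, dof = paired_t(repimps_scores, profiles3d_scores)
```

Pairing matters here because both scores are computed on the same model, so the model-to-model variation cancels in the differences.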
In order to evaluate the arrangement of the helices in the helix bundle of the 493 models generated based on the Baldwin template, the model structures were altered systematically by rotating the individual helices, one at a time, in steps of 30° about the helix long axes. For each rotation, the change in the lipid-corrected compatibility score was calculated using REPIMPS for all 493 GPCR models. The values were then averaged and normalized against the average score for the unchanged models (Figs 2A,B).
To analyse the degree of structural difference between consecutive pairs of models after rotation of each of the helices, contact area difference (CAD) values were calculated  and the average value for all models plotted against the rotation of individual helices (Fig. 2C). For example, helix I of the model for a particular GPCR was rotated from 0° to 30°, and then a CAD comparison was performed between them. The results of the CAD calculation for this change in the models of all 493 GPCRs were then averaged and plotted at 0°.
REPIMPS-guided modelling of bovine rhodopsin and hGalR1
There is good agreement between the transmembrane regions of bovine rhodopsin determined from the crystal structure and the Baldwin template (Table 1): superposition of the model for bovine rhodopsin derived from the Baldwin template with the equivalent residues in the crystal structure gave an RMSD of 3.2 Å for the 198 Cα atoms, which suggests a very similar arrangement of the helices. For individual helices, the RMSDs were largely due to nonidealities (unwindings, tightenings and kinks), translation perpendicular to the membrane, and helix rotation of up to ~30°. Note that a small amphipathic helix, which is not present in the Baldwin template, is seen in the crystal structure following helix VII (Table 1). This helix is not predicted to be transmembrane but rather to lie on the cytoplasmic face of the membrane bilayer.
The strong similarity between the crystal structure of bovine rhodopsin and the model derived from the Baldwin template suggests the template remains a useful starting point for further refinement of the structures of individual GPCRs. Indeed, the nonidealities present in the crystal structure may, in some cases, be a disadvantage since these are likely to be GPCR-specific.
We examined several in silico tools used to predict the position and number of transmembrane segments of IMPs and compared the results when these tools were applied to the sequence for the GPCR, hGalR1 (Table 1). The exact locations, length and number of transmembrane segments predicted by the different tools varied. For example, TopPred predicted 8 transmembrane segments for hGalR1 and the predicted location of Helix VII using the various methods was, in some cases, mutually exclusive. The hydrophobicity and hydrophobicity moments of the transmembrane segments of hGalR1 proposed by Baldwin et al.  were also calculated (Table 2) . The values of the moments were small, and consistent with values estimated for other IMPs .
Next, we used REPIMPS to consider a series of possible alignments of just the transmembrane segments of hGalR1 mapped onto the Baldwin template. The sequences used for the model building cover the sequence alignment postulated by Baldwin and four alternatives deviating from the Baldwin alignment by frameshifting the sequence by up to two positions in either direction, as shown for Helix I of hGalR1 in Table 3. This effectively generated models in which the helices were rotated along the helix axis and translated up and/or down relative to the other helices in the bundle. Using all possible combinations of the different sequences defined for the seven helices of hGalR1, a total of 78,125 models were generated.
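The model count follows directly from the combinatorics: five candidate alignments for each of seven helices give 5^7 combinations. A minimal sketch, assuming each alignment is represented simply by its frameshift in residues relative to the Baldwin alignment (the names `SHIFTS` and `enumerate_alignments` are hypothetical):

```python
from itertools import product

N_HELICES = 7
SHIFTS = (-2, -1, 0, 1, 2)  # frameshift in residues; 0 is the Baldwin alignment

def enumerate_alignments(n_helices=N_HELICES, shifts=SHIFTS):
    """Return an iterator over tuples holding one frameshift per helix."""
    return product(shifts, repeat=n_helices)

n_models = sum(1 for _ in enumerate_alignments())
print(n_models)  # 78125
```

The tuple `(0, 0, 0, 0, 0, 0, 0)` corresponds to the unmodified Baldwin alignment and is one of the enumerated combinations.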
The lipid-corrected compatibility score for the Baldwin consensus model of hGalR1, as determined by REPIMPS, was 85.5. Fig. 3A shows the distribution of scores for all 78,125 models. The mean score was 88.5 ± 5.9σ. 1934 models had scores greater than two standard deviations above this mean (values > 100.13) and were subjected to further structure refinement by searching for energetically favourable rotamers for the side chains. Fig. 3B compares the lipid-corrected compatibility scores for these models before and after structure refinement. On average, structure refinement reduced the scores by a mean value of 2.6 (Fig. 3C). The sequences of the transmembrane helices for the model of hGalR1 with the highest lipid-corrected compatibility score following structure refinement are shown in Table 4 and compared with sequences derived from the Baldwin template for this receptor.
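The selection step can be sketched as a simple threshold on the score distribution. This is an illustration under stated assumptions, not the procedure as implemented in the study: the function name is hypothetical, the population standard deviation is used (the text does not say whether the sample or population form was applied), and the ten scores are invented.

```python
import statistics

def select_high_scoring(scores, n_sd=2.0):
    """Return (threshold, indices of scores above mean + n_sd * SD)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD; an assumption of this sketch
    threshold = mean + n_sd * sd
    keep = [i for i, s in enumerate(scores) if s > threshold]
    return threshold, keep

# Invented scores: nine ordinary models and one clear outlier
toy_scores = [80.0] * 9 + [120.0]
threshold, selected = select_high_scoring(toy_scores)
```

For the toy list the mean is 84 and the standard deviation is 12, so the cut-off is 108 and only the last model is retained for refinement.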
The model-building method and the identification of the transmembrane helices using the abovementioned procedures were also applied to bovine rhodopsin, the first GPCR for which there is a crystal structure. Using REPIMPS, the Baldwin model of this protein had a lipid-corrected compatibility score of 85.6 and was not among the top-scoring models, whilst the best model amongst the 78,125 models generated had a REPIMPS score of 107.8 and showed deviations from the model proposed by Baldwin (Table 4).
GPCRs are signalling molecules that traverse the cell membrane with seven helices in an anticlockwise progression as viewed from outside the cell. Though much evidence suggested this overall architecture [3, 4], it was only with the first reported crystal structure of a GPCR that this was confirmed [5, 6]. GPCRs bind ligands ranging from small molecules to large proteins, indicating that details of their architecture must deviate. Furthermore, additional factors that affect the structure and function of GPCRs, such as dimerisation and interactions with associated proteins, have been reported [42, 43]. The overall result is that no single in vivo mechanism of a GPCR has been fully characterised at the structural level.
GPCRs are considered non-standard proteins based on the most applicable methods of structure determination , and so it is not expected that high-resolution structural data will accrue rapidly for this class of protein in the near future. For this reason, comparative protein modelling methods, which assume that a single template structure is appropriate for all members of a family, remain an important approach in modelling the structures of GPCRs and indeed all other families of IMP. However, these templates, whether high- or low-resolution, should only be regarded as a starting point for determining the unique structural properties of individual members within the family. Thus, in silico tools, such as those used to predict the location of transmembrane segments and to refine the structural features of the template, also remain an important feature of predictive modelling for IMPs.
We applied the REPIMPS methodology to a well-known template for GPCRs, the Baldwin template, to indicate that, for individual GPCRs, the rotational position of helices and their vertical positioning may differ significantly from the template. This in silico tool compares favourably with other tools in terms of predicting the location of transmembrane helices  (Tables 1, 4) and can, in principle, be applied to templates from any IMP family, including, for example, the high-resolution structure of the GPCR, bovine rhodopsin. The fact that so few high-resolution structures have been determined for GPCRs indicates that low-resolution templates, such as the Baldwin template, will continue to play a role in predictive modelling of individual GPCRs. In addition, we have shown previously that the crystal structure of bovine rhodopsin and the model of this protein derived from the Baldwin template show overall structural similarity (3.2 Å RMSD) . Because of this, it is likely that both the high-resolution crystal structure of bovine rhodopsin and the Baldwin template remain valid starting points for building models of individual GPCRs with the aid of in silico tools such as REPIMPS.
Starting with either the crystal structure or the Baldwin template is likely to have its advantages and disadvantages. For example, the nearly idealised helices of the low-resolution template may prove useful in some respects since they lack the nonidealities of the crystal structure, such as localised unwindings, tightenings and kinks, which may well be GPCR-specific. Alternatively, it may prove useful in some cases to apply REPIMPS to models derived from the high-resolution structure in which the effects of the documented nonidealities and in silico mutations can be assessed.
Improvement of the compatibility score of the Baldwin template for GPCRs using REPIMPS
We built 493 models of GPCRs based on the Baldwin template  and assessed them with the REPIMPS methodology, which unlike Profiles-3D from which it was derived, takes into consideration that sidechains of many residues in IMPs are in contact with lipid rather than water . REPIMPS improved the average compatibility score for all 493 GPCR models to 94, compared to a value of 52 obtained with Profiles-3D (Figure 1A). The greatest improvement was seen for helices I and V which have a greater area exposed to the lipid membrane in the Baldwin template (Figure 1C). Similarly, the lowest improvement of compatibility scores was observed for helices III and VII which have the smallest area exposed to lipid. We previously demonstrated the existence of the correlation between area exposed to the solvent and the extent of the improvement of compatibility scores for a set of IMP structures .
The effects of the rotation of helices about their axis show that there are rotational steps for which the lipid-corrected compatibility score is significantly higher than that at the origin (zero rotation) (Figs 2A,B). A higher value for a rotated helix compared to that calculated for the Baldwin template at zero rotation is an indication that an alternate position for the helix is available which positions the side chains in a more compatible environment within the bilayer. This is most evident for helix VII, where it appears that for most GPCRs a rotation of 240–300° relative to the Baldwin template position is the preferred orientation. Alternatively, this may not be the consequence of misorientation of helix VII in the Baldwin template, but rather from nonidealities similar to those seen in the crystal structure of bovine rhodopsin [5, 6]. Nevertheless, the REPIMPS approach suggests deficiencies in the Baldwin template purely using a molecular modelling approach based on placing IMPs in the correct (lipid) environment.
Most of the commonly used methods to evaluate the difference between a model and a reference structure are based on calculating RMSD values. However, the structural changes that we have applied to the models of GPCRs, namely the rotation of the helices, require the analysis of structural differences which are not dependent on the geometrical changes of the structure. For this reason, we used the CAD method  which measures a normalised sum of absolute differences of residue-residue contact surface areas calculated for a reference structure and a model.
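To make the idea of a contact-based measure concrete, the following is a toy sketch of a normalised sum of absolute contact-area differences. The normalisation used here (the mean of the two total contact areas, scaled to a percentage) is an assumption made for illustration and may differ from the one implemented in ICMlite. The function name and the contact areas are hypothetical.

```python
def contact_area_difference(ref_contacts, model_contacts):
    """Toy CAD: 100 * sum|A_ref - A_model| / mean total contact area.

    Each argument maps a residue pair (i, j) to a contact area in square
    angstroms. The normalisation is an assumption made for this sketch.
    """
    pairs = set(ref_contacts) | set(model_contacts)
    # A pair missing from one structure counts as zero contact area there
    diff = sum(abs(ref_contacts.get(p, 0.0) - model_contacts.get(p, 0.0))
               for p in pairs)
    total = 0.5 * (sum(ref_contacts.values()) + sum(model_contacts.values()))
    if total == 0:
        raise ValueError("no contacts in either structure")
    return 100.0 * diff / total

# Invented contact areas for two residue pairs
reference = {(1, 5): 20.0, (2, 6): 30.0}
model = {(1, 5): 18.0, (2, 6): 32.0}
cad = contact_area_difference(reference, model)
```

Because only residue-residue contact areas enter the calculation, the value does not depend on how the two structures are superimposed, which is what makes such a measure suitable for comparing models before and after a helix rotation.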
The average of the CAD values was used to compare pairs of models that differed by the rotation of a single helix by 30° about the helix axis. The CAD values for each helix for 493 models were plotted against the total rotation from the starting model as shown in Figure 2C. As is clear from the figure, rotation of the helices about the helix axis did not produce major fluctuations in the values. The maximum CAD value was ~7, seen for rotation of helix III. This is in the range observed for the differences between different models of a protein derived from structures solved using NMR techniques. Because the relevance of the absolute values of the CAD method is difficult to determine, we were most interested in relative changes. The calculations suggested there were no rotations that produced unrealistic changes.
For a more reasonable assessment of the differences between the models as the result of rotation of the helices, the CAD values were calculated for just that part of the model in which changes were made. In the case of bovine rhodopsin, the model was truncated to contain just the helix being rotated as well as those helices which are in contact with the rotated helix. In this way the average CAD value for helix I increased from 6.1 ± 0.3 to 9.4 ± 0.7. In a similar way, the CAD value as a result of rotation of helix III increased from 6.9 ± 0.3 to 9.5 ± 0.3. These higher CAD values are also in the range that one can expect for the differences between the models of a protein built based on the data from NMR spectroscopy. Overall, the process of rotating helices in GPCRs does not greatly affect CAD values. This provides justification for rotating helices and using REPIMPS to assess the quality of models and the appropriate orientation of helices.
Modelling of bovine rhodopsin and hGalR1 based on the Baldwin template using REPIMPS and different alignments of transmembrane segments
We generated 78,125 models for both bovine rhodopsin and hGalR1 built using 5 different threads of sequences originating from each of the transmembrane regions. The models were then scored using the REPIMPS algorithm.
Figure 3A shows the distribution of the lipid-corrected compatibility scores for the generated models of hGalR1. A total of 1934 models with scores greater than the mean lipid-corrected compatibility score plus two standard deviations (>100.13) were subjected to further structure refinement by searching for energetically more favourable rotamers for the side chains. The model representing the alignment derived using the Baldwin template was also included in the structure refinement step despite its low lipid-corrected compatibility score of 85.5. The sequences of the helices with the highest lipid-corrected compatibility score obtained by REPIMPS are shown in Table 4 (set A) along with the sequences for hGalR1 derived from the Baldwin template (set B). The two sets are identical for helices II, III and VI. Helix VII shows a one-residue shift, while Helices I, IV and V differ by a two-residue shift.
The validity of the models could be further tested by experimental means such as site-directed mutagenesis. There is evidence that Helix III [5, 6], residues at the top of Helices IV and VII, and residues His264 and His267 in Helix VI [10, 28] are important for galanin binding. Both the model generated directly from the Baldwin template and the 'refined' model generated using REPIMPS have the His residues positioned inside the helix bundle.
With respect to bovine rhodopsin, the REPIMPS approach identified a set of sequences for the transmembrane segments that are different from those for the model generated from the Baldwin template (Table 4). The REPIMPS-based model built from these sequences shows an RMSD of 3.31 Å to the crystal structure for 198 Cα atom pairs of the residues indicated in Table 4. This RMSD value is comparable to that obtained for the model of rhodopsin proposed by Shacham (2.9 Å) , Baldwin (3.2 Å) , and Yarov-Yarovoy (3.8 Å for 91 residues) . In our model Lys296 is moved 2.9 Å down in the helix axis and faces toward the binding pocket of the retinal molecule created by helices 3–7. In the Baldwin model, Lys296 is facing helix 1, in a direction opposite to the binding pocket. The differences between our model and the crystal structure of bovine rhodopsin may be indicative of deficiencies in our method. However, the crystal structure represents just one form (the inactive form) of this receptor .
REPIMPS can be used as an in silico tool to assist in modelling the positional features of transmembrane segments of IMPs. The method can, in principle, be applied to any template for GPCRs as well as templates for other families of IMP. Here, we have applied REPIMPS to the Baldwin template of GPCRs and shown that, individually and collectively, the vertical positioning and rotational orientation of the transmembrane helices can differ significantly from the template.
GPCR sequences, 3D models and programs
Calculations were performed on a Silicon Graphics O2 workstation (SGI, Mountain View, CA, USA) using the InsightII molecular modelling package (v98.0, Molecular Simulations, San Diego, CA, USA, now available from Accelrys, San Diego, CA, USA). The Baldwin model of bovine rhodopsin, consisting of Cα atoms in transmembrane helices and their extra-membrane extensions, and the sequences of the predicted transmembrane regions of 493 GPCRs, aligned based on the known GPCR footprint residues, were obtained from Dr J. Baldwin. Additional File 1 contains the details of the GPCRs modelled in this study. The crystallographic structure of rhodopsin was obtained from the Protein Data Bank at the Research Collaboratory for Structural Bioinformatics. As an alternative to the RMSD comparison, the ICMlite program (v 2.8 2000, MolSoft L.L.C., La Jolla, CA) was used to calculate the contact area difference (CAD) between a pair of models and/or structures of the receptors. Secondary structure conformations were identified using the Kabsch-Sander method. The Profiles-3D program and the REPIMPS method were used as described previously. Briefly, for IMPs a significant proportion of the residues are in contact with the lipids of the membrane. Thus, by considering the surrounding apolar environment, the correct values of the area of the sidechain buried away from the aqueous phase (A) and the area in contact with polar atoms (F) were calculated by modifying the Profiles-3D program. Using F, A and the local secondary structure of each residue located within the membrane of an IMP, the appropriate environmental class for each residue is assigned from the 18 environmental classes [38, 50], and the corresponding compatibility score for the residue is recorded. The compatibility score for a protein/model structure, or any part of the structure, is the sum of the compatibility scores of its constituent residues.
Automated comparative modelling of GPCRs based on the Baldwin Cα template for bovine rhodopsin
Model-building functions were written in Unix: transmembrane sequences of all 493 GPCRs were extracted sequentially from the files holding sequence alignments, resulting in a sequence file with seven lines corresponding to seven transmembrane segments for each GPCR. The sequence file was automatically used to build a model based on the Cα template for the transmembrane helices of bovine rhodopsin postulated by Baldwin et al. . In this model-building procedure, a polyalanine polypeptide which was created based on the coordinates of the Cα atoms in the Baldwin model was used as the template for modelling the 493 GPCRs using the PROTEIN/BACKBONE command in InsightII. Side chains were positioned using the rotamer library  starting with bulky side chains: Residues Trp, Tyr, Phe, Ile, Met and Val, in that order, were considered to have moving side chains, and then the side chain rotamer search for remaining residues was applied. The 'best' rotamer was selected for the first residue in the list based on energy criteria (i.e., the lowest energy). Then, the best rotamer was selected for the next moving side chain, and so on. A cycle was defined as one complete pass through the list. The search terminated when the energy changed ≤0.05 kcal/mol from one cycle to the next, as defined by the CONVERGENCE parameter. Usually 3–4 cycles of rotamer search and energy calculations were required.
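The cyclic side-chain search described above can be sketched as a greedy loop that revisits each residue in a fixed order until the total energy stops changing. This is not the InsightII implementation: the function, the toy energy term and the residue labels are hypothetical, and only the cycle structure and the 0.05 kcal/mol convergence test are taken from the text.

```python
def cyclic_rotamer_search(order, rotamers, energy, convergence=0.05, max_cycles=50):
    """Greedy per-residue rotamer search repeated until the energy converges.

    order    : residue ids in search order (bulky side chains first)
    rotamers : dict mapping residue id -> list of candidate rotamers
    energy   : callable taking {residue id: rotamer} and returning a float
    """
    state = {res: rotamers[res][0] for res in order}
    previous = current = energy(state)
    for cycle in range(1, max_cycles + 1):
        for res in order:
            # Hold every other side chain fixed; keep the lowest-energy rotamer
            state[res] = min(rotamers[res],
                             key=lambda r: energy({**state, res: r}))
        current = energy(state)
        if abs(previous - current) <= convergence:
            return state, current, cycle
        previous = current
    return state, current, max_cycles

# Toy energy (hypothetical): each residue prefers one angle, and a clash
# penalty applies when W1 and F2 both adopt the 60 degree rotamer.
PREFERRED = {"W1": 60, "F2": 60, "V3": 180}

def toy_energy(state):
    e = sum(abs(state[r] - PREFERRED[r]) / 60.0 for r in state)
    if state["W1"] == 60 and state["F2"] == 60:
        e += 5.0
    return e

toy_rotamers = {r: [-60, 60, 180] for r in PREFERRED}
best, e_best, cycles = cyclic_rotamer_search(["W1", "F2", "V3"],
                                             toy_rotamers, toy_energy)
```

In the toy case the search converges in the second cycle: W1 takes its preferred rotamer first, which forces F2 away from the clashing one, illustrating why the order of the list (bulky residues first) influences the outcome of a greedy search.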
Calculating lipid-corrected compatibility scores (using REPIMPS) and CAD values
After all 493 models were built, their self-compatibility scores were calculated using Profiles-3D and their lipid-corrected compatibility scores were calculated using REPIMPS. For each model, using the InsightII command line, each helix was subjected to 12 fixed rotations of 30° about the helix long axis using an automated Unix script. At each rotation, the side chains were repositioned according to the procedure outlined above and the self-compatibility and lipid-corrected compatibility scores were recalculated. In all, 41,412 model structures were generated.
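The geometric operation behind each rotation step can be sketched with Rodrigues' rotation formula applied to atom coordinates. This is an illustration only: the study used InsightII commands, the function name is hypothetical, and the helix axis is assumed to be supplied by the caller (for example from a fit to the Cα atoms). Reading the twelve steps as twelve orientations spaced 30° apart is also an assumption. The total of 41,412 structures equals 493 models × 7 helices × 12 rotations.

```python
import math

def rotate_about_axis(points, axis_point, axis_dir, angle_deg):
    """Rotate 3D points about the line through axis_point along axis_dir."""
    norm = math.sqrt(sum(c * c for c in axis_dir))
    if norm == 0:
        raise ValueError("axis_dir must be non-zero")
    kx, ky, kz = (c / norm for c in axis_dir)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for px, py, pz in points:
        vx, vy, vz = px - axis_point[0], py - axis_point[1], pz - axis_point[2]
        # Rodrigues' formula: v*cos + (k x v)*sin + k*(k . v)*(1 - cos)
        cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        dot = kx * vx + ky * vy + kz * vz
        rotated.append((
            axis_point[0] + vx * cos_t + cx * sin_t + kx * dot * (1 - cos_t),
            axis_point[1] + vy * cos_t + cy * sin_t + ky * dot * (1 - cos_t),
            axis_point[2] + vz * cos_t + cz * sin_t + kz * dot * (1 - cos_t),
        ))
    return rotated

ROTATION_ANGLES = [30 * k for k in range(12)]  # 0, 30, ..., 330 degrees (assumed)
N_MODELS, N_HELICES = 493, 7
total_structures = N_MODELS * N_HELICES * len(ROTATION_ANGLES)

# Toy check: a point 1 angstrom from an axis lying along z, rotated by 90 degrees
demo = rotate_about_axis([(1.0, 0.0, 5.0)], (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 90)
```

The rotation leaves the coordinate along the axis unchanged, so a helix rotated in this way keeps its vertical position in the bundle; side chains would still need to be repositioned afterwards, which this sketch does not attempt.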
As part of the evaluation of differences between a pair of models or structures of the same receptor protein, we used ICMlite to calculate the CAD value to measure geometrical differences between two different conformations of the same molecule. CAD, as opposed to RMSD, is contact based and can measure the difference between 3D models with a wide range of accuracy . We used this method to assess the difference between models before and after a single step of rotation of an individual helix. For example, the model generated after a 30° rotation of Helix I about its long axis was compared with the original model in which Helix I was not rotated. Unix and ICM language scripts were used to automate the calculation procedure. For bovine rhodopsin, the CAD calculation was also carried out using rotational steps of 20° and 10° intervals. In addition, for bovine rhodopsin, the CAD values as the result of rotation of Helices I and III were recalculated in a different way by ignoring all other helices except those in contact with the helix under rotation; i.e., Helices II and VII for Helix I, and the Helices II and IV in the case of Helix III.
Modelling bovine rhodopsin and hGalR1 based on the Baldwin template and verification of the models using REPIMPS
Transmembrane segments of bovine rhodopsin and hGalR1
Transmembrane segments of bovine rhodopsin (shown in Table 1) are based on the assignment indicated in the Protein Data Bank (RCSB) for the 1F88 A crystal structure and on the transmembrane segments proposed by Baldwin. Transmembrane segments of hGalR1 were predicted using the methods listed in Table 1. These methods were run from their web interfaces, accepting the default settings for all parameters. The predicted transmembrane segments were compared with those postulated by Baldwin et al.
Model-building procedure and REPIMPS calculation
The procedure is described here for hGalR1; the same procedure was used for modelling bovine rhodopsin. For each helix, five different sequences were used to build the helix. These five sequences were taken from the same region of hGalR1 and had the same length. For example, Helix I was represented by five sequences, each consisting of 27 residues, with the first sequence starting at Glu32 and ending at Leu58, the second starting at Asn33 and ending at Ala59, the third starting at Phe34 and ending at Arg60, etc. The third sequence in each set of five was the same as the sequence proposed by Baldwin et al. for that helix. Note that for the first sequence, for example, residues 32–58 were given the same Cα coordinates as residues 34–60 of the third sequence. Effectively, this gave five helices of the same length and coordinates but differing in residue composition, of which only one was used in each cycle of model building. In each cycle of building the model of hGalR1, one helix was selected from each of the seven sets of helices to build a complete seven-helix bundle model with the same Cα coordinates as the Baldwin template for GPCRs. Side-chain refinement was not included in the procedure. The lipid-corrected compatibility score of the generated model was calculated using the REPIMPS method. This cycle was repeated in an automated manner until all 78,125 (5^7) combinations of the helices had been used.
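The register-shift enumeration described above can be sketched as follows. This is an illustration under stated assumptions: `shifted_windows` is a hypothetical helper, the sequence is a synthetic placeholder rather than the hGalR1 sequence, and the offsets of -2 to +2 around the reference window are inferred from the Helix I example (windows starting at residues 32 to 36, with the third starting at 34).

```python
from itertools import product

def shifted_windows(sequence, start, length, offsets=(-2, -1, 0, 1, 2)):
    """Return equal-length windows shifted around a reference window.

    `start` is the 1-based residue number where the reference window begins.
    Each window would be threaded onto the same Cα coordinates.
    """
    return [sequence[start - 1 + o: start - 1 + o + length] for o in offsets]


# Synthetic placeholder sequence (NOT a real receptor sequence).
toy_sequence = "ACDEFGHIKLMNPQRSTVWY" * 5

# Helix I in the text: 27 residues, reference window starting at residue 34.
helix1_windows = shifted_windows(toy_sequence, start=34, length=27)

# One window index (0-4) per helix for seven helices: 5**7 combinations.
combinations = list(product(range(5), repeat=7))
n_models = len(combinations)
print(n_models)  # 78125
```

Each tuple in `combinations` selects one sequence register per helix; building and scoring one model per tuple reproduces the exhaustive search over all 78,125 bundles.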
Refinement of the models of bovine rhodopsin and hGalR1
Models with a lipid-corrected compatibility score more than two standard deviations above the mean lipid-corrected compatibility score of all 78,125 models were selected for further refinement. Side chains were positioned in the energetically most favourable state by searching a side-chain rotamer library and calculating the energy of the models, as outlined above.
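The selection criterion amounts to a simple threshold on the score distribution, sketched below. The scores are made-up numbers for illustration, `select_high_scoring` is a hypothetical name, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def select_high_scoring(scores, n_sd=2.0):
    """Return indices of models scoring more than mean + n_sd * SD."""
    threshold = mean(scores) + n_sd * stdev(scores)  # sample SD (assumption)
    return [i for i, s in enumerate(scores) if s > threshold]


# Made-up compatibility scores for ten hypothetical models.
toy_scores = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8, 10.1, 20.0]
selected = select_high_scoring(toy_scores)
print(selected)  # [9]
```

With 78,125 models the choice between sample and population standard deviation makes a negligible difference to the threshold.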
The hydrophobic moments for the transmembrane segments were calculated using the Moment program. The numerical values of the hydrophobicities used in these calculations were from the consensus scale of Eisenberg et al., which has been normalised so that the mean hydrophobicity is zero with a standard deviation of unity.
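For orientation, the helical hydrophobic moment of Eisenberg et al. is the magnitude of the vector sum of per-residue hydrophobicities, each placed at its angular position around the helix axis. The sketch below is not the Moment program itself: the function names are hypothetical, the 100° per residue angle is the usual value assumed for an ideal α-helix, the population standard deviation is an assumption, and the scale values shown are placeholders rather than the published consensus scale.

```python
import math

def hydrophobic_moment(hydrophobicities, delta_deg=100.0):
    """Magnitude of the hydrophobic moment for consecutive helix residues."""
    delta = math.radians(delta_deg)
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum)

def normalise_scale(scale):
    """Rescale a hydrophobicity scale to zero mean and unit (population) SD."""
    values = list(scale.values())
    mu = sum(values) / len(values)
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return {aa: (v - mu) / sd for aa, v in scale.items()}


# Placeholder scale values for illustration only (not the published scale).
toy_scale = normalise_scale({"L": 2.0, "A": 1.0, "S": 0.0, "K": -3.0})

# A segment alternating hydrophobic and polar residues on a toy helix.
segment = "LKALSKLA"
moment = hydrophobic_moment([toy_scale[aa] for aa in segment])

# Eighteen identical residues span exactly five turns, so the moment cancels.
uniform = hydrophobic_moment([1.0] * 18)
```

A large moment indicates an amphipathic helix with a distinct hydrophobic face, which is what makes the quantity useful for judging the rotational orientation of a transmembrane helix.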
REPIMPS: Reverse-Environment Prediction of Integral Membrane Protein Structure
CAD: Contact Area Difference
RMSD: Root Mean Square Difference
Lomize AL, Pogozheva ID, Mosberg HI: Structural organization of G-protein-coupled receptors. J Comput Aided Mol Des 1999, 13: 325–353. 10.1023/A:1008050821744
Fotiadis D, Jastrzebska B, Philippsen A, Muller DJ, Palczewski K, Engel A: Structure of the rhodopsin dimer: a working model for G-protein-coupled receptors. Curr Opin Struct Biol 2006, 16: 252–259. 10.1016/j.sbi.2006.03.013
Baldwin JM, Schertler GFX, Unger VM: An alpha-carbon template for the transmembrane helices in the rhodopsin family of G-protein-coupled Receptors. J Mol Biol 1997, 272: 144–164. 10.1006/jmbi.1997.1240
Unger VM, Hargrave PA, Baldwin JM, Schertler GF: Arrangement of rhodopsin transmembrane α-helices. Nature 1997, 389: 203–206. 10.1038/38316
Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Trong IL, Teller DC, Okada T, Stenkamp RE, et al.: Crystal structure of rhodopsin: a G protein-coupled receptor. Science 2000, 289: 739–745. 10.1126/science.289.5480.739
Teller DC, Okada T, Behnke CA, Palczewski K, Stenkamp RE: Advances in determination of a high-resolution three-dimensional structure of rhodopsin, a model of G-protein-coupled receptors (GPCRs). Biochemistry 2001, 40: 7761–7772. 10.1021/bi0155091
Wise A, Gearing K, Rees S: Target validation of G-protein coupled receptors. Drug Discovery Today 2002, 7: 235–246. 10.1016/S1359-6446(01)02131-6
Horn F, Bettler E, Oliveira L, Campagne F, Cohen FE, Vriend G: GPCRDB information system for G protein-coupled receptors. Nucleic Acids Res 2003, 31: 294–297. 10.1093/nar/gkg103
Czaplewski C, Kazmierkiewicz R, Ciarkowski J: Molecular modeling of the human vasopressin V2 receptor/agonist complex. J Comput Aided Mol Des 1998, 12: 275–287. 10.1023/A:1007969526447
Kask K, Berthold M, Kahl U, Jureus A, Nordvall G, Langel U, Bartfai T: Mutagenesis study on human galanin receptor GalR1 reveals domains involved in ligand binding. Ann N Y Acad Sci 1998, 863: 78–85. 10.1111/j.1749-6632.1998.tb10685.x
Strahs D, Weinstein H: Comparative modeling and molecular dynamics studies of the δ, κ and μ opioid receptors. Protein Eng 1997, 10: 1019–1038. 10.1093/protein/10.9.1019
Filizola M, Perez JJ, Carteni-Farina M: BUNDLE: a program for building the transmembrane domains of G-protein-coupled receptors. J Comput Aided Mol Des 1998, 12: 111–118. 10.1023/A:1007969112988
Filizola M, Laakkonen L, Loew GH: 3D modeling, ligand binding and activation studies of the cloned mouse δ, μ and κ opioid receptors. Protein Eng 1999, 12: 927–942. 10.1093/protein/12.11.927
MaloneyHuss K, Lybrand TP: Three-dimensional structure for the β2 adrenergic receptor protein based on computer modeling studies. J Mol Biol 1992, 225: 859–871. 10.1016/0022-2836(92)90406-A
Yarov-Yarovoy V, Schonbrun J, Baker D: Multipass membrane protein structure prediction using Rosetta. Proteins 2006, 62: 1010–1025. 10.1002/prot.20817
Lewis A, Rousso I, Khachatryan E, Brodsky I, Lieberman K, Sheves M: Directly probing rapid membrane protein dynamics with an atomic force microscope: a study of light-induced conformational alterations in bacteriorhodopsin. Biophys J 1996, 70: 2380–2384.
Luecke H, Richter HT, Lanyi JK: Proton transfer pathway in bacteriorhodopsin at 2.3 angstrom resolution. Science 1998, 280: 1934–1937. 10.1126/science.280.5371.1934
Stenkamp RE, Teller DC, Palczewski K: Crystal structure of rhodopsin: A G-protein-coupled receptor. Chembiochem 2002, 3: 963–967. 10.1002/1439-7633(20021004)3:10<963::AID-CBIC963>3.0.CO;2-9
Riek RP, Rigoutsos I, Novotny J, Graham RM: Non-α-helical elements modulate polytopic membrane protein architecture. J Mol Biol 2001, 306: 349–362. 10.1006/jmbi.2000.4402
Donnelly D, Findlay BC: Seven-helix receptors: structure and modelling. Curr Opin Struct Biol 1994, 4: 582–589. 10.1016/S0959-440X(94)90221-6
Hibert MF, Trumpp-Kallmeyer S, Bruinvels A, Hoflack J: Three-dimensional models of neurotransmitter G-binding protein-coupled receptors. Mol Pharmacol 1991, 40: 8–15.
Hutchins C: Models of ligand binding and receptor function of the G protein-coupled receptors. Alfred Benzon Symposium: Copenhagen; 1996:213–226.
Trumpp-Kallmeyer S, Hoflack J, Bruinvels A, Hibert M: Modeling of G-protein-coupled receptors: application to dopamine, adrenaline, serotonin, acetylcholine, and mammalian opsin receptors. J Med Chem 1992, 35: 3448–3462. 10.1021/jm00097a002
Liu DX, Jiang HL, Shen JS, Zhu WL, Zhao L, Chen KX, Ji RY: Molecular modeling on kappa opioid receptor and its interaction with nonpeptide kappa opioid agonists. Zhongguo Yao Li Xue Bao 1999, 20: 131–136.
Pardo L, Ballesteros JA, Osman R, Weinstein H: On the use of the transmembrane domain of bacteriorhodopsin as a template for modeling the three-dimensional structure of guanine nucleotide-binding regulatory protein-coupled receptors. Proc Natl Acad Sci USA 1992, 89: 4009–4012. 10.1073/pnas.89.9.4009
Yamamoto Y, Kamiya K, Terao S: Modeling of human thromboxane A2 receptor and analysis of the receptor-ligand interaction. J Med Chem 1993, 36: 820–825. 10.1021/jm00059a005
Manivet P, Schneider B, Smith J, Choi D, Maroteaux L, Kellermann O, Launay J: The serotonin binding site of human and murine 5-HT2B receptors: molecular modeling and site-directed mutagenesis. J Biol Chem 2002, 277: 17170–17170. 10.1074/jbc.M200195200
Kask K, Berthold M, Kahl U, Nordvall G, Bartfai T: Delineation of the peptide binding site of the human galanin receptor. EMBO J 1996, 15: 236–244.
Bissantz C, Bernard P, Hibert M, Rognan D: Protein-based virtual screening of chemical databases. II. Are homology models of G-protein coupled receptors suitable targets? Proteins 2003, 50: 5–25. 10.1002/prot.10237
Latronico AC, Abell AN, Arnhold IJ, Liu X, Lins TS, Brito VN, Billerbeck AE, Segaloff DL, Mendonca BB: A unique constitutively activating mutation in third transmembrane helix of luteinizing hormone receptor causes sporadic male gonadotropin-independent precocious puberty. J Clin Endocrinol Metab 1998, 83: 2435–2440. 10.1210/jc.83.7.2435
Scheer A, Costa T, Fanelli F, De Benedetti PG, Mhaouty-Kodja S, Abuin L, Nenniger-Tosato M, Cotecchia S: Mutational analysis of the highly conserved arginine within the Glu/Asp-Arg-Tyr motif of the α1b-adrenergic receptor: effects on receptor isomerization and activation. Mol Pharmacol 2000, 57: 219–231.
Lu X, Huang W, Worthington S, Drabik P, Osman R, Gershengorn MC: A model of inverse agonist action at thyrotropin-releasing hormone receptor type 1: role of a conserved tryptophan in helix 6. Mol Pharmacol 2004, 66: 1192–1200. 10.1124/mol.104.000349
Rubenstein LA, Zauhar RJ, Lanzara RG: Molecular dynamics of a biophysical model for β2-adrenergic and G protein-coupled receptor activation. J Mol Graph Model 2006, 25: 396–409. 10.1016/j.jmgm.2006.02.008
Fanelli F, De Benedetti PG: Computational modeling approaches to structure-function analysis of G protein-coupled receptors. Chem Rev 2005, 105: 3297–3351. 10.1021/cr000095n
Archer E, Maigret B, Escrieut C, Pradayrol L, Fourmy D: Rhodopsin crystal: new template yielding realistic models of G-protein-coupled receptors? Trends Pharmacol Sci 2003, 24: 36–40. 10.1016/S0165-6147(02)00009-3
Dastmalchi S, Morris MB, Church WB: Modeling of the structural features of integral-membrane proteins using reverse-environment prediction of integral membrane protein structure (REPIMPS). Protein Sci 2001, 10: 1529–1538. 10.1110/ps.6301
Luthy R, McLachlan AD, Eisenberg D: Secondary structure-based profiles: use of structure-conserving scoring tables in searching protein sequence databases for structural similarities. Proteins 1991, 10: 229–239. 10.1002/prot.340100307
Bowie JU, Luthy R, Eisenberg D: A method to identify protein sequences that fold into a known three-dimensional structure. Science 1991, 253: 164–170. 10.1126/science.1853201
Abagyan R, Totrov MM: Contact area difference (CAD): A robust measure to evaluate accuracy of protein models. J Mol Biol 1997, 268: 678–685. 10.1006/jmbi.1997.0994
Dastmalchi S, Kobus FJ, Iismaa TP, Morris MB, Church WB: Comparison of the transmembrane helices of bovine rhodopsin in the crystal structure and the Cα template based on cryo-electron microscopy maps and sequence analysis of the G protein-coupled receptors. Molecular Simulation 2002, 28: 845–851. 10.1080/0892702021000002520
Eisenberg D, Weiss RM, Terwilliger TC, Wilcox W: Hydrophobic moments and protein structure. Faraday Symp Chem Soc 1982, 17: 109–120. 10.1039/fs9821700109
Gouldson PR, Snell CR, Bywater RP, Higgs C, Reynolds CA: Domain swapping in G-protein coupled receptor dimers. Protein Eng 1998, 11: 1181–1193. 10.1093/protein/11.12.1181
Jordan BA, Devi LA: G-protein-coupled receptors heterodimerization modulates receptor function. Nature 1999, 399: 697–700. 10.1038/21441
Essen LO: Structural Genomics of "non-standard" proteins: a chance for membrane proteins? Gene Func Disease 2002, 3: 39–48. 10.1002/1438-826X(200210)3:1/2<39::AID-GNFD39>3.0.CO;2-6
Church WB, Jones KA, Kuiper DA, Shine J, Iismaa TP: Molecular modelling and site-directed mutagenesis of human GALR1 galanin receptor defines determinants of receptor subtype specificity. Protein Eng 2002, 15: 313–323. 10.1093/protein/15.4.313
Shacham S, Topf M, Avisar N, Glaser F, Marantz Y, Bar-Haim S, Noiman S, Naor Z, Becker OM: Modeling the 3D structure of GPCRs from sequence. Med Res Rev 2001, 21: 472–483. 10.1002/med.1019
Niv MY, Skrabanek L, Filizola M: Modeling activated states of GPCRs: the rhodopsin template. J Comput Aided Mol Des 2006, 20: 437–448. 10.1007/s10822-006-9061-3
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The protein data bank. Nucleic Acids Res 2000, 28: 235–242. 10.1093/nar/28.1.235
Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22: 2577–2637. 10.1002/bip.360221211
Luthy R, Bowie JU, Eisenberg D: Assessment of protein models with three-dimensional profiles. Nature 1992, 356: 83–85. 10.1038/356083a0
Ponder JW, Richards FM: Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 1987, 193: 775–791. 10.1016/0022-2836(87)90358-5
Eisenberg D, Weiss RM, Terwilliger TC: The helical hydrophobic moment: a measure of the amphiphilicity of a helix. Nature 1982, 299: 371–374. 10.1038/299371a0
Eisenberg D, Schwarz E, Komaromy M, Wall R: Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J Mol Biol 1984, 179: 125–142. 10.1016/0022-2836(84)90309-7
The authors would like to thank Dr J. Baldwin for providing the database of sequence alignments for 493 GPCRs and the Cα coordinates of rhodopsin. The financial support provided by the Research Office of Tabriz University of Medical Sciences is also appreciated.
This article has been published as part of BMC Bioinformatics Volume 9 Supplement 1, 2008: Asia Pacific Bioinformatics Network (APBioNet) Sixth International Conference on Bioinformatics (InCoB2007). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/9?issue=S1.
The authors declare that they have no competing interests.
SD was responsible for the original concept and design, performing calculations and drafting as well as refinements of the manuscript; WBC was responsible for the original concept of the calculations, as well as scripting and testing, and then refinements of the manuscript; MBM was involved in conceptual improvements of the calculations, manuscript refinement and experimentalist perspective on the biochemistry. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Sequences information for G protein-coupled receptors. This table contains the names and UniProtKB/Swiss-Prot ID tags of the GPCRs modelled in this study. (DOC 184 KB)
Cite this article
Dastmalchi, S., Church, W.B. & Morris, M.B. Modelling the structures of G protein-coupled receptors aided by three-dimensional validation. BMC Bioinformatics 9 (Suppl 1), S14 (2008). https://doi.org/10.1186/1471-2105-9-S1-S14
- Transmembrane Helix
- Transmembrane Segment
- Helix Axis
- Bovine Rhodopsin
- Helix Bundle