Prediction of indirect interactions in proteins

Prusis, Peteris; Uhlén, Staffan; Petrovska, Ramona; Lapinsh, Maris; Wikberg, Jarl ES

doi:10.1186/1471-2105-7-167

Research article
Open access
Published: 22 March 2006

Prediction of indirect interactions in proteins

Peteris Prusis^1,2,
Staffan Uhlén^1,3,
Ramona Petrovska¹,
Maris Lapinsh¹ &
…
Jarl ES Wikberg¹

BMC Bioinformatics volume 7, Article number: 167 (2006) Cite this article

5856 Accesses
34 Citations
Metrics details

Abstract

Background

Both direct and indirect interactions determine molecular recognition of ligands by proteins. Indirect interactions can be defined as effects on recognition controlled from distant sites in the proteins, e.g. by changes in protein conformation and mobility, whereas direct interactions occur in close proximity of the protein's amino acids and the ligand. Molecular recognition is traditionally studied using three-dimensional methods, but with such techniques it is difficult to predict the effects caused by mutational changes of amino acids located far away from the ligand-binding site. We recently developed an approach, proteochemometrics, to the study of molecular recognition that models the chemical effects involved in the recognition of ligands by proteins using statistical sampling and mathematical modelling.

Results

A proteochemometric model was built, based on a statistically designed protein library's (melanocortin receptors') interaction with three peptides and used to predict which amino acids and sequence fragments that are involved in direct and indirect ligand interactions. The model predictions were confirmed by directed mutagenesis. The predicted presumed direct interactions were in good agreement with previous three-dimensional studies of ligand recognition. However, in addition the model could also correctly predict the location of indirect effects on ligand recognition arising from distant sites in the receptors, something that three-dimensional modelling could not afford.

Conclusion

We demonstrate experimentally that proteochemometric modelling can be used with high accuracy to predict the site of origin of direct and indirect effects on ligand recognitions by proteins.

Background

The processes of life depend on intermolecular recognition. Molecular recognition by proteins is a complex process that is determined not only by direct interactions of a protein with the interacting molecule, but also by indirect effects arising at distant sites in the proteins. Using three-dimensional (3D) structure approaches it is not straightforward to analyze long distance effects on, for example, protein conformation, mobility, and stability. Recently, a new approach using statistical analysis of protein and ligand interaction space, proteochemometrics, was developed [1, 2]. It was used to model protein-peptide interactions and interactions of proteins with organic compounds [1, 3–7]. Here we show its utility in predicting indirect effects in proteins.

Proteochemometrics originates from chemometrics, the mathematical methods used to analyze chemical data. Proteochemometrics aims to describe the interactions between a series of macromolecules (such as proteins) and a series of ligands. Proteochemometric models are thereby created. These models are useful for predicting the affinities of new proteins for their ligands if the new molecules fall within the description space of the protein-ligand pairs of the training data set. A proteochemometric experiment is typically described by three descriptor blocks; the ligand descriptor (D^L), protein descriptor (D^P), and ligand-protein cross-term (D^LP) blocks. A vector of numbers, called the ligand descriptors (D^L), characterizes each ligand L. Similarly, each protein P has its protein descriptors (D^P). If we use a linear method of regression, the negative logarithm of ligand L's affinity (pK_LP) for the protein P is expressed by:

$p K_{L P} = \bar{p K} + \sum_{i}^{N} C_{i} D_{i}^{L} + \sum_{j}^{M} C_{j} D_{j}^{P} + \sum_{i}^{N} \sum_{j}^{M} C_{i j} D_{i j}^{L P} E q . 1$

where $\bar{p K}$ is the average affinity, C_i, C_j, and C_ijare the regression coefficients for ligand descriptors, protein descriptors, and ligand-protein cross-terms, respectively, and N and M are the number of descriptors for ligands and proteins, respectively.

Ligand-protein cross-terms are usually defined by a new vector that is formed by multiplying each ligand descriptor with each receptor descriptor of particular ligand-receptor pairs. Hence,

$D_{i j}^{L P} = D_{i}^{L} D_{j}^{P} E q . 2$

and then

$p K_{L P} = \bar{p K} + \sum_{i}^{N} C_{i} D_{i}^{L} + \sum_{j}^{M} C_{j} D_{j}^{P} + \sum_{i}^{N} D_{i}^{L} [\sum_{j}^{M} C_{i j} D_{j}^{P}] E q . 3$

By using Eq. 3, the selectivity, S_LAB, between protein A and protein B for some particular ligand L can be expressed as:

$S_{L A B} = p K_{L A} - p K_{L B} = \sum_{j}^{M} C_{j} (D_{j}^{A} - D_{j}^{B}) + \sum_{i}^{N} D_{i}^{L} [\sum_{j}^{M} C_{i j} (D_{j}^{A} - D_{j}^{B})] E q . 4$

If a region U of a protein is described by the set of descriptors, V, then the contribution to the selectivity, $S_{L A B}^{U}$ , by U between proteins A and B for ligand L is obtained by:

$S_{L A B}^{U} = \sum_{j \in V}^{K} C_{j} (D_{j}^{A} - D_{j}^{B}) + \sum_{i}^{N} D_{i}^{L} [\sum_{j \in V}^{K} C_{i j} (D_{j}^{A} - D_{j}^{B})] E q . 5$

Accordingly, we can localize regions in a protein that afford selectivity (i.e., functionality difference between protein pairs) for a particular ligand by applying Eq. 5. A region can be a subsequence, a 3D molecular interaction field, a single amino acid, or even a physicochemical property of an individual amino acid, and is only restricted by the way the descriptors are assigned to the proteins [1–7]. Since Eq. 5 places no restriction how far in space a ligand is located from a region in a protein, proteochemometrics is useful to predict indirect interactions in proteins.

Results

Design and testing of multiple chimeric melanocortin receptor library

It is necessary to have modifications of both the proteins and ligands, preferably both in the form of a series (i.e., "libraries"), in order to use Eq. 5. Here we used a library of multiple chimeric melanocortin receptors [7]. Briefly, the library was created from four natural melanocortin receptors (MC₁, MC_3–5). Each receptor was divided into five sequence fragments (S1–S5) and multiple chimeric receptors were then obtained by systematically shuffling the fragments. In order to maximize the chemical space information of the receptors, while keeping the number of experiments as low as possible, statistical molecular design was applied to properly select the sequence fragments [7–10]. The entire receptor library comprised 18 receptors and was tested for its interaction with the native melanocortin ligand, α-MSH and the synthetic MSH analogues NDP-MSH and [¹²⁵I]-NDP-MSH (see Table 1).

Table 1 Affinities of multiple chimeric melanocortin receptors. Affinities (pK ± standard deviation, SD) of multiple chimeric melanocortin receptors for α-MSH, NDP-MSH and [¹²⁵I]-NDP-MSH determined by radioligand binding.

Full size table

Proteochemometrics modelling

Interpretation of Eq.5 is dependent on our description of the proteins. Here two proteochemometrics models were created. One was based on a binary description of the proteins. The other used physicochemical descriptions of amino acids located inside transmembrane regions with presumed proximity to possible ligand binding cleft(s) according to the x-ray structure of bovine rhodopsin [7, 11, 12]. The binary model comprised information on the extent to which segments S1–S5 are involved in the selective recognition of ligands by the receptors, while the model based on physicochemical descriptions of amino acids comprised information on the contributions of single amino acids. Both models performed excellently in state-of-the-art model validations (see Table 2).

Table 2 Performance of proteochemometric models. Performance of proteochemometric models derived using wild-type and multiple chimeric receptors interacting with melanocortins, α-MSH, NDP-MSH and [¹²⁵]-NDP-MSH. Shown are results from the model based on binary receptor descriptors (A) and the model based on physicochemical description of amino acids of the transmembrane regions presumed to face in a direction opposite to the membrane (B) (see Methods for further details).

Full size table

Prediction of ligand recognition

The models were used to compute selectivity recognition maps for all melanocortin receptor pairs for the α-MSH hormone (see Tables 3 and 4). α-MSH is recognized by MC receptors, albeit it binds with more than 1000 times higher affinity to the MC₁ receptor than to the MC₄ receptor. The recognition map derived from the binary model predicted that segments S1, S2, and S4 play major roles for the α-MSH MC₁/MC₄ selectivity, while S3 and S5 contribute only a little (Fig. 1). The involvement of S1 and S2 was expected, as these regions possess amino acids close to the ligand recognition site according to 3D modelling [11]. However, the whole S4 region was located farther away, and its involvement was therefore more surprising. The model based on amino-acid physicochemical properties in the transmembrane regions indicated that three amino acids in S1, five in S2, one in S4, and one in S5 caused the MC₁/MC₄ receptor selectivity (see Table 4).

Table 3 Selectivity recognition map predicted from binary model. Selectivity recognition map for wild-type MC receptor pairs for α-MSH computed from the binary proteochemometric model. The contribution to the selectivity S^u, log(M), for indicated segments S1–S5, was computed from the model using Eq. 5. Total selectivity represents the entire differences in affinity between receptor pairs computed from the model (see Eq. 4).

Full size table

Table 4 Predicted amino acid selectivity recognition map. Amino acid selectivity recognition map for MC receptors for α-MSH computed from the proteochemometric model based on the physicochemical description of amino acids of the transmembrane regions presumed to facing in the direction opposite to the membrane. The contributions to selectivity S^u, log(M), of the indicated amino acid positions were computed from the model using Eq. 5. Total selectivity represents the entire difference in affinity between receptor pairs computed from the model (see Eq. 4). TM, transmembrane regions; SG, segments.

Full size table

Verification of predictions using mutagenesis

In order to assess the predictions for the ten amino acid positions predicted with by the model based on physicochemical properties of amino-acids located inside the receptors transmembrane regions, they were subjected to single, double, triple, pentuple, and heptuple mutations in the MC₄ receptor, replacing them with the corresponding amino acids in the MC₁ receptor. Measurements taken from the mutants showed that seven positions gave the expected increase in affinity for α-MSH (see Fig. 2, Table 5). Combining amino acid substitutions also gave higher increases in affinity than did the single point mutations (see Fig. 2, Table 5).

Table 5 Experimentally determined pK values of α-MSH for MC₄ receptor mutants. Shown is the average pK ± SD determined in radioligand binding competition using [¹²⁵I]NDP-MSH as radioligand. (The numbers of independent replicates are in brackets). The significance, p, is calculated using nonparametric Wilcoxon Rank Sum test [29] (see Methods for details), when compared with the wild-type MC₄ receptor.

Full size table

However, three of the ten positions (A89S, I102A and I251L) did not show the predicted increase in affinity (see Fig. 2, Table 5). The failure of the A89S and I102A mutations to increase the affinity of the MC₄ receptor can be explained by the presence of several amino acid positions in the protein library that show the same or similar variability (i.e., they co-vary). For example, positions A89 and A135 (numbered according to the MC₄ receptor) both reside in the same segment and are serines in the MC₁ receptor and alanines in the MC₃, MC₄, and MC₅ receptors. Such co-varying sequence positions gain equal importance in a proteochemometric model, even when some of them are not responsible for the explained activity. In the present library, mutations A89S, I102A, I129T, A135S, and L141G represent co-varying amino acid positions. The failure of the A89S and I102A mutations to cause the predicted increase in affinity may thus be explained on the basis of co-variance, where the actual effect is caused by mutations I129T, A135S, and L141G. In fact, the sum of the experimentally determined affinity increase by mutations A89S, I102A, I129T, A135S, and L141G (pKi = 0.69) closely agrees with that predicted from the model (S^U = 0.56).

However, the failure of mutation I251L to increase affinity could not be explained by co-variances of amino acids within the model. Accordingly the predicted effect must originate from amino acids actually co-varying in the library but excluded from the modelling in the selections of amino acids based on 3D structure. Three such possible excluded positions partially co-varying with I251 are found in the midst of TM6, three are found towards the intracellular face of the lipid bi-layer in TM6, and several are present in the third intracellular loop. In the 3D structure, these amino acids are obviously located very far from any eventual binding pocket for α-MSH. To verify the prediction that these positions are responsible for the 'missing' affinity they were mutated separately and in combinations (see Fig. 3, Table 5). Although all mutations afforded an increase in affinity, most of the increase (~12 -fold) was afforded by swapping the entire third intracellular loop (see Fig. 3, Table 5). (Mutating all of these positions together afforded an ~20-fold increase in affinity).

Discussion

The experimental data demonstrate the utility of proteochemometrics for mapping ligand recognition. Using it seven amino-acid positions and the third intracellular loop were identified as accounting for most of the difference in affinity of the MC₁ and MC₄ receptor for α-MSH. The ability to localize a distant effect in a protein distinguishes proteochemometrics from other approaches to mapping molecular recognition. A general scheme for mapping ligand recognition using proteochemometrics is outlined in Fig. 4. The protein library might initially be collected from wild-type proteins, or created by constructing multiple chimeric proteins, as performed in this study. Applying experimental design could, in both cases, substantially reduce the number of entities that need to be constructed and tested without detrimentally compromising the information gained [2]. The application of proteochemometric modelling to the data creates a selectivity recognition map for the proteins and ligands. Analyzing the map (e.g., using Eq. 5) reveals the regions in the proteins involved in recognition of the ligands in a data-set. Regions of high interest can then be identified and analyzed further by extending the library, if necessary, in order to remove ambiguities due to co-variances (Fig. 4). In the selection of the regions for mutations the present study shows that also distant effects on protein function should be taken into account. Overall the approach would lead to a substantial gain in information at reduced experimental effort

Mapping molecular recognition with proteochemometrics obtains different information than 3D-structure-based methods. The latter generally reveal where one particular ligand touches one particular receptor in one of many bound states. Proteochemometrics, by contrast, reveals regions of biomacromolecules that influence the selectivity of their ligand binding, and it has the capacity to predict effects arising from distant sites in a protein. The information obtained is only partially overlapping and, therefore, the approaches complement each other. The information gained in proteochemometrics relates to protein function and has direct utility for changing the function of a protein in a desired direction by mutation, in a priori protein engineering, or by altering the structure of a chemical entity and improving its interaction with binding sites, in a priori drug design [2].

Conclusion

In the present work we have presented a theoretical basis for the use of proteochemometrics to assess direct and indirect interactions in proteins, and we verified its utility to this end experimentally. We have also outlined a general strategy to afford cost effective mapping of molecular recognition using proteochemometrics and indicate important differences of proteochemometric mapping of ligand recognitions compared to traditional three-dimensional approaches. We propose that proteochemometrics can be used as a complement to classical 3D based methods to the study of ligand recognition, or even as the prime choice, in particular when three-dimensional protein structures are not available. Moreover, since proteochemometrics can be used to map both direct and indirect interactions, it may find a more general use in e.g. protein engineering and mapping protein function.

Methods

Receptor clones and multichimeric receptors

The coding sequences of the MC₁ and MC₅ receptors were cloned earlier [13, 14]. The coding sequences of the MC₃ and MC₄ receptor were gifts of Dr. Ira Gantz [15, 16]. Multichimeric MC_1,3–5 receptors were constructed so that each receptor was divided into three regions, with each region being replaced with a corresponding region from one of the four wild-type MC_1,3–5 receptors. Swapping all segments from the four receptors would have created a library with an extensive number of combinations, but this would have been impractical. We applied a statistical molecular design, which combines statistical approaches to maximize information gained from a minimal number of experiments. We used a D-optimal design, which could approximate all combinations of the melanocortin receptor regions making up 16 receptors, of which 4 were wild-type and 12 were multiple chimeric receptors. The multiple chimeric receptors were constructed in two sets, the F- and S-sets. The divisions were made at the end of the third and fifth transmembrane segments for the F-set (9 receptors of the 12 designed could be obtained), and at the beginning of the second and middle of the sixth transmembrane segments for the S-set (5 receptors were thus obtained, completing the library according to the design). Thus, the library came to include a total of 18 receptors: 14 that were multiple chimeric (in five segments) and four wild-type MC_1,3–5 receptors. We have previously reported a full account of the construction of this receptor library [7]. Moreover, full accounts of the theory and applicability of statistical molecular design have also been reported previously [8–10].

Site-directed mutagenesis

Selected individual amino acids in the MC₄ receptor were exchanged for the corresponding amino acids of the MC₁ receptor, using mutation-inducing primers and PCR [17]. Eight single mutants (I66V, A89S, I102A, I129T, A135S, L141G, I251L, and Y268I), a double mutant (Q43E/P48D), and two triple mutants (N240G/M241L/I245V, V253I/V255F/V256L), were created. Some of these mutants were then used as templates for constructing mutants with multiple mutations, yielding I66V/L141G, I129T/A135S, L141G/Y268I, Q43E/P48D/I66V, Q43E/P48D/I66V/I129T/A135S, and Q43E/P48D/I66V/I129T/A135S/L141G/Y268I. We also manufactured two chimeric receptors in which subsequences of the MC₄ receptor were replaced by the corresponding subsequences of the MC₁ receptor. First we constructed a receptor ("S4 chimeric receptor") in which the entire S4 segment in the MC4 receptor was replaced with that from the MC₁ receptor. Another receptor ("IL3 chimeric receptor") comprised the MC₄ receptor with intracellular loop 3 replaced with the corresponding loop of the MC₁ receptor.

Peptides and radioligand

Peptide ligands α-MSH (Ac-SYSMGHFRWGKPV-NH₂) and NDP-MSH ([Tyr², Nle⁴, D-Phe⁷]-α-MSH) were synthesized using standard solid phase peptide synthesis and purified by HPLC; their correct molecular weights were verified by mass spectrometry. [¹²⁵I]-NDP-MSH ([¹²⁵I-Tyr², Nle⁴, D-Phe⁷]-α-MSH) was prepared in radiochemically pure form by custom iodination at EuroDiagnostica AB, Sweden.

Receptor expression and radioligand binding

COS-1 cells were grown in Dulbecco's modified Eagle's medium with 10% fetal calf serum. Eighty-percent confluent cultures were transfected in 100-mm dishes with the DNA constructs (10 μg DNA per dish, mixed with liposomes, as described [18]. Twelve to sixteen hours after transfection, the serum-free medium was replaced with growth medium and the cells were cultivated for approximately 48 hours, then scraped off, centrifuged, and used for radioligand binding. Dissociation constants (K_d) for [¹²⁵I]-NDP-MSH were estimated for all the receptors by radioligand binding saturation as described [7, 18]. Inhibition constants (K_i) for α-MSH and NDP-MSH were then estimated in competition with [¹²⁵I]-NDP-MSH using established procedures [7, 18]. Obtained pK values (i.e., the negative logarithms of the K_i and K_d values) are listed in Tables 1 and 5.

Numerical descriptions of receptors for proteochemometric modelling

Two types of descriptions were used for the receptors, namely a binary description and a description based on physicochemical descriptions of amino acids. For the binary description, four binary numbers described each region, with each number corresponding to one receptor subtype. Each of these descriptors was assigned a value of +1 if the region was taken from descriptors' corresponding receptor subtype, otherwise it was assigned the number -1 [7].

The receptor descriptions based on physicochemical descriptions of amino acids at 38 selected sequence positions was used as described [7]. These positions were selected from a three-dimensional model of the transmembrane regions of the MC₁ receptor.

Using the crystal structure coordinates of rhodopsin as a template [12], the model was derived by replacing the side chains with the corresponding side chains of the MC₁ receptor, using the alignments of the GPCRDB database [19] and the SCWRL program [20]. Only amino acids pointing in the direction opposite the lipid bilayer were considered. This led to a considerable reduction in the number of co-varying sequence positions considered in the proteochemometric modelling, albeit with the obvious risk of excluding important positions. The selected positions are listed in Table 4.

Each position was coded by using five principal components derived from 26 physicochemical properties of amino acids, so called z-scales [21]. These z-scales represent hydrophobicity, steric properties, polarizability (z1–z3), polarity, and electronic effects (z4, z5). The data for the 38 × 5 = 190 descriptors obtained were compressed by applying principal component analysis (PCA) [22] to each of the five segments of the receptor library, which yielded in total 15 descriptors, with three descriptors for each segment. Prior to PCA, the z-scale descriptors had been scaled to unit variance [23]. Principal component analysis was performed using the Simca-P program [23].

Numerical description of peptides for proteochemometric modelling

The three peptide ligands used showed limited structural variation and were assigned two binary descriptors termed Y and F, using the Free-Wilson approach [24]. Y was set to +1 if the Tyr² residue of the peptides was iodinated; otherwise it was set to -1. Thus, ligand descriptor Y distinguished between [¹²⁵I]-NDP-MSH and the other two peptides. The descriptor F was set to +1 if Phe⁷ was in the D-conformation and there was an Nle at position 4; otherwise it was set to -1. Thus, ligand descriptor F distinguished between α-MSH and the other two ligands. We also included the cross-term formed between the peptide descriptors Y and F(termed Y*F). This cross-term distinguishes NDP-MSH from α-MSH and [¹²⁵I]-NDP-MSH. The peptide ligand description is summarized in Table 6.

Table 6 Description of peptides for proteochemometric modelling. The general sequence of the peptides used herein was: Ac-Ser-Pos2-Ser-Pos4-Gly-His-Pos7-Arg-Trp-Gly-Lys-Pro-Val-NH₂. Amino acids in Pos2, Pos4, Pos7, and the corresponding description, were as shown.

Full size table

Numerical description of binding experiments and proteochemometric modelling

In each binding experiment, the receptor-peptide combination was described using the above receptor and peptide descriptors, and by computing receptor-ligand cross-terms using Eq. 2, each cross-term being calculated as the product of one peptide and one receptor descriptor. Prior to calculating cross-terms, all descriptors were mean-centred and scaled to unit variance [23]. In order to account for differences in the number and mutual correlation of each descriptor type, the peptide descriptors, receptor descriptors, and cross-terms, block scaling was applied. All descriptors and cross-terms were mean centred and scaled to unit variance prior to block scaling [23]. Descriptors were finally correlated to the pK values using partial least squares projection to latent structures (PLS) regression [22]. PLS modelling was done using Simca-P [23].

Validation of modelling

The goodness-of-fit was estimated by the correlation coefficient R² and root mean square error (RMSE) [23]. Models were further validated using cross-validation (CV) [25, 26], validation by response permutations [27], and validation by external prediction.

In cross-validation, one divides the data into several fractions. Seven were used in this study. Each fraction is repeatedly excluded once and then predicted from the model developed on the remaining data. The goodness of prediction of the CV is assessed by the Q² measure [23].

In response permutation validation, many models are created using randomly permuted response data. Twenty permutations were used here. For each permuted model, the R², Q², and correlation coefficients between the original and permuted response values are estimated. The correlation coefficients are plotted against the R² and Q² values. The two corresponding linear correlation lines are estimated, one for R² and one for Q², and the intercepts iR² and iQ² of the two regression lines with the zero correlation coefficient line are calculated [5, 23]. These intercept values indicate the R² and Q² of random response data. For example, a negative Q² intercept shows that it is not possible to obtain predictive models with random data, and indicates that a high Q² value of the original model is not obtained by pure chance.

External prediction assesses a model's stability when a substantial fraction of the data is excluded (e.g., more than one-third). External prediction may aim to predict the properties of new entities, in other words, entities that are entirely excluded from the data set. In the present case we predicted pKs for the S-set receptors using only data for F-set receptors. The goodness of external prediction was assessed by the external Q² (eQ²) value [5]. Further details on these model validation approaches and how to interpret their results have been previously reported [28].

Computation of the selectivity contribution of amino acids

The contributions of individual amino acids to the selectivity of peptide binding between pairs of receptors were computed using Eq. 5. In order to apply Eq. 5, the regression coefficients for the individual z-scales of the receptor's amino acids were computed from the corresponding PLS and PCA models. Nine amino acids (Q43, P48, I66, A89, I102, I129, L141, I251, Y268, according to the numbering in the MC₄ receptor) contributed most to the selectivity of α-MSH (see Table 4) and were selected for site-directed mutagenesis. Moreover, upon analyzing the 3D model of the MC₁ receptor it was deemed that S83 (corresponding to A89 in the MC₄ receptor) had a strong H-bond interaction with S130 (A135 in the MC₄ receptor). Since the amino acid positions A89/A135 (MC₄ receptor numbering) showed identical co-variance in the receptor library, A135 was also selected for site-directed mutagenesis.

Statistical tests

The distribution of measured affinity for MC₄ receptor and mutant receptors presented here as well as its logarithm values (pK) did not correspond to normal distribution. Therefore we decided to use nonparametric Wilcoxon Rank Sum statistical test [29] to verify the hypothesis that affinity for corresponding mutant receptor differs from wild-type MC₄ receptor. The test was performed using R program [30].

References

Prusis P, Muceniece R, Andersson P, Post C, Lundstedt T, Wikberg JE: PLS modeling of chimeric MS04/MSH-peptide and MC1/MC3-receptor interactions reveals a novel method for the analysis of ligand-receptor interactions. Biochim Biophys Acta 2001, 1544(1–2):350–357.
Article CAS PubMed Google Scholar
Wikberg JES, Lapinsh M, Prusis P: Proteochemometrics: A tool for modelling the molecular interaction space. In Chemogenomics in Drug Discovery – A Medicinal Chemistry Perspective. Edited by: Kubinyi H, Müller G, Mannhold R, Folkers G. Weinheim: Wiley-VCH; 2004:289–309.
Google Scholar
Lapinsh M, Prusis P, Gutcaits A, Lundstedt T, Wikberg JE: Development of proteo-chemometrics: a novel technology for the analysis of drug-receptor interactions. Biochim Biophys Acta 2001, 1525(1–2):180–190.
Article CAS PubMed Google Scholar
Lapinsh M, Prusis P, Lundstedt T, Wikberg JE: Proteochemometrics modeling of the interaction of amine G-protein coupled receptors with a diverse set of ligands. Mol Pharmacol 2002, 61(6):1465–1475. 10.1124/mol.61.6.1465
Article CAS PubMed Google Scholar
Prusis P, Lundstedt T, Wikberg JE: Proteo-chemometrics analysis of MSH peptide binding to melanocortin receptors. Protein Eng 2002, 15(4):305–311. 10.1093/protein/15.4.305
Article CAS PubMed Google Scholar
Lapinsh M, Prusis P, Mutule I, Mutulis F, Wikberg JE: QSAR and proteo-chemometric analysis of the interaction of a series of organic compounds with melanocortin receptor subtypes. J Med Chem 2003, 46(13):2572–2579. 10.1021/jm020945m
Article CAS PubMed Google Scholar
Lapinsh M, Veiksina S, Uhlén S, Petrovska R, Mutule I, Mutulis F, Yahorava S, Prusis P, Wikberg JE: Proteochemometric mapping of the interaction of organic compounds with melanocortin receptor subtypes. Mol Pharmacol 2005, 67(1):50–59. 10.1124/mol.104.002857
Article CAS PubMed Google Scholar
Linusson A, Wold S, Nordén B: Statistical molecular design of peptoid libraries. Mol Divers 1998, 4(2):103–114. 10.1023/A:1026416430656
Article CAS PubMed Google Scholar
Lundstedt T, Seifert E, Abramo L, Thelin B, Nystrom A, Pettersen J, Bergman R: Experimental design and optimization. Chemometrics and Intelligent Laboratory Systems 1998, 42(1–2):3–40. 10.1016/S0169-7439(98)00065-3
Article CAS Google Scholar
Linusson A, Gottfries J, Lindgren F, Wold S: Statistical Molecular Design of Building Blocks for Combinatorial Chemistry. J Med Chem 2000, 43(7):1320–1328. 10.1021/jm991118x
Article CAS PubMed Google Scholar
Prusis P, Schiöth HB, Muceniece R, Herzyk P, Afshar M, Hubbard RE, Wikberg JE: Modeling of the three-dimensional structure of the human melanocortin 1 receptor, using an automated method and docking of a rigid cyclic melanocyte-stimulating hormone core peptide. J Mol Graph Model 1997, 15(5):307–17. 334 10.1016/S1093-3263(98)00004-7
Article CAS PubMed Google Scholar
Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, Yamamoto M, Miyano M: Crystal structure of rhodopsin: A G protein-coupled receptor. Science 2000, 289(5480):739–745. 10.1126/science.289.5480.739
Article CAS PubMed Google Scholar
Chhajlani V, Wikberg JE: Molecular cloning and expression of the human melanocyte stimulating hormone receptor cDNA. FEBS Lett 1992, 309(3):417–420. 10.1016/0014-5793(92)80820-7
Article CAS PubMed Google Scholar
Chhajlani V, Muceniece R, Wikberg JE: Molecular cloning of a novel human melanocortin receptor. Biochem Biophys Res Commun 1993, 195(2):866–873. 10.1006/bbrc.1993.2125
Article CAS PubMed Google Scholar
Gantz I, Konda Y, Tashiro T, Shimoto Y, Miwa H, Munzert G, Watson SJ, DelValle J, Yamada T: Molecular cloning of a novel melanocortin receptor. J Biol Chem 1993, 268(11):8246–8250.
CAS PubMed Google Scholar
Gantz I, Miwa H, Konda Y, Shimoto Y, Tashiro T, Watson SJ, DelValle J, Yamada T: Molecular cloning, expression, and gene localization of a fourth melanocortin receptor. J Biol Chem 1993, 268(20):15174–15179.
CAS PubMed Google Scholar
Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR: Site-directed mutagenesis by overlap extension using the polymerase chain reaction. Gene 1989, 77(1):51–59. 10.1016/0378-1119(89)90358-2
Article CAS PubMed Google Scholar
Schioth HB, Muceniece R, Wikberg JE: Characterisation of the melanocortin 4 receptor by radioligand binding. Pharmacol Toxicol 1996, 79(3):161–165.
Article CAS PubMed Google Scholar
Horn F, Bettler E, Oliveira L, Campagne F, Cohen FE, Vriend G: GPCRDB information system for G protein-coupled receptors. Nucleic Acids Res 2003, 31(1):294–297. 10.1093/nar/gkg103
Article PubMed Central CAS PubMed Google Scholar
Bower MJ, Cohen FE, Dunbrack RL: Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. J Mol Biol 1997, 267(5):1268–1282. 10.1006/jmbi.1997.0926
Article CAS PubMed Google Scholar
Sandberg M, Eriksson L, Jonsson J, Sjöström M, Wold S: New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem 1998, 41(14):2481–2491. 10.1021/jm9700575
Article CAS PubMed Google Scholar
Geladi P, Kowalski BR: Partial least-squares regression: A tutorial. Anal Chim Acta 1986, 185: 1–17. 10.1016/0003-2670(86)80028-9
Article CAS Google Scholar
SIMCA-P 9 User Guide and Tutorial. Umeå: Umetrics AB; 2001.
Free SM, Wilson JW: A mathematical contribution to structure-activity studies. J Med Chem 1964, 1: 395–399. 10.1021/jm00334a001
Article Google Scholar
Wold S: Cross-validatory estimation of the number of components in factor and principal component models. Technometrics 1978, 20: 397–405.
Article Google Scholar
Wakeling IN, Morris JJ: A test of significance for partial least squares regression. J Chemometr 1993, 7: 291–304. 10.1002/cem.1180070407
Article CAS Google Scholar
Efron B: Better bootstrap confidence intervals. J Am Stat Assoc 1987, 78: 171–200.
Article Google Scholar
Eriksson L, Johansson E, Wold S: Quantitative structure-activity relationship validation. In Quantitative structure-activity relationships in environmental sciences-VII. Edited by: Schuurmann G, Chen F. Pensacola: SETAC; 1997:381–397.
Google Scholar
Wilcoxon F: Individual Comparisons by Ranking Methods. Biometrics 1945, 1(6):80–83.
Article Google Scholar
R Development Core Team: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2004.
Google Scholar
Schiöth HB, Petersson S, Muceniece R, Szardenings M, Wikberg JE: Deletions of the N-terminal regions of the human melanocortin receptors. FEBS Lett 1997, 410(2–3):223–228. 10.1016/S0014-5793(97)00593-0
Article PubMed Google Scholar

Download references

Acknowledgements

Supported by the Swedish Research Council (04X-05957 and 621-2002-4711). We thank Santa Veiksina for technical assistance.

Author information

Authors and Affiliations

Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Peteris Prusis, Staffan Uhlén, Ramona Petrovska, Maris Lapinsh & Jarl ES Wikberg
Linnaeus Center for Bioinformatics, Uppsala University, Uppsala, Sweden
Peteris Prusis
Section for Pharmacology, The University of Bergen, Bergen, Norway
Staffan Uhlén

Authors

Peteris Prusis
View author publications
You can also search for this author in PubMed Google Scholar
Staffan Uhlén
View author publications
You can also search for this author in PubMed Google Scholar
Ramona Petrovska
View author publications
You can also search for this author in PubMed Google Scholar
Maris Lapinsh
View author publications
You can also search for this author in PubMed Google Scholar
Jarl ES Wikberg
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jarl ES Wikberg.

Additional information

Authors' contributions

PP did most of modeling, data analysis and interpretation and manuscript preparation work. SU was responsible for all molecular biology work. RP assisted SU as well as performed pharmacological measurements. ML gave significant intellectual contribution for modeling, data analysis and interpretation. JESW was the major initiator of the project and supervisor of it and he contributed significantly to drafting the manuscript.

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article is distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Prusis, P., Uhlén, S., Petrovska, R. et al. Prediction of indirect interactions in proteins. BMC Bioinformatics 7, 167 (2006). https://doi.org/10.1186/1471-2105-7-167

Download citation

Received: 09 December 2005
Accepted: 22 March 2006
Published: 22 March 2006
DOI: https://doi.org/10.1186/1471-2105-7-167

Prediction of indirect interactions in proteins

Abstract

Background

Results

Conclusion

Background

Results

Design and testing of multiple chimeric melanocortin receptor library

Proteochemometrics modelling

Prediction of ligand recognition

Verification of predictions using mutagenesis

Discussion

Conclusion

Methods

Receptor clones and multichimeric receptors

Site-directed mutagenesis

Peptides and radioligand

Receptor expression and radioligand binding

Numerical descriptions of receptors for proteochemometric modelling

Numerical description of peptides for proteochemometric modelling

Numerical description of binding experiments and proteochemometric modelling

Validation of modelling

Computation of the selectivity contribution of amino acids

Statistical tests

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Authors' contributions

Authors’ original submitted files for images

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Bioinformatics

Contact us