Skip to main content

Predicting substitutions to modulate disorder and stability in coiled-coils

Abstract

Background

Coiled-coils are described as stable structural motifs, where two or more helices wind around each other. However, coiled-coils are associated with local mobility and intrinsic disorder. Intrinsically disordered regions in proteins are characterized by lack of stable secondary and tertiary structure under physiological conditions in vitro. They are increasingly recognized as important for protein function. However, characterizing their behaviour in solution and determining precisely the extent of disorder of a protein region remains challenging, both experimentally and computationally.

Results

In this work, we propose a computational framework to quantify the extent of disorder within a coiled-coil in solution and to help design substitutions modulating such disorder. Our method relies on the analysis of conformational ensembles generated by relatively short all-atom Molecular Dynamics (MD) simulations. We apply it to the phosphoprotein multimerisation domains (PMD) of Measles virus (MeV) and Nipah virus (NiV), both forming tetrameric left-handed coiled-coils. We show that our method can help quantify the extent of disorder of the C-terminus region of MeV and NiV PMDs from MD simulations of a few tens of nanoseconds, and without requiring an extensive exploration of the conformational space. Moreover, this study provided a conceptual framework for the rational design of substitutions aimed at modulating the stability of the coiled-coils. By assessing the impact of four substitutions known to destabilize coiled-coils, we derive a set of rules to control MeV PMD structural stability and cohesiveness. We therefore design two contrasting substitutions, one increasing the stability of the tetramer and the other increasing its flexibility.

Conclusions

Our method can be considered as a platform to reason about how to design substitutions aimed at regulating flexibility and stability.

Background

Coiled-coils are ubiquitous oligomerisation motifs in proteins, where two or more amphiphatic \(\alpha\)-helices intertwine together similarly to the strings of a rope. They account for approximately 5–10% of all protein-encoding sequences across all genomes [1]. The most common coiled-coils are left-handed and they feature a specific sequence motif called heptad repeat, consisting of seven residues abcdefg where a and d are hydrophobic [2]. The number of residues per turn in a regular \(\alpha\)-helix is 3.6. In the case of heptad repeats, it reduces to 3.5 residues per turn, which leads to the left-handing [3]. A few cases of naturally occurring right-handed coiled coils were also identified, characterized by 11-residue repeats, where the periodicity of residues per turn increases up to 3.67. In this work we study the phosphoprotein multimerisation domains (PMD) of Measles virus (MeV) and Nipah virus (NiV) [4, 5], which both form tetrameric left-handed coiled-coils, and we compare their dynamical behaviour to that of the right-handed tetrameric coiled-coil of the RhcC protein (Staphylothermus marinus) [6].

MeV is a negative single stranded, non-segmented RNA virus that belongs to the family of Paramyxoviridae within the Mononegavirales order. Like in all Mononegavirales members, the genome of MeV is encapsidated in a regular array made of monomers of nucleoprotein (N) [7]. The N:RNA complex is the template for both transcription and replication ensured by the viral polymerase complex. The polymerase complex consists of the RNA-dependent RNA polymerase (L) and the phosphoprotein (P). P is an essential and conserved component of all non-segmented negative-sense RNA viruses, including some major human pathogens (e.g., rabies virus, respiratory syncytial virus, Ebola virus, and Nipah virus) [8]. All P proteins are multimeric, but the role of this multimerization is still unclear, with recent studies having reported that it is dispensable in the case of vesicular stomatitis virus [9, 10], a member of the Rhabdoviridae family in the Mononegavirales order, while PMD is crucially required for MeV polymerase functions [11]. MeV P is a modular protein [12], comprising an intrinsically disordered N-terminus domain (PNT, res 1–230) [13], and a C-terminal region (PCT, res 231–507). PCT is composed of a disordered region (res 231–303), the P multimerization domain (PMD) (res 304–375), a disordered linker (res 377–458) and a globular region (res 459–507) known as the X domain (XD) [14]. In the wild-type form of MeV PMD, residues in a and d positions are always leucine (L), isoleucine (I) and valine (V), except for N329 and Q356 (Fig. 1a). The side chains of residues at these positions (i.e. a and d) point toward the interior of the coiled-coil and adopt a “knob into holes” packing, where a residue from one helix is accommodated into a hydrophobic pocket composed by the same residue from the three other chains (Fig. 1b). The stability of coiled-coils is closely related to the geometry of knobs into holes. In MeV PMD, the regularity in the repetition of the abcdefg motif of the coiled-coil is interrupted by the insertion of a “stammer”, i.e. a three amino acids motif (\(\hbox {L}_{{339}}\hbox {L}_{{340}}\hbox {L}_{{341}}\)) between two heptads. This stammer results in the formation of a short 3_10 helix that causes the coiled-coil to kink and exposes K343 (e position) to the solvent. Using truncation variants and biochemical assays, the C-terminal region of MeV PMD was found to be crucial for the chaperon activity of P towards the polymerase [11]. Those studies also showed that MeV gene expression relies on the cohesiveness of PMD coiled-coil, i.e. substitutions of hydrophobic a and d positions that modulate the stability of the PMD were found to affect viral gene expression [11].

The body of structural data available for MeV PMD are conflicting. The crystal structure of this domain has been solved by two groups. In one of these crystal structures, the full-length domain (residues 304–375) is well-ordered [15] (PDB code: 3ZDO), while in two others the C-terminal region (residues 358–375) is missing, consistent with structural disorder [4] (PDB code: 4C5Q and 4BHV). The details of the crystal structures are described in Table 1. Structural comparison among the three available MeV PMD structures (4C5Q, 4BHV and 3ZDO) revealed crucial differences regarding the kink and the geometry of knobs [4]: (1) the kink occurs at L342 in all chains of 4C5Q and 4BHV, whereas it is missing in chains C and F of 3ZDO, (2) the association of the tetramer is less tight and helices are significantly less twisted in 4BHV compared to the other two structures, (3) the structure of 3ZDO represents fewer knobs on each protomer and (4) all tetramers are energetically stable and the crystal structure of 4BHV has the highest stability. These findings illustrate how the same protein sequence may lead to different coiled-coil structures (i.e. different content of disorder and different packing of the tetramer). Where do these differences come from? Are they due to different crystal packings, or to different conformations that PMD can adopt that might reflect different states within the L-P complex? Such differences can have an impact on the function and mechanism of action of this pivotal protein. It is therefore important to attempt at making sense of the discrepancies between the different structural data sets.

NiV is a newly emerged severe human pathogen in the family of Paramyxoviridae [16] and no vaccine or antiviral therapeutics has been developed yet for human use [17]. The N-terminal region of NiV P protein contains an extremely large intrinsically disordered region (residues 1–474) [18, 19]. Like the MeV P protein, the C-terminus region of NiV P encompasses a well-ordered multimerization domain (PMD), that spans residues 470–578, a flexible linker and a folded X domain (residues 660–709) [20]. The crystal structure of NiV PMD was solved as a long parallel tetrameric coiled-coil by two independent groups [5, 19]. In one of these crystal structures the coiled-coil region (residues 508–578) defined in the electron density is longer [5] (PDB code: 4N5B), while in the other the C-terminal region (residues 571–578) is missing [19] (PDB code: 4GJW), consistent with a disordered state like for MeV PMD. The details of the crystal structures are described in Table 1. In striking contrast with MeV PMD, in NiV PMD the N-terminal region of each monomer forms a two-helix cap. Like for MeV PMD, the regularity of the heptad repeat of the coiled-coil is interrupted by a stammer, i.e. the insertion of the \(\hbox {I}_{{543}}\hbox {P}_{{544}}\hbox {K}_{{545}}\) motif, and a kink occurs on residues \(\hbox {V}_{{540}}\hbox {K}_{{541}}\hbox {E}_{{542}}\) (Fig. 1c).

Molecular dynamics (MD) simulations provide a powerful way to describe the dynamical behaviour of proteins in solution. However, the size of a protein’s conformational space grows exponentially with the number of atoms and its full exploration remains prohibitive, especially when the protein is intrinsically disordered [21]. We have previously developed a method, COMmunication MApping (COMMA) that facilitates the description of protein dynamical architecture starting from conformational ensembles generated by relatively short MD simulations [22]. A more recent version of the tool (COMMA2) further allows predicting substitution effects at large scale and identify positions highly sensitive to substitutions [23]. In the present work, we used COMMA2 to resolve the ambiguities in the crystal structure data, and to assess the effects of several substitutions on the structural stability of the tetramers. First, we show that COMMA2 is useful to detect protein regions prone to disorder from short MD simulations of a few tens of nanoseconds, without requiring the protein to actually sample disordered states. Second, we demonstrate that substituting a set of a positions with negatively charged residues modulates the structural cohesiveness of the MeV PMD tetramer in a position-specific manner. These results provide an asset to rationally design two new substitutions modulating the stability of the coiled-coil. Specifically, targeting the kink increases its flexibility while targeting the C-terminus increases its stability.

Fig. 1
figure1

Sequence of registers of MeV PMD. The sequence of a MeV PMD and c NiV PMD along with the registers assigned to it are listed. Arrows point to the kink, the a positions of MeV PMD that have been substituted in this study, and the MeV PMD exception at position 329. b The orientation of side chains are shown on the structure of the MeV PMD tetramer, for the hydrophobic (a and d registers) and charged (e and g registers) amino acids. Colors represent physico-chemical properties of amino acids: orange for hydrophobic, red for negatively charged and blue for positively charged

Fig. 2
figure2

Communication Blocks of MeV PMD, NiV PMD and RhcC identified by COMMA2. \(CBs^{path}\) and \(CBs^{clique}\) identified by COMMA2 are mapped on the structure of ac MeV PMD (PDB code: 3ZDO), df NiV PMD (PDB code: 4N5B) and gi RhcC (PDB code: 1YBK). Left a, d, g Distinct \(CBs^{path}\) of at least 4 residues are colored in shades of red. Middle b, e, h long-range \(CBs^{path}\) are depicted in shades of purple, containing at least 7 residues (for NiV PMD) or 8 residues (for MeV PMD and RhcC). Right c, f, i distinct \(CBs^{clique}\) are colored in blue tones. Residues known to be ambiguous in MeV PMD are shown by sticks

Fig. 3
figure3

Comparison between different methods predicting disordered residues. Comparison between different methods predicting disorder residues for a MeV PMD and b NiV PMD. The unfolding of helices along MD simulations, the propensity of residues detected as \(CB^{clique}\) and the confidence of \(CB^{clique}\) residues are shown in red, blue and cyan, respectively. The predicted propensity of the Coils server to form coiled-coil structures is shown in green. The predictions of disordered residues obtained from IUPred, DISOPRED and Rosetta ResidueDisorder are shown in orange, purple and magenta, respectively

Table 1 Structural details of the crystal structures of MeV and NiV PMDs

Methods

Proteins studied

We studied three homo-tetrameric coiled-coils whose structures were resolved by X-ray crystallography: (1) MeV PMD (PDB id: 3ZDO, residues 308–373, 2.07 Å resolution), (2) NiV PMD (PDB id: 4N5B, residues 475–578, 2.2 Å resolution) and RhcC (PDB id: 1YBK, residues 1–52, 1.45 Å resolution). Among the available crystal structures for MeV PMD and NiV PMD, we chose 3ZDO and 4N5B, respectively, since they both have the largest number of residues defined in the electron density and with a C-terminus that is resolved. The right-handed coiled-coil homo-tetramer of the RhcC protein was taken as a control for our analysis on left-handed coiled-coils. Furthermore, we studied 6 different substitutions of MeV PMD: V315D, L322D, L336D, V346D, I353D, and E364F, hence in total 9 systems.

Molecular dynamics simulations

Set up of the systems. The 3D coordinates for the studied proteins were retrieved from the Protein Data Bank (PDB) [24]. All crystallographic water molecules and other non-protein molecules were removed. All models were prepared using the LEAP module of AMBER 12 [25], with the ff12SB forcefield parameter set: (1) hydrogen atoms were added, (2) \(\hbox {Na}^+\) or \(\hbox {Cl}^-\) counter-ions were added to neutralise the system charge, (3) the solute was hydrated with a cuboid box of explicit TIP3P water molecules with a buffering distance up to 10 Å. The environment of the histidines was manually checked and they were consequently protonated with a hydrogen at the \(\epsilon\) nitrogen. The variants of MeV PMD were generated by in silico substitutions starting from the 3ZDO structure using Rosetta Backrub [26].

Minimisation, heating and equilibration. The systems were minimised, thermalised and equilibrated using the SANDER module of AMBER 12. The following minimisation procedure was applied: (1) 10,000 steps of minimisation of the water molecules keeping protein atoms fixed, (2) 10,000 steps of minimisation keeping only protein backbone fixed to allow protein side chains to relax, (3) 10,000 steps of minimisation without any constraint on the system. Heating of the system to the target temperature of 310 K was performed at constant volume using the Berendsen thermostat [27] and while restraining the solute \(C_{\alpha }\) atoms with a force constant of 10 kcal/mol/\(\hbox {AA}^2\). Thereafter, the system was equilibrated for 100 ps at constant volume (NVT) and for further 100 ps using a Langevin piston (NPT) [28] to maintain the pressure. Finally the restraints were removed and the system was equilibrated for a final 100-ps run.

Production of the trajectories. For every protein, 2 replicates of 50 ns, with different initial velocities, were performed in the NPT ensemble using the PMEMD module of AMBER 12. The temperature was kept at 310 K and pressure at 1 bar using the Langevin piston coupling algorithm. The SHAKE algorithm was used to freeze bonds involving hydrogen atoms, allowing for an integration time step of 2.0 fs. The Particle Mesh Ewald method (PME) [29] was employed to treat long-range electrostatics. The coordinates of the system were written every ps. Standard analyses of the MD trajectories were performed with the ptraj module of AMBER 12.

Stability of the trajectories. The root mean square deviations (RMSD) of the studied coiled-coils (MeV PMD wild-type and variants, NiV PMD and RhcC) were measured along simulation time for all the replicates (Additional file 1: Figures S1 and S2). All systems were fully relaxed after 10 ns. Consequently, the last 40 ns of each replicate were retained for subsequent analyses. Moreover, we calculated the residue root mean square fluctuations (RMSF) over the last 40 ns of every simulation (Additional file 1: Figure S3).

COMMA2 analysis

For every studied system, COMMA2 was applied to the last 40 ns of the two replicates of MD simulations, and communication blocks were extracted. COMMA2 identifies pathway-based communication blocks (\(CBs^{path}\)), i.e. groups of residues that move together, and are linked by non-covalent interactions, and clique-based communication blocks (\(CBs^{clique}\)), i.e. groups of residues that display high concerted atomic fluctuations, and that are close in 3D space (see [22] for formal definitions and detailed descriptions). It should be emphasized that all the backbone-backbone non-covalent interactions are ignored for this analysis. We defined two sets of \(CBs^{path}\), namely short-range and long-range blocks [22]. Short-range blocks consist of pathways of at least 4 residues. Long-range blocks are system-dependent, i.e. they consist of pathways with at least 7 or 8 residues. COMMA2 detects pairs of stable secondary structure elements directly linked by communication pathways. Moreover, based on a previous study [30] we modified the definition of thresholds, and implemented new metrics and functionalities that allowed us to identify disordered regions, as explained below.

Identification of communication pathways. Communication pathways are trails of communicating residues that are not adjacent in the sequence, and form stable non-covalent interactions (hydrogen-bonds or hydrophobic contacts), and communicate efficiently. Communication efficiency or propensity is expressed as [22]:

$$\begin{aligned} CP(i,j)= \langle(d_{ij}-\bar{d_{ij}})^2\rangle \end{aligned}$$
(1)

where \({d_{ij}}\) is the distance between the C\(\alpha\) atoms of residues i and j and \(\bar{d_{ij}}\) is the mean value computed over the set of conformations. Two residues i and j are considered to communicate efficiently if CP(ij) is below a communication propensity threshold, \(\hbox {CP}_{{cut}}\). The strategy employed to set the value of \(CP_{cut}\), is explained in [22]. Intuitively, it is expected that neighbouring residues in the sequence, that form well-defined secondary structures, communicate efficiently with each other. Therefore, we evaluate the proportion \(p_{ss}\) of residues that are in an \(\alpha\)-helix, a \(\beta\)-sheet or a turn in more than half of the conformations. Then for every residue i surrounded by 8 sequence neighbours (4 before and 4 after), we compute a modified communication propensity MCP(i) as:

$$\begin{aligned} MCP(i) = \displaystyle \frac{1}{8} \sum _{\begin{array}{c} j=i-4\\ j \ne i;1 \le j \le R; 5 \le i \le (R-4) \end{array}}^{i+4} CP(i,j) \end{aligned}$$
(2)

where R is the total number of residues in each chain. However, the algorithm is modified in this work by considering the definition of polypeptide chains, i.e. the MCP of residue i is calculated over its neighbors (j) that are along the same polypeptide chain as i. In order to define the \(CP_{cut}\), we considered the proportion of secondary structures (\(p_{ss}\)) as a percentile over the sorted distribution of all calculated MCP values. Whenever more than one replicate of MD trajectories are available, we measured the \(CP_{cut}\) for each replicate and considered the average value for the identification of pathways. In addition, for the variants of MeV, we used the same \(CP_{cut}\) as the wild type to better contrast their behaviours.

Residue confidence scores. Communication pathways and independent cliques are defined by setting several parameters [22]. To assess the robustness of the results obtained with the default parameter values, COMMA2 systematically explores the parameter space around those values and assigns confidence scores to every residue in a \(CB^{path}\) or \(CB^{clique}\) [23]. The default procedure is to vary the thresholds from their default value up to the value where all residues of the protein are in the same block.

Prediction of disorder using COMMA2. A \(CB^{clique}\) is considered as predictor of disorder, if the residues forming that block encompass several chains. To reflect this behaviour, we propose two scores: residue disorder propensity (RDP) and disorder clique score (DCS). RDP highlights the propensity of each residue j for belonging to \(CBs^{clique}\) encompassing several chains, and is measured as:

$$\begin{aligned} RDP_j = \frac{\sum _{1 \le k \le M}{(\#chains_k^j) ^ 2}}{N^2} \end{aligned}$$
(3)

where N is the total number of chains, M is the number of \(CBs^{clique}\) and \(\#chains_k^j\) is the number of chains in \(CB^{clique}_k\) at position j. The RDP ranges between 0 (no clique detected) and 1 (only one clique encompassing all chains). Considering different RDP threshold allows defining different levels of disorder. For example, an RDP above 0.25 guarantees that one \(CB^{clique}\) encompassing at least half of the chains has been detected. A stricter threshold of 0.5 ensures both an overall high coverage of the chains and a high cooperation between chains. For example, the value of 0.5 will be attained if one \(CB^{clique}\) encompassing at least \(N/\sqrt{2}\) chains is detected, or if two \(CBs^{clique}\), both encompassing N/2 chains are detected. Hence, we consider residues with \(RDP \ge 0.5\) predicted as disordered.

DCS accounts for the disorder propensity of each \(CB^{clique}_k\), and is calculated as:

$$\begin{aligned} DCS_{CB^{clique}_k} = \frac{\sum _{1 \le h \le N}{(\#Res_k^h)}}{4 * max(\#Res_k)} \end{aligned}$$
(4)

where N is the total number of chains and \(\#Res_k^h\) is the number of residues in \(CB^{clique}_k\) and in chain h. If the numbers of residues from each chain are very different, the score is very small, and if the same number of residues from each chain are present in a \(CB^{clique}\), the score equals to one, highlighting the disorder region.

Comparison with other approaches

We compared the performance of our method with four other existing approaches to predict either coiled-coils or disordered regions in proteins, namely Coils server [31], IUPred [32], DISOPRED [33] and Rosetta ResidueDisorder [34]. The Coils server measures the probability of a sequence to adopt a coiled-coil conformations. It takes a sequence as input and compares it with a database of known parallel two-stranded coiled-coils and measures a score based on the similarity. Consequently, it compares the score with the distribution of scores for globular and coiled-coils proteins to obtain the probability of forming coiled-coil conformation. IUPred predicts intrinsically disordered regions based on estimation of the pairwise energy content. Amino acids in stable proteins are able to form a large number of favorable interactions, while intrinsically disordered proteins lack a stable 3D structure. DISOPRED predicts disordered regions based on disorder data from high resolution X-ray crystal structures, i.e. residues missing in the electron density map, but present in the sequence record. DISOPRED exploits support vector machine (SVM) learning techniques to train a neural network classifier and estimates the probability of residues being disordered. Finally, Rosetta ResidueDisorder uses Rosetta to predict the structure of a protein and reports the Rosetta per-residue energy function scores. Intrinsically disordered regions have higher score than ordered regions.

Fig. 4
figure4

Substitution of a hydrophobic amino acid with a negatively charged one. The hydrophobic a positions (a) are substituted to negatively charged residues (b). Corresponding side chains are shown as spheres

Fig. 5
figure5

Inter-residue communication in V315D and L322D variants of MeV PMD. Communication blocks identified by COMMA2 are mapped on the average MD conformation for the wild type (a) and two variants bearing substitutions in the first half of the helices, V315D (b) and L322D (c). \(CBs^{path}\) are obtained by considering interactions involving side chains and pathways with length equal or greater than 4 and 8 residues. Residues known to be disordered in a context-dependent manner are shown as sticks. The \(CBs^{clique}\) identified by COMMA2 are colored in blue tones. d The disorder clique score (DCS, see “Methods” section) is reported for every CB clique of the wild type and variants of MeV PMD. e All direct communications between helices are shown on the structures. f The schematic representation, reports the number of direct pathways between helices in the first and second halves (numbers colored in black and red, respectively)

Fig. 6
figure6

Inter-residue communication in V346D and I353D variants of MeV PMD. Communication blocks identified by COMMA2 are mapped on the average MD conformation for the wild type (a) and for variants bearing substitutions in the second half of the helices, I353D (b) and V346D (c). Those pathways are obtained by considering interactions involving side chains and pathways with length equal or greater than 4 and 8 residues. Residues known to be disordered in a context-dependent manner are shown by sticks. The \(CBs^{clique}\) identified by COMMA2 are colored in blue tones. d The disorder clique score (DCS) is reported for every CB clique of the wild type and variants of MeV PMD. e All direct communications between helices are shown on the structures. f The schematic representation, reports the number of direct pathways between helices in the first and second halves (numbers colored in black and red, respectively)

Fig. 7
figure7

Inter-residue communications in L336D and E364F variants of MeV PMD communication. Communication blocks identified by COMMA2 for a L336D and b E364F MeV PMD variants. \(CBs^{path}\) are obtained by considering interactions involving side chains and pathways with length equal or greater than 4 and 8 residues. The \(CBs^{clique}\) identified by COMMA2 are colored in blue tones. c The disorder clique score (DCS) is reported for every CB clique. d All direct communications between helices are shown on the structures. e The schematic representation, reports the number of direct pathways between helices in the first and second halves (numbers colored in black and red, respectively)

Table 2 Summary of the number of residues involved in CBspath and CBsclique and in direct pathways

Results and discussion

For each system of MeV PMD, NiV PMD and RhcC, we performed two replicates of 50 ns MD simulations, and applied COMMA2 to extract communication blocks. COMMA2 allows to contrast the different types of communication occurring between residues and to hierarchize the different regions of a protein depending on their communication efficiency. To this end, COMMA2 identifies pathway-based communication blocks (\(CBs^{path}\)) and clique-based communication blocks (\(CBs^{clique}\)). The residues comprised in a \(CB^{path}\) are linked by communication pathways by transitivity. A communication pathway is defined as a trail of communicating residues displaying correlated motions and linked by stable non-covalent interactions (see “Methods” section and [22] for more details). Hence, it represents an efficient route of information transmission supported by physical interactions. By contrast, \(CBs^{clique}\) are the most flexible regions of the protein displaying highly concerted atomic fluctuations. The RhcC protein was chosen as a control for our analysis on left-handed coiled-coils, for two reasons: (1) the right-handed nature of this homo-tetrameric coiled-coil and (2) the lack of kink in the middle of the structure.

Identical chains differentially contribute to the stability of MeV and NiV PMD tetramers

COMMA2 analysis revealed that, although the sequence of amino acids for each studied tetramer, is identical between its four chains, the behaviour of the chains within MeV and NiV PMDs is different. In the case of MeV PMD, we observe two \(CBs^{path}\) (Fig. 2a), of which one (in red) contains most of the residues from the four chains. A similar behaviour is observed for NiV PMD (Fig. 2d). By contrast, for RhcC only one \(CB^{path}\) is observed, suggesting a symmetrical/equivalent role for the different chains (Fig. 2g). The hierarchy of communication pathways between the helices in MeV and NiV PMD, can be further refined by considering only long pathways (Fig. 2b, e). We can clearly see that long-range communication occurs only in the N-terminal half of the tetramers. In contrast, the long-range \(CB^{path}\) of RhcC encompasses residues from the 4 chains and both halves (71% of the total number of residues in the protein).

Regarding \(CBs^{clique}\), they are detected in three locations in MeV and NiV PMDs: at the N-term, around the kink (kink-region) and at the C-term. We observed a striking contrast between the N- and C-termini. While single-chain \(\hbox {CBs}^{clique}\) are detected in the former, only one \(CB^{clique}\) encompassing residues from all four helices is detected in the latter (Fig. 2c, f, in dark blue). In the case of MeV PMD, 65% of the residues known to be ambiguous, i.e. missing in one crystal structure (shown as sticks on Fig. 2c) are included in this block. Hence, the detection of this \(CB^{clique}\), indicative of concerted high fluctuations across the four chains, can be interpreted as an indicator of disorder. We can thus hypothesize that residues of the corresponding block in NiV PMD are, to some extent, disordered. These results are in agreement with the existence of an ambiguous region with higher flexibility in the C-terminal end of NiV, that was discussed in a recent study [5]. All those structural ambiguous residues are included in the C-term \(CB^{clique}\) of NiV PMD. While COMMA2 results for the right-handed RhcC suggest a similar arrangement of \(CBs^{clique}\) in the N-term, a strongly different behaviour was observed in the C-term, with respect to the left-handed coiled-coils (Fig. 2i). Specifically, the \(CB^{clique}\) at the C-terminus contains only one residue from chain A. This observation suggests the lack of a disordered region in the structure of RhcC. The observed contrast between MeV and NiV PMDs and our negative control RhcC, highlights the impact of the kink in the middle of the left-handed tetramers.

To investigate whether differences could also be detected in a monomeric context, we performed MD simulations on single chains extracted from the tetramer of MeV PMD, NiV PMD and RhcC, and applied COMMA2 on each system (details are reported in Additional file 1). We should specifically stress that we did not intend here to realistically simulate the behaviour of a single helix in solution. As expected, the single helices were very flexible in the simulations, reflecting their instability. Nevertheless, we observed some differences suggesting that MeV and NiV PMD “monomers” were more prone to unfolding at their C-terminus than RhcC “monomer” (Additional file 1: Figure S4).

COMMA2 as a predictor of disorder and comparison with other tools

The term “disordered” designates regions or proteins that lack stable secondary and tertiary structure. In the case of a crystal structure they typically correspond to regions of missing electron density. Of course, disorder can be context-dependent and its extent can vary from one crystal structure to another (as is the case in this study). By contrast, unstable or flexible regions are characterized by high B factor (reflecting pronounced thermal motions) and hence they fluctuate around an equilibrium position. As such, these regions can be modeled in the electron density although the latter can be (and typically are) poorly defined. Here, based on the previous results, we propose a formal criterion to predict regions prone to be disordered in otherwise folded coiled-coils. We define a residue disorder propensity (RDP) index reflecting the propensity of each residue to be part of a \(CB^{clique}\) encompassing several chains (Fig. 3, cf regions in blue with the experimentally detected ambiguous regions, in black). The \(RDP \ge 0.5\) indicates disorder, while \(RDP < 0.5\) indicates flexibility. In order to evaluate the power of COMMA2 to predict disordered region in coiled-coil structures, we compared our results with random coil propensities estimated by DSSP [35] (in red) and with predictions from three sequence-based (Coils server, IUPred, and DISOPRED) and one structure-based software (Rosetta ResidueDisorder). The Coils server measures the probability to form a coiled-coil structure. Here we chose to present the probability not to form a coiled-coil structure (which is \(1-Prob(CoilsServer)\)), in order to better compare the results (in green). IUPred and DISOPRED both predict the probability of residues to be disordered (orange and purple bars, respectively). They were both trained on examples coming from the PDB. Rosetta ResidueDisorder predicts ordered (in white) and disordered residues (in magenta).

In the case of MeV PMD, the comparison of the predictions with the experimental data, reveals the power of COMMA2 to predict the disordered residues with strikingly better accuracy compared to Coils server, IUPred and DISOPRED (an accuracy of 62% compared to 15%, 23% and 0%, respectively). Rosetta ResidueDisorder displays the highest sensitivity, but it also detects other regions of the coiled-coil as disordered, for instance residues at the N-terminus or in the kink-region. In the case of NiV PMD, the accuracy of COMMA2, Coils server and Rosetta ResidueDisorder are all 100%, while IUPred and DISOPRED both fail to predict the C-terminus residues as being prone to disorder (Fig. 3b).

While the sequence-based methods (Coils and IUPred) and DISOPRED are good and fast measures to predict coiled-coil (Coils server) and disorder (IUPred and DISOPRED) propensities, they are not able to detect unstable regions of coiled-coils. On the other hand, Rosetta ResidueDisorder has high sensitivity but low positive predictive value (PPV) to predict disordered residues, while being computationally slower compared to the other three methods. Although COMMA2 is computationally more expensive as it relies on the results of MD simulations, it is able to predict both flexible and unstable regions. Interestingly, COMMA2 also provides a mean to distinguish the two different behaviours: while the presence of a single \(CB^{clique}\) over the four chains reflects the propensity to local disorder, the existence of \(CBs^{clique}\) provides hints on flexible regions playing a role in coiled-coils dynamics.

Controlling MeV PMD flexibility/communication through substitutions

The main purpose of this study is to design some substitutions modulating the stability of the structure of MeV PMD. The a positions on the sequence of MeV PMD are particularly attractive for this, as they bear hydrophobic side chains oriented toward the interior of the coiled-coils (inward). Consequently, substituting those positions with a negatively charged amino acid is expected to bring some instability by disrupting the “knobs into holes” organization that relies on hydrophobic interactions (Fig. 4). The analysis of systematic substitution of a coiled coil has shown that similar substitutions of hydrophobic a positions with negatively charged amino acids, destabilize the structure of coiled-coils [36], a feature that can be associated with function (see [37] for example). However, the structural effects that such substitutions would induce are not fully understood. Therefore, COMMA2 analysis can be employed to highlight the details of interaction networks and communications across the structure, unveiling structural changes induced by different sets of substitutions. From the analysis of the wild-type PMD structure, one may infer asymmetric dynamics of the protein, i.e. the first half of the structure is more rigid and represents the communication core of the system, whereas the second half is more flexible and contains disordered residues. Here, we investigated how substitutions creating local instability could influence the communication efficiency throughout the tetramer and in particular between the two halves. Specifically, we systematically substituted the hydrophobic amino acid at each a position (Fig. 4a) with the negatively charged amino acid D and thus investigated the impact of the following substitutions: V315D and L322D that fall in the first half of the helices, and V346D and I353D in the second half of the helices (Fig. 4b). Upon replacement of the wild-type residue with an aspartic residue, two replicates of 50 ns MD simulations were performed for every variant. COMMA2 analysis was applied to the ensemble of conformations obtained for each variant. The results suggest that substitutions in the first half (V315D and L322D), establish/reinforce communication between the two halves and the roles of the two halves in the communication of the complex becomes almost symmetrical/equivalent. In contrast, substitutions in the second half (V346D and I353D), result in the breakage of the communication between the two halves. Consequently, the first half becomes more rigid, while the flexibility of the second half increases. In the following the detailed results are reported.

Substitutions in the first half, before the kink

The V315D and L322D substitutions are located in the N-terminal extremity, positioned in the first and second a register, respectively. We applied COMMA2 analysis to extract the set of \(CBs^{path}\) and \(CBs^{clique}\), to measure the direct communications between pairs of secondary structure elements (see “Methods” section), and to compare the results with the wild-type PMD (Fig. 5). The main differences between the variants and the wild type are: (1) the residues preceding the substitution are excluded from the main \(CB^{path}\) (in red), (2) the splitting of the helices into two groups, when considering long pathways, is not observed anymore and we detect only one long-range \(CB^{path}\) (in purple), (3) the latter gets shifted toward the second half as the position targeted for substitution is shifted towards the C-terminal end, (4) this is accompanied by an increase in the number of direct communications between the two halves (Fig. 5e, f and Table 2), (5) the \(CB^{clique}\) observed near the kink disappears. Moreover, we estimated the disorder tendency of the \(CBs^{clique}\), through the disorder clique score (DCS), which reflects the ability of a \(CB^{clique}\) to encompass residues from several chains (see “Methods” section). We observed that as the position targeted for substitution is shifted towards the C-terminal end, the \(CB^{clique}\), detected at the N-terminus become more indicative of disorder (Fig. 5d). We can conclude that substituting the hydrophobic a positions on the first half with a negatively charged amino acid, establishes/reinforces communication between the two halves. The roles of the two halves in the communication of the complex become almost symmetrical/equivalent.

Substitutions in the second half, after the kink

The V346D and I353D substitutions are located in the C-terminal extremity, with V346D being very close to the kink. Similar to the previous substitutions, we applied COMMA2 analysis to extract the sets of \(CBs^{path}\) and \(CBs^{clique}\), and direct communications between pairs of secondary structure elements (Fig. 6). The main differences between the variants and the wild type are: (1) the biggest \(CB^{path}\) (in red) is smaller than the corresponding block in the wild type and new blocks appear downstream the substitution position in the C-terminal half (in shades of pink), (2) considering long pathways, the splitting of the helices into two groups is not observed anymore, and we detect only one long-range \(CB^{path}\) (in purple), (3) this is accompanied by the lack or presence of only few direct communications between the chains in the second half (Fig. 6e, f and Table 2), (4) the number and size of the \(CBs^{clique}\) in the N-terminal end decrease, and (5) the DCS of \(CB^{clique}\) observed near the kink increases as it spans residues from two chains (Fig. 6d). We can conclude that substituting the hydrophobic a positions in the second half with a negatively charged amino acid, induces a breakage of communication between the first and second halves, thereby impending the propagation of communication across the structure. The structure is more rigid in the first half while it is even more flexible in the second half and chains in the second half exhibit a more independent behaviour.

Educated guess of substitutions modulating MeV and NiV PMD stability

COMMA2 analysis of the impact of various substitutions within MeV PMD elucidates the asymmetric dynamics of MeV PMD, and the role of different regions on the overall stability of the helices. Therefore, COMMA2 can be considered as a platform to reason about how to design amino-acid changes aimed at regulating flexibility and stability. Indeed, we are able to extract general rules for designing substitutions that modulate the overall stability: (1) the first half of the tetramer is more rigid, therefore introducing additional perturbations (substituting the hydrophobic a positions) in this region should result in symmetrical behaviour between the two halves of the tetramer and increase of overall stability, (2) the kink-region plays a crucial role in the propagation of communications across the tetramer, therefore perturbations at this region may lead to the asymmetry and disruption of the communications, (3) the second half is more flexible and substituting the hydrophobic a positions in this region, should increase the asymmetric dynamics and the overall flexibility, and iv) in contrast, a reverse substitution, i.e. from a negatively charged amino acid to a hydrophobic one, should increase the symmetry between the two halves. In this section, based on these rules we propose two contrasting substitutions, one increasing the flexibility, and the other bringing more stability.

The presence of a stammer (\(\hbox {L}_{{339}}\hbox {L}_{{340}}\hbox {L}_{{341}}\)) in MeV PMD induces the formation of a kink at these positions. Based on the results from the previous section, it appears that the kink plays a critical role in the communication between the two halves of the helices, a finding in agreement with the critical functional role that the kink plays in transcription and replication [11]. We thus reasoned that substituting hydrophobic positions close to the kink, may disrupt the communications. For this reason we designed a variant in which L336 was replaced with an aspartate (L336D). We applied COMMA2 and identified two \(CBs^{clique}\) (Fig. 7a). The C-term clique spans the four chains and contains a slightly increased number of residues (36 aa) compared to the wild type (Fig. 7c). The other N-terminal clique encompasses residues from three chains (16 aa). Instead of displaying one large \(CB^{path}\) spanning the two halves, like the wild type, this variant shows a sharp division between the two halves, with one \(CB^{path}\) located exclusively in the first half and the other one exclusively in the second half. Considering long-range pathways (at least 8 residues), two small blocks are detected, with a different pairing of the chains. The analysis of direct communication between helices (Fig. 7d, e) highlights a significant decrease of such communications in the first half, where the contacts are all concentrated in narrow/small regions (they are very localized and not spread over the length of each half). Overall, these results suggest an increase of flexibility at both halves and disruption of communication around the substituted position and the kink.

Next, we designed a substitution expected to bring more stability to the communication network of MeV PMD. To do this, we targeted a negatively charged residue from the disordered region (E364) and replaced it with a phenylalanine (a hydrophobic and bulk amino acid). We expected that this substitution would bring the side chains at position 364 at the interface of helices, thereby increasing the overall communication. COMMA2 analysis of E364F, revealed two \(CBs^{clique}\) (Fig. 7b). The DCS of the C-term of \(CB^{clique}\) is reduced and the N-term \(CB^{clique}\) is very similar to the C-term one (Fig. 7c). When considering all pathways, a single larger block (in red) is detected, containing 84% of the residues. The long-range pathway block is significantly larger than in the wild type, containing 46% of the residues from all the chains, extending toward the second half. Analysis of direct communications underlines a drastic increase of such communications between the helices at both halves, compared to the wild type (Fig. 7d, e). Consequently, the E364F substitution significantly increases the stability of the communication network across the protein and results in a balance for the number of residues identified as \(CB^{clique}\).

Conclusions

This study revealed that COMMA2 is able to detect a region known to be structurally ambiguous, i.e. well-ordered in a PDB structure and unresolved in another one, in the coiled-coil tetramer of the MeV PMD. The region is detected as a \(CB^{clique}\) that spans all four chains. The application of COMMA2 to the coiled-coil tetramer of the NiV PMD yielded similar results, unveiling that the C-terminus part of the NiV PMD coiled-coil tetramer also has substantial disorder content. This property is not shared by the RhcC protein, which forms a right-handed tetrameric coiled-coil. Among the available crystal structures of MeV PMD, one represents a symmetrical model, while others, including the one we simulated, have structural packing particularities, such as asymmetric kinks. However, in a previous study [11] a 1-\(\upmu\)s molecular dynamics simulation carried out on a MeV PMD symmetrical model revealed that the symmetrical model readily converts to an asymmetrical one (three \(3_{10}\)-helices plus one \(\alpha\)-helix), with the \(\alpha\)-helical arrangement being randomly distributed among the four chains over time. As such this dissymmetry likely does not arise from packing artifacts and rather reflects an inherently stable and preferred conformation. Moreover, the presence of a stammer does not entail left-handedness per se, as well exemplified by the crystal structure of a NiV PMD variant devoid of the stammer (NiV P PMD Delta 542–544, pdb code 6EB9, PMID 30799076) whose coiled coil remains left-handed. Altogether, our results show that COMMA2 can be successfully applied to different protein systems to pinpoint specific and unique structural properties. Furthermore, we proposed a score, RDP, for the prediction of disorder in coiled-coils, or multi-chain proteins. A residue with \(RDP \ge 0.5\) is predicted as disordered. Comparisons with existing tools to predict disordered residues, revealed the power of COMMA2 to predict disordered residues, as well as to distinguish different dynamical behaviours (disorder, flexibility and stability) across the structure. Finally, this analysis enabled us to formulate a set of rules for the control of MeV and NiV PMD structural stability and inter-residue ’communication’. Replacement of hydrophobic a positions with charged residues before the kink, leads to an increase in the number of direct communications in the second half and establishing communication between the two halves, which in turn enables the two halves to communicate across the structure. On the other hand, replacement of hydrophobic a positions with charged residues after the kink, leads to fragmentation of \(CBs^{path}\) and decrease of communication in the second half and concomitant significant increase of communication in the first half. Learning from these rules, we proposed two contrasting substitutions. One at the kink-region, resulting in the overall increase of the flexibility through disruption of the communications. And a reverse substitution from a charged amino acid to a hydrophobic one at the C-term, leading to an increase in symmetry and communications between the two halves. Our results complement recent experiments on a set of substitutions of hydrophobic a or d positions resulting in non-functional proteins [11], by providing detailed descriptions of their dynamics. Furthermore, COMMA2 could help in understanding why some natural coiled-coils contain charged amino acids in a or d positions. COMMA2 provides a better understanding of proteins dynamics and is able to characterize disordered regions across coiled-coils. These results advocate for the application of COMMA2 to other protein families forming homo-oligomers to highlight the role of disordered, ordered and flexible regions in their dynamics, and to determine the symmetrical or independent behaviour of the components. A measure like RDP could be adapted to other systems provided that they have several interacting chains and a correspondence between the chains could be established (which might be defined by looking at the contacts).

Availability of data and materials

All the MD trajectories and the results generated by COMMA2, i.e. CBspath and CBsclique are available at http://www.lcqb.upmc.fr/coiled_coils/.

Abbreviations

COMMA:

Communication mapping

IDR:

Intrinsically disordered regions

MD:

Molecular dynamics

PMD:

phosphoprotein multimerisation domains

MeV:

Measles virus

NiV:

Nipah virus

RMSD:

Root mean square deviations

RMSF:

Root mean square fluctuations

CB:

Communication blocks

CP:

Communication propensity

MCP:

Modified communication propensity

RDP:

Residue disorder propensity

DCS:

Disorder clique score

References

  1. 1.

    Rackham OJ, Madera M, Armstrong CT, Vincent TL, Woolfson DN, Gough J. The evolution and structure prediction of coiled coils across all genomes. J Mol Biol. 2010;403(3):480–93.

    CAS  PubMed  Article  Google Scholar 

  2. 2.

    Lupas AN, Bassler J. Coiled coils—a model system for the 21st century. Trends Biochem Sci. 2017;42(2):130–40.

    CAS  PubMed  Article  Google Scholar 

  3. 3.

    Stetefeld J, Jenny M, Schulthess T, Landwehr R, Engel J, Kammerer RA. Crystal structure of a naturally occurring parallel right-handed coiled coil tetramer. Nat Struct Biol. 2000;7(9):772–6.

    CAS  PubMed  Article  Google Scholar 

  4. 4.

    Blocquel D, Habchi J, Durand E, Sevajol M, Ferron F, Erales J, Papageorgiou N, Longhi S. Coiled-coil deformations in crystal structures: the measles virus phosphoprotein multimerization domain as an illustrative example. Acta Crystallogr Sect D Biol Crystallogr. 2014;70(6):1589–603.

    CAS  Article  Google Scholar 

  5. 5.

    Bruhn JF, Barnett KC, Bibby J, Thomas JM, Keegan RM, Rigden DJ, Bornholdt ZA, Saphire EO. Crystal structure of the Nipah virus phosphoprotein tetramerization domain. J Virol. 2014;88(1):758–62.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  6. 6.

    Özbek S, Müller JF, Figgemeier E, Stetefeld J. Favourable mediation of crystal contacts by cocoamidopropylbetaine (capb). Acta Crystallogr Sect D Biol Crystallogr. 2005;61(4):477–80.

    Article  CAS  Google Scholar 

  7. 7.

    Gutsche I, Desfosses A, Effantin G, Ling WL, Haupt M, Ruigrok RW, Sachse C, Schoehn G. Structural virology. Near-atomic cryo-EM structure of the helical measles virus nucleocapsid. Science. 2015;348(6235):704–7.

    CAS  PubMed  Article  Google Scholar 

  8. 8.

    Longhi S, Bloyet LM, Gianni S, Gerlier D. How order and disorder within paramyxoviral nucleoproteins and phosphoproteins orchestrate the molecular interplay of transcription and replication. Cell Mol Life Sci. 2017;74(17):3091–118.

    CAS  PubMed  Article  Google Scholar 

  9. 9.

    Gérard FC, Jamin M, Blackledge M, Blondel D, Bourhis J-M. Vesicular stomatitis virus phosphoprotein dimerization domain is dispensable for virus growth. J Virol. 2020;94(6):e01789–19. https://doi.org/10.1128/JVI.01789-19.

  10. 10.

    Bloyet LM, Morin B, Brusic V, Gardner E, Ross RA, Vadakkan T, Kirchhausen T, Whelan SPJ. Oligomerization of the vesicular stomatitis virus phosphoprotein is dispensable for mRNA synthesis but facilitates RNA replication. J Virol. 2020;94(13):e00115–20. https://doi.org/10.1128/JVI.00115-20.

  11. 11.

    Bloyet L-M, Schramm A, Lazert C, Raynal B, Hologne M, Walker O, Longhi S, Gerlier D. Regulation of measles virus gene expression by p protein coiled-coil properties. Sci Adv. 2019;5(5):3702.

    Article  CAS  Google Scholar 

  12. 12.

    Karlin D, Ferron F, Canard B, Longhi S. Structural disorder and modular organization in Paramyxovirinae N and P. J Gen Virol. 2003;84(Pt 12):3239–52.

    CAS  PubMed  Article  Google Scholar 

  13. 13.

    Karlin D, Longhi S, Receveur V, Canard B. The N-terminal domain of the phosphoprotein of Morbilliviruses belongs to the natively unfolded class of proteins. Virology. 2002;296(2):251–62.

    CAS  PubMed  Article  Google Scholar 

  14. 14.

    Johansson K, Bourhis JM, Campanacci V, Cambillau C, Canard B, Longhi S. Crystal structure of the measles virus phosphoprotein domain responsible for the induced folding of the C-terminal domain of the nucleoprotein. J Biol Chem. 2003;278(45):44567–73.

    CAS  PubMed  Article  Google Scholar 

  15. 15.

    Communie G, Crépin T, Maurin D, Jensen MR, Blackledge M, Ruigrok RW. Structure of the tetramerization domain of measles virus phosphoprotein. J Virol. 2013;87(12):7166–9.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  16. 16.

    Eaton BT, Mackenzie JS, Wang LF. Henipaviruses. In: Fields BN, Knipe DM, Howley PM, editors. Fields virology. 5th ed. Philadelphia: Lippincott-Raven; 2007. p. 1587–600.

    Google Scholar 

  17. 17.

    Broder CC. Henipavirus outbreaks to antivirals: the current status of potential therapeutics. Curr Opin Virol. 2012;2(2):176–87.

    PubMed  PubMed Central  Article  Google Scholar 

  18. 18.

    Habchi J, Mamelli L, Darbon H, Longhi S. Structural disorder within Henipavirus nucleoprotein and phosphoprotein: from predictions to experimental assessment. PLoS ONE. 2010;5(7):11684.

    Article  CAS  Google Scholar 

  19. 19.

    Jensen MR, Yabukarski F, Communie G, Condamine E, Mas C, Volchkova V, Tarbouriech N, Bourhis JM, Volchkov V, Blackledge M, Jamin M. Structural description of the Nipah virus phosphoprotein and its interaction with STAT1. Biophys J. 2020;118(10):2470–88. https://doi.org/10.1016/j.bpj.2020.04.010

    CAS  PubMed  Article  Google Scholar 

  20. 20.

    Habchi J, Blangy S, Mamelli L, Jensen MR, Blackledge M, Darbon H, Oglesbee M, Shu Y, Longhi S. Characterization of the interactions between the nucleoprotein and the phosphoprotein of Henipavirus. J Biol Chem. 2011;286(15):13583–602.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  21. 21.

    Shaw DE, Maragakis P, Lindorff-Larsen K, Piana S, Dror RO, Eastwood MP, Bank JA, Jumper JM, Salmon JK, Shan Y, et al. Atomic-level characterization of the structural dynamics of proteins. Science. 2010;330(6002):341–6.

    CAS  PubMed  Article  Google Scholar 

  22. 22.

    Karami Y, Laine E, Carbone A. Dissecting protein architecture with communication blocks and communicating segment pairs. BMC Bioinform. 2016;17(2):13.

    Article  CAS  Google Scholar 

  23. 23.

    Karami Y, Bitard-Feildel T, Laine E, Carbone A. ”infostery” analysis of short molecular dynamics simulations identifies highly sensitive residues and predicts deleterious mutations. Sci Rep. 2018;8(1):1–18.

    Article  CAS  Google Scholar 

  24. 24.

    Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The protein Data bank. Nucleic Acids Res. 2000;28(1):235–42.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  25. 25.

    Case D, Darden T, Cheatham T III, Simmerling C, Wang J, Duke R, Luo R, Walker R, Zhang W, Merz K, et al. Amber 12. Univ Calif San Franc. 2012;1(2):3.

    Google Scholar 

  26. 26.

    Smith CA, Kortemme T. Backrub-like backbone simulation recapitulates natural protein conformational variability and improves mutant side-chain prediction. J Mol Biol. 2008;380(4):742–56.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  27. 27.

    Berendsen HJ, Postma JPM, van Gunsteren WF, DiNola A, Haak J. Molecular dynamics with coupling to an external bath. J Chem Phys. 1984;81(8):3684–90.

    CAS  Article  Google Scholar 

  28. 28.

    Loncharich RJ, Brooks BR, Pastor RW. Langevin dynamics of peptides: the frictional dependence of isomerization rates of n-acetylalanyl-N’-methylamide. Biopolymers. 1992;32(5):523–35.

    CAS  PubMed  Article  Google Scholar 

  29. 29.

    Darden T, York D, Pedersen L. Particle mesh Ewald: an nlog(n) method for Ewald sums in large systems. J Chem Phys. 1993;98:10089–92.

    CAS  Article  Google Scholar 

  30. 30.

    Karami Y. Joint analysis of dynamically correlated networks and coevolved residue clusters: large-scale analysis and methods for predicting the effects of genetic disease associated mutations. Ph.D. thesis. 2016.

  31. 31.

    Lupas A, Van Dyke M, Stock J. Predicting coiled coils from protein sequences. Science. 1991;252(5009):1162–4.

    CAS  PubMed  Article  Google Scholar 

  32. 32.

    Dosztányi Z, Csizmok V, Tompa P, Simon I. Iupred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics. 2005;21(16):3433–4.

    PubMed  Article  CAS  Google Scholar 

  33. 33.

    Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT. The disopred server for the prediction of protein disorder. Bioinformatics. 2004;20(13):2138–9.

    CAS  PubMed  Article  Google Scholar 

  34. 34.

    Kim SS, Seffernick JT, Lindert S. Accurately predicting disordered regions of proteins using Rosetta residuedisorder application. J Phys Chem B. 2018;122(14):3920–30.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  35. 35.

    Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22(12):2577–637.

    CAS  PubMed  Article  Google Scholar 

  36. 36.

    Acharya A, Rishi V, Vinson C. Stability of 100 homo and heterotypic coiled-coil a-a’ pairs for ten amino acids (A, L, I, V, N, K, S, T, E, and R). Biochemistry. 2006;45(38):11324–32.

    CAS  PubMed  Article  Google Scholar 

  37. 37.

    Stewart CM, Buffalo CZ, Valderrama JA, Henningham A, Cole JN, Nizet V, Ghosh P. Coiled-coil destabilizing residues in the group a streptococcus m1 protein are required for functional interaction. Proc Natl Acad Sci. 2016;113(34):9515–20.

    CAS  PubMed  Article  Google Scholar 

Download references

Acknowledgements

We acknowledge the access to the HPC resources of the Institute of Computing and Data Sciences (ISCD) at Sorbonne Université.

About this supplement

This article has been published as part of BMC Bioinformatics Volume 21 Supplement 19 2020: Proceedings from the Joint NETTAB/BBCC 2019 Conference. The full contents of the supplement are available at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-21-supplement-19.

Funding

Publication charges for this article have been funded by basic annual funds from the LCQB. This work was partially undertaken in the framework of CALSIMLAB supported by the public Grant ANR-11-LABX-0037-01 constituting a part of the “Investissements d’Avenir” program (references: ANR-11-IDEX-0004-02 and ANR-10-EQPX- 29-01). It was also partially undertaken under the MAPPING Project (ANR-11-BINF-0003, Excellence Programme “Investissement d’Avenir” in Bioinformatics).

Author information

Affiliations

Authors

Contributions

YK, EL and AC conceived the overall study and designed the experiments. YK implemented COMMA2. YK, PS and RV performed computational analysis. YK, EL and AC analysed the results and wrote the manuscript. DG and SL interpreted data, contributed to the manuscript preparation and approved the final draft. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yasaman Karami or Elodie Laine or Alessandra Carbone.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

Additional material and figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Karami, Y., Saighi, P., Vanderhaegen, R. et al. Predicting substitutions to modulate disorder and stability in coiled-coils. BMC Bioinformatics 21, 573 (2020). https://doi.org/10.1186/s12859-020-03867-x

Download citation

Keywords

  • Protein structure
  • Protein dynamics
  • Coiled-coil
  • Molecular dynamics
  • Protein disorder
  • Protein stability