- Open Access
Automated NMR relaxation dispersion data analysis using NESSY
BMC Bioinformatics volume 12, Article number: 421 (2011)
Proteins are dynamic molecules with motions ranging from picoseconds to longer than seconds. Many protein functions, however, appear to occur on the micro to millisecond timescale and therefore there has been intense research of the importance of these motions in catalysis and molecular interactions. Nuclear Magnetic Resonance (NMR) relaxation dispersion experiments are used to measure motion of discrete nuclei within the micro to millisecond timescale. Information about conformational/chemical exchange, populations of exchanging states and chemical shift differences are extracted from these experiments. To ensure these parameters are correctly extracted, accurate and careful analysis of these experiments is necessary.
The software introduced in this article is designed for the automatic analysis of relaxation dispersion data and the extraction of the parameters mentioned above. It is written in Python for multi platform use and highest performance. Experimental data can be fitted to different models using the Levenberg-Marquardt minimization algorithm and different statistical tests can be used to select the best model. To demonstrate the functionality of this program, synthetic data as well as NMR data were analyzed. Analysis of these data including the generation of plots and color coded structures can be performed with minimal user intervention and using standard procedures that are included in the program.
NESSY is easy to use open source software to analyze NMR relaxation data. The robustness and standard procedures are demonstrated in this article.
Proteins are responsible for many different and important biochemical functions, such as ligand binding, signaling or catalysis [1–4]. Many of these events occur on the μs-ms timescale making the detection and interpretation of these motions critical for understanding their biological significance. Advances in solution Nuclear Magnetic Resonance (NMR) of biomolecules have enabled the measurement of protein dynamics on different timescales. Relaxation experiments measuring the two relaxation rates R1 (the longitudinal relaxation rate) and R2 (the transverse relaxation rate) as well as the steady state heteronuclear NOE are used to detect motion within ps-ns . However, detection of these motions is limited by the molecular tumbling of the molecule, which is on the order of ns. Recent developments in 15N/13C relaxation-compensated Carr-Purcell-Meiboom-Gill (CPMG) and R1ρ rotating-frame relaxation dispersion experiments [6–9] have enabled measurement of protein dynamics on the μs-ms timescale. From these experiments the exchange contribution to transverse relaxation (R ex ), exchange constant (k ex ), chemical shift differences between exchanging states (δω) and the population of individual states can be extracted. Consequently, relaxation dispersion experiments are commonly used to investigate changes in dynamics caused by ligand binding , to discover limiting events during enzyme catalysis , to follow protein folding and intermediate states as well as to detect functionally important hidden states . Furthermore, it has been proposed that a detailed understanding of the dynamical behavior of a protein at atomic resolution may help identify novel drug target sites .
Here, we present a novel software (NESSY) that is designed for simple and automated analysis of relaxation dispersion data. Data can be fitted to two states and at one or more field strengths. Model selection protocols are included for the selection of the best models and Monte Carlo simulations are used to estimate errors of extracted parameters. Furthermore, publication quality 2D and 3D plots and color coded structures are easily created.
The software NESSY is written in Python http://www.python.org for multi platform compatibility and uses wx.Python http://www.wxpython.org to build the graphical user interface (Figure 1). The program was tested on Windows (Vista and 7) and Linux (Ubuntu 9.10 to 11.04) and can be downloaded for free from http://nessy.biochem.unimelb.edu.au either as compiled binaries (Linux, Mac or Windows format, including all Python modules) or as source code. The software is open source and general public licensed (GPL v3).
NESSY performs automatic relaxation dispersion data analysis using NMR peak lists as inputs. In a first step, the effective transverse relaxation rate R2eff is extracted from a series of CPMG relaxation dispersion experiments according to:
where T CPMG is the constant CPMG time, I(0) is the intensity of the peak in the reference spectrum and I(v CPMG ) is the intensity of the peak at the CPMG frequency, vCPMG. Then, data are fitted to different models, which can be selected individually. These models are divided on the basis of no exchange (model 1), two-site fast (model 2) and two-site slow (model 3) exchange.
No exchange: model 1
Model 1 describes residues, which are not involved in exchange processes:
where is the effective transverse relaxation rate at infinite vCPMG.
Two-state exchange: model 2 and 3
Equation (3) describes exchange between two states A and B
Where k a-b and k b-a are the forward and backward exchange rates, respectively, between the states A and B.
and p a and p b are the populations of the two state models (p a is the major conformation, p a + p b = 1), ke xis the chemical/conformational exchange (kex = k a-b + k b-a ) constant and δω is the chemical shift difference between states. In eq. (4) only , k ex and Φ can be extracted, as p b and, δω cannot be uniquely determined.
Model 3 describes slow-limit exchange (kex << δω) according to the Richard-Carver equation :
NESSY also automatically calculates the exchange contribution to transverse relaxation, R ex , where for fast exchange, R ex = Φ/k ex , and slow exchange, R ex = p a p b k ex /(1 + (k ex /δω)2).
Optimization and Model Selection
Exchange parameters are optimized by using the Levenberg-Marquardt algorithm to minimize the target function
where is the value at each v CPMG at data point n, is the calculated rate constant. Error of R2eff was assumed to be constant within each residue and calculated according to the definition of pooled variance (eq. 7) . is calculated using N dup replicated experiments for v CPMG values, such that n j replicates were obtained at (v CPMG )j. The effective transverse relaxation rate of a given peak in these n j replicates has the standard deviation s j :
Model selection is performed using AICc (Akaike information criteria with second order correction for small sample size) by default, but other tests, such as AIC or F-test are included as well . Uncertainties are estimated using 500 (default value, but user controlled) Monte Carlo simulations.
To demonstrate validity of fit and model selection performed by NESSY, synthetic data was produced using the in-built Synthetic Data Creator. Data were created for models 2 and 3 by introducing an error of 2, 5, 8 and 10% (see Table 1) as experimental NMR relaxation dispersion data usually have smaller errors than 5%. Data was analyzed using NESSY and fitted to the models described above.
Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion experiments of a 0.3 mM PCTX1 sample (kindly provided by Prof. G. King, University of Queensland, Australia) were recorded on a Bruker 600 MHz AVANCE III spectrometer. Experiments were acquired at 298 K using 0.08 s constant CPMG period (TCPMG) as a single scanned interleaved pseudo 3D experiment. Relaxation dispersion profiles were obtained by recording spectra with varying CPMG pulse frequencies (νCPMG) (25, 2× 50, 75, 100, 2× 150, 2× 200, 300, 500, 600, 700, 900, 1000, 1500 and 2000 Hz). Spectra were processed in NMRPipe  and intensities extracted using CcpNmr .
To demonstrate the validity of the coding and model selection, synthetic data containing 2, 5, 8 and 10% errors for models 2 and 3 were created using the in-built tool in NESSY. This synthetic data was fitted to models 1, 2 and 3 by minimizing the χ2 function (eq. 6) and the model selection was performed by comparing AICc values. Errors for the extracted dynamics parameters were estimated by 500 Monte Carlo simulations. In all cases, irrespective of the noise, the correct model was chosen by AICc for each data set (Tables 1 and 2). Notably, for the synthetic-data model 2 the χ2 values calculated for model 3 were always at least as good as those for model 2, indicating the need for model selection by statistical methods such as AICc. For the synthetic-data model 3, both χ2 and AICc selected model 3. During the calculation, NESSY creates plots for each fit to a model. For example, a comparison of fits to models 1, 2 and 3 for the synthetic data is shown in Figure 2 (3D plots are generated using the NESSY plotting tool).
The parameters k ex , p b and δω that were extracted for model 3 matched the initial parameters used to generate the synthetic data within the order of the introduced error (Table 1). As expected, introducing larger errors into the synthetic data sets produced larger errors for the extracted parameters (calculated by Monte Carlo Simulations).
Fitting relaxation dispersion experiments of PCTX1 at 600 MHz
NESSY was tested using NMR data acquired at 600 MHz for PCTX1, a 40-residue spider toxin . Data for two resonances showing frequency dependent R ex were fitted to models 1 to 3. Trp 24 best fitted to model 3 (two-state, slow-limit exchange) and Phe 30 to model 2 (two-state, fast-limit exchange). An overview of fits to model 1 to 3 as well as the best fit is shown in Figure 3. Fitting of the slow-limit exchanging signal of Trp 24 extracts a k ex of 1029.3 ± 194 1/s. Although the fit shows slow-limit exchange, only one resonance was present in the spectrum. This is in accordance to the extracted population of the minor conformation (p b ) of 0.844 ± 0.002% and a chemical shift difference (δω) for the two conformations of 2884 ± 76 rad/s. As residue Phe 30 is in fast exchange, only k ex and Φ were extracted. The extracted conformation exchange rate (k ex ) for this residue was 3750.2 ± 162 1/s.
To best illustrate calculated motion, NESSY creates PyMol macros for color and width-coded structures. Such structures are created for model selection, k ex and R ex . Figure 4 shows color coding for both model selection and k ex of PCTX1. Models are visualized in different colors per residue, while extracted dynamics parameters (k ex and R ex ) are colored from yellow to red and also represented by thicker lines for increasing values. Note that in Figure 4 only two residues were analyzed and therefore, only two residues are colored.
Global fit to synthetic data
The dynamics parameters, k ex , p b and δω, are highly interdependent and data from a single static field is typically insufficient for their accurate determination. Consequently acquisition of relaxation dispersion data at two or more fields is necessary for the robust and accurate estimation of these parameters . NESSY is able to perform a global fit by sharing parameters k ex , p b and δω, but taking the field dependency of δω into account. To demonstrate this function, synthetic data was created for data virtually recorded at 800 and 600 MHz. For 800 MHz data, synthetic data (5% error) previously described was used. Additionally, synthetic data was generated at 600 MHz (δω 600 = 600/800 δω 800 ). Both data sets were simultaneously fitted to models 1 to 3 (Table 3). Model selection was performed using AICc and errors of fit for the parameters were estimated by 500 Monte Carlo simulations per residue and individual experiments.
For the synthetic data for model 2 at two different spectrometer frequencies, fits (χ2) generally improved with increasing number of parameters. Using AICc model selection NESSY clearly chooses the correct model (model 2). For the synthetic data for model 3, the correctly chosen model according to AICc was model 3 (Table 3). A summary of extracted parameters is given in Table 4 and fits to models 2 and 3 for both synthetic data sets (each at two spectrometer frequencies) are shown in Figure 5.
Under certain circumstances, motion in different parts of a macromolecule may be concerted. Therefore, multiple residues/nuclei may experience the same exchange. In such cases, it is useful to cluster these residues and fit them simultaneously. Cluster analysis of data recorded at one or more static fields is included in NESSY, where single values of kex and populations are determined for all residues. Individual parameters, δω and ϕ (eq. 4) are obtained for each residue. To demonstrate this function, synthetic data for 4 residues experiencing kex of 306.15 1/s and a minor population (pb) of 0.072 were created using model 3 (2 state slow-limit exchange) consisting of 5% noise. The transverse relaxation rate, , as well as δω were different for each residue. The data was clustered and simultaneously fitted to models 2 or 3 (Figure 6A). The selected model by AICc was model 3 and the extracted exchange constant kex was 286.69 ± 66.5 1/s and population pb was 0.075 ± 0.013. Furthermore, the extracted values for δω and correlated to their initial values (Figure 6B, C).
The software presented in this article permits the automated data analysis of relaxation dispersion experiments. The user has the opportunity to choose to fit to the entire data set or individually selected residues. In addition, data from two or more magnetic fields can be globally fitted. During global fits, field dependency of δω is taken into account. Multiple residues can be grouped and analyzed simultaneously using built-in cluster analysis (individual or global fit).
Maximum flexibility of data entry has been included. For example, peak lists containing peak intensities that are created by any other software can be imported; protein sequences are read either from PDB files, retrieved from the internet using UniProt identifier http://www.expasy.ch or can be added manually; CPMG frequencies are read directly from Bruker VD (variable delay) lists. Furthermore, NESSY is linked to the Bruker Protein Dynamic Center (PDC, starting with version 1.1, http://www.bruker-biospin.com, in collaboration with Dr. Klaus-Peter Neidig) so that projects can be exported in PDC and directly read into NESSY for extended data analysis. In addition, NMRView  tables can be imported directly.
During the calculation, NESSY produces plots and CSV files (text files compatible with Excel and OpenOffice) of the profiles and the individual fits for each calculated residue as well as the extracted values (such as R 2 , k ex and p b ) and statistics (χ2 and AICc) for each model. In addition, color coded structures for selected models, k ex and R ex are created (structures are drawn using Pymol). Furthermore, NESSY offers tools to create custom made 2D and 3D plots in different formats and styles suitable for publication (see Figures 2 to 6).
Synthetic data as well as experimental NMR CPMG relaxation dispersion data were analyzed using NESSY. The quality of fits of the synthetic data was assessed by comparing original values to those extracted from data analysis (Table 1 and 2, Figure 2). For choosing the best fitting model NESSY evaluates the need and benefit for describing the experimental data with more parameters by using AICc, AIC or F-test. By default, model selection is performed using AICc to avoid over fitting, as more parameters (more complex models) may give better fits. The advantage of using AICc (and AIC) compared to F-test is that data do not have to be normally distributed. As relaxation dispersion experiments usually consist of 15 to 20 data sets, normal distribution of data cannot be assumed. For the synthetic data the correct model was consistently chosen and the extracted values were statistically similar to the initial values. To demonstrate usability of NESSY for NMR data, two signals of PCTX1 were analyzed (Figures 3 and 4). One signal experienced slow-limit and the other fast-limit chemical exchange. In the case of slow exchange, the minor populated conformation was present at 0.8%, which is in accordance with the NMR spectra, where only one peak is visible.
To be able to unambiguously distinguish between fast and slow exchange and to extract populations and shift differences, experiments should be collected at two or more different magnetic fields . NESSY supports global fitting at multiple field strengths, while taking the field dependency of δω into account. Synthetic data for models 2 and 3 at two different static frequencies (600 and 800 MHz) were created and analyzed. Data were fitted to models 1 to 3 and in each case the correct model was chosen for both global fits (Tables 3 and 4, Figure 5).
In this article, the main features of NESSY have been presented, such as curve fitting, model selection and data presentation of relaxation dispersion experiments. Nevertheless, NESSY is not limited to these functions. As relaxation dispersion experiments enable the extraction of populations of individual states, NESSY offers tools to calculate the free energy (ΔG) between states. For data that fit best to fast exchange models, the populations of each state cannot be extracted, as k ex and δω cannot be uniquely determined. An integrated function in NESSY is to calculate the populations for residues with known chemical shift differences, such as observed in ligand binding experiments. If a series of experiments are recorded over a temperature range, the temperature dependence of ΔG can be used to extract entropy (ΔS) and enthalpy (ΔH) changes using van't Hoff analysis. NESSY supports automatic van't Hoff analysis of both linear and non-linear models . Furthermore, NESSY can calculate activation energy barriers using transition state theory and the Eyring equation and generate energy landscape plots.
Taken together, we present user-friendly software for NMR relaxation dispersion (15N/13C) data analysis that requires minimal user intervention. In addition, NESSY can be used to analyze biophysical experiments, such as van't Hoff and transition state theory analysis and to create publication quality 2D and 3D plots. Due to its flexibility, users can choose between different relaxation dispersion models and statistical tests for model selection. Results generated by NESSY are aimed to be usable for publication. Tables can be produced directly using CSV files; 2D and 3D plots are created during or after analysis and color coded structures can be directly used in publications. Note that each figure display in this article was created in NESSY. This software is open source and freely available from http://nessy.biochem.unimelb.edu.au. NESSY comes with a detailed manual and tutorial. For additional help, questions can be addressed at the NESSY mailing list (email@example.com).
Availability and requirements
Project name: NESSY
Project home page: http://nessy.biochem.unimelb.edu.au
Operating system(s): Platform independent
Programming language: Python
Other requirements: Scipy, Numpy, wxPython, Matplotlib (None for compiled binaries)
License: GNU GPL v3
Any restrictions to use by non-academics: None
Akaike's Information Criteria
Akaike's Information Criteria with second order correction for small sample size
NMR Relaxation Dispersion Spectroscopy Analysis Software
Nuclear Magnetic Resonance
Nuclear Overhauser Effect.
Grzesiek S, Sass HJ: From biomolecular structure to functional understanding: new NMR developments narrow the gap. Curr Opin Struct Biol 2009, 19(5):585–595. 10.1016/j.sbi.2009.07.015
Cabodi S, Di Stefano P, Leal Mdel P, Tinnirello A, Bisaro B, Morello V, Damiano L, Aramu S, Repetto D, Tornillo G, et al.: Integrins and signal transduction. Adv Exp Med Biol 2010, 674: 43–54. 10.1007/978-1-4419-6066-5_5
Johnson-Winters K, Tollin G, Enemark JH: Elucidating the catalytic mechanism of sulfite oxidizing enzymes using structural, spectroscopic, and kinetic analyses. Biochemistry 2010, 49(34):7242–7254. 10.1021/bi1008485
Tabernero L, Aricescu AR, Jones EY, Szedlacsek SE: Protein tyrosine phosphatases: structure-function relationships. FEBS J 2008, 275(5):867–882. 10.1111/j.1742-4658.2008.06251.x
Palmer AG: NMR probes of molecular dynamics: overview and comparison with other techniques. Annu Rev Biophys Biomol Struct 2001, 30: 129–155. 10.1146/annurev.biophys.30.1.129
O'Connell NE, Grey MJ, Tang Y, Kosuri P, Miloushev VZ, Raleigh DP, Palmer AG: Partially folded equilibrium intermediate of the villin headpiece HP67 defined by 13C relaxation dispersion. J Biomol NMR 2009, 45(1–2):85–98. 10.1007/s10858-009-9340-0
Wang C, Grey MJ, Palmer AG: CPMG sequences with enhanced sensitivity to chemical exchange. J Biomol NMR 2001, 21(4):361–366. 10.1023/A:1013328206498
Loria JP, Berlow RB, Watt ED: Characterization of enzyme motions by solution NMR relaxation dispersion. Acc Chem Res 2008, 41(2):214–221. 10.1021/ar700132n
Cavanagh J, Venters RA: Protein dynamic studies move to a new time slot. Nat Struct Biol 2001, 8(11):912–914. 10.1038/nsb1101-912
Brath U, Akke M: Differential responses of the backbone and side-chain conformational dynamics in FKBP12 upon binding the transition-state analog FK506: implications for transition-state stabilization and target protein recognition. J Mol Biol 2009, 387(1):233–244. 10.1016/j.jmb.2009.01.047
Baldwin AJ, Kay LE: NMR spectroscopy brings invisible protein states into focus. Nat Chem Biol 2009, 5(11):808–814. 10.1038/nchembio.238
Peng JW: Communication breakdown: protein dynamics and drug design. Structure 2009, 17(3):319–320. 10.1016/j.str.2009.02.004
Luz Z, Meiboom S: Nuclear magnetic resonance study of protolysis of trimethylammonium ion in aqueous solution - order of reaction with respect to solvent. Journal of Chemical Physics 1963, 39(2):366. 10.1063/1.1734254
Carver JP, Richards RE: General 2-site solution for chemical exchange produced dependence of T2upon carr-purcell pulse separation. Journal of Magnetic Resonance 1972, 6(1):89.
Demers JP, Mittermaier A: Binding mechanism of an SH3 domain studied by NMR and ITC. J Am Chem Soc 2009, 131(12):4355–4367. 10.1021/ja808255d
d'Auvergne EJ, Gooley PR: The use of model selection in the model-free analysis of protein dynamics. J Biomol NMR 2003, 25(1):25–39. 10.1023/A:1021902006114
Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, Bax A: NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 1995, 6(3):277–293.
Vranken WF, Boucher W, Stevens TJ, Fogh RH, Pajon A, Llinas M, Ulrich EL, Markley JL, Ionides J, Laue ED: The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 2005, 59(4):687–696. 10.1002/prot.20449
Escoubas P, De Weille JR, Lecoq A, Diochot S, Waldmann R, Champigny G, Moinier D, Menez A, Lazdunski M: Isolation of a tarantula toxin specific for a class of proton-gated Na+ channels. J Biol Chem 2000, 275(33):25116–25121. 10.1074/jbc.M003643200
Kovrigin EL, Kempf JG, Grey MJ, Loria JP: Faithful estimation of dynamics parameters from CPMG relaxation dispersion measurements. Journal of Magnetic Resonance 2006, 180(1):93–104. 10.1016/j.jmr.2006.01.010
Johnson BA, Blevins RA: NMR View - a computer-program for the visualization and analysis of nmr data. J Biomol NMR 1994, 4(5):603–614. 10.1007/BF00404272
Niedzwiecka A, Stepinski J, Darzynkiewicz E, Sonenberg N, Stolarski R: Positive heat capacity change upon specific binding of translation initiation factor eIF4E to mRNA 5' cap. Biochemistry 2002, 41(40):12140–12148. 10.1021/bi0258142
This work was supported by the Australian Research Council (ARC) and an equipment grant of the Rowden White Foundation (University of Melbourne). M.B. is a recipient of Swiss National Science Foundation (SNF) fellowships. The authors also thank Edward d'Auvergne for the comprehensive literature review about the hidden radian unit, which can be accessed in the NESSY help menu.
The authors declare that they have no competing interests.
MB wrote the code, conducted the NMR experiments and data analysis and wrote the manuscript. PRG supervised the project, assisted in acquisition and processing of NMR experiments and wrote the manuscript. All authors read and approved the final manuscript.
Authors’ original submitted files for images
About this article
Cite this article
Bieri, M., Gooley, P.R. Automated NMR relaxation dispersion data analysis using NESSY. BMC Bioinformatics 12, 421 (2011). https://doi.org/10.1186/1471-2105-12-421
- Protein dynamics
- conformational/chemical exchange
- μs-ms motion
- van't Hoff
- transition state theory