Identification of metabolites from 2D 1H-13C HSQC NMR using peak correlation plots
© Öman et al.; licensee BioMed Central. 2014
Received: 11 April 2014
Accepted: 8 December 2014
Published: 16 December 2014
Identification of individual components in complex mixtures is an important and sometimes daunting task in several research areas such as metabolomics and natural product studies. NMR spectroscopy is an excellent technique for analysis of mixtures of organic compounds and gives a detailed chemical fingerprint of most individual components above the detection limit. For the identification of individual metabolites in metabolomics, correlation or covariance between peaks in 1H NMR spectra has previously been successfully employed. Similar correlation of 2D 1H-13C Heteronuclear Single Quantum Correlation spectra was recently applied to investigate the structure of heparin. In this paper, we demonstrate how a similar approach can be used to identify metabolites in human biofluids (post-prostatic palpation urine).
From 50 1H-13C Heteronuclear Single Quantum Correlation spectra, 23 correlation plots resembling pure metabolites were constructed. The identities of these metabolites were confirmed by comparing the correlation plots with reported NMR data, mostly from the Human Metabolome Database.
Correlation plots prepared by statistically correlating 1H-13C Heteronuclear Single Quantum Correlation spectra from human biofluids provide unambiguous identification of metabolites. The correlation plots highlight cross-peaks belonging to each individual compound and, unlike conventional NMR experiments, are not limited by long-range magnetization transfer.
Keywords: HSQC, Correlation, Metabolite, Biofluid, Identification
NMR (nuclear magnetic resonance) spectroscopy is well suited for analysis of complex mixtures of organic compounds and has some distinct advantages compared to other analytical techniques such as GC-MS (gas chromatography–mass spectrometry) and LC-MS (liquid chromatography–mass spectrometry). Most importantly, NMR spectroscopy is highly reproducible, does not require any sample derivatization and gives detailed structural information about the components of a mixture. The drawback of NMR spectroscopy is its inherently low sensitivity compared to MS-based methods, but it has nevertheless become a cornerstone in metabolomic studies.
The vast majority of NMR-based metabolomics studies have been based on 1D 1H NMR experiments because of the high sensitivity of the 1H nucleus. Recent technical advances with higher magnetic fields and the introduction of cryogenic probes have drastically increased the sensitivity and thereby reduced experimental times for inverse detection experiments of other nuclei such as 13C and 31P. This allows analyses of large data sets of dilute samples, e.g. biofluids, within a reasonable timeframe. Heteronuclear 2D NMR methods provide additional structural information and are important tools for structure elucidation of new compounds. There are a number of inverse heteronuclear 2D NMR experiments available, and the two most important are Heteronuclear Single Quantum Correlation (HSQC) and Heteronuclear Multiple-Bond Correlation (HMBC). HSQC spectra reveal the chemical shifts of 1H and X-nuclei directly bonded to each other, whereas HMBC spectra reveal correlations over multiple bonds (typically 2–3). The 1H-13C HSQC experiment in particular has had a pivotal role in organic chemistry. In addition to being a relatively sensitive experiment, the large chemical shift range for 13C in a 1H-13C HSQC spectrum reduces spectral overlap, which greatly benefits compound identification. Compared to 1D 1H NMR spectra, the 1H-13C HSQC spectrum provides a more detailed biochemical fingerprint, which has recently spurred interest in HSQC-based metabolic profiling and multivariate analysis of human biofluids.
In order to draw biologically relevant conclusions from metabolomics studies, identification of key metabolites is required. This can be challenging considering the vast number of metabolites present in biological samples such as human biofluids, extracts of plants or cell cultures, which results in many overlapping peaks in the NMR spectra. For 1H NMR this has partly been resolved by fitting the experimental spectra to simulated or experimentally obtained spectra from single metabolites. Another interesting approach to identify metabolites is Statistical Total Correlation Spectroscopy (STOCSY), which utilizes statistical correlation between peaks throughout a series of spectra. Peaks that vary in intensity in a highly correlated manner are likely to belong to the same compound. Correlations may also be observed between related compounds, e.g. metabolites belonging to the same biological pathway, but such intermolecular correlations should always be weaker than intramolecular ones. Together with established multivariate methods such as principal component analysis (PCA) or orthogonal projections to latent structures (OPLS), this approach can be used to identify metabolites that vary between different classes of samples. Since STOCSY was first reported, a number of related tools have emerged which are useful for metabolic pathway analysis as well as biomarker identification. A common denominator for these tools is that they exploit the statistical correlation between spectral data from multiple biological mixtures. In contrast, a number of tools for statistical correlation of NMR data recorded from a single sample have also been developed. These tools include covariance NMR, indirect covariance NMR and higher-rank correlation NMR. Covariance NMR is an alternative to traditional 2D Fourier transformation of homonuclear 2D spectra like Total Correlation Spectroscopy (TOCSY) and Nuclear Overhauser Effect Spectroscopy (NOESY).
By correlating the data along the indirect dimension, highly resolved 2D correlation plots can be produced with fewer t1 increments than standard 2D Fourier transformation requires. Indirect covariance NMR uses the same principles to generate 2D pseudospectra from more easily obtained spectra, like 13C-13C correlation spectra from 1H-13C HSQC-TOCSY or from a combination of 1H-1H Correlation Spectroscopy (COSY) and 1H-13C HSQC. Higher-rank correlation NMR takes it one step further, correlating 2D NMR data from two or more sources, forming 3D or higher dimensional correlation spectra. An example of this method, and relevant to the work presented in this paper, is the merging of 1H-13C HSQC and 2D 1H-13C HSQC-TOCSY spectra to form a triple rank (3R) HSQC-TOCSY spectrum. From this spectrum, HSQC spectra of individual mixture components may be extracted, provided that the involved protons belong to the same spin system. If a compound consists of multiple isolated spin systems, these correlation methods will fail to reveal all associated peaks. This is not the case for the STOCSY-like methods, since correlations do not depend on any spin-spin couplings across multiple bonds.
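The covariance transform mentioned above is commonly written as the matrix square root of S^T S, where S holds the mixed time-frequency data. The following is a minimal sketch under stated assumptions: `S` is a hypothetical real-valued array with t1 increments as rows and frequency points as columns, the data are synthetic, and processing details such as mean-centering or apodization are ignored. The function name `covariance_spectrum` is illustrative only.

```python
import numpy as np

def covariance_spectrum(S):
    """Return (S^T S)^(1/2) for a real 2D array S.

    Rows of S are t1 increments, columns are frequency points (assumed layout).
    """
    # S = U diag(s) Vt  =>  S^T S = V diag(s**2) V^T  =>  square root = V diag(s) V^T
    _, s, Vt = np.linalg.svd(S, full_matrices=False)
    return (Vt.T * s) @ Vt

# Synthetic stand-in for real data: 32 increments, 128 frequency points
rng = np.random.default_rng(0)
S = rng.normal(size=(32, 128))
C = covariance_spectrum(S)   # symmetric 128 x 128 correlation map
```

The resulting map is square in the frequency dimension regardless of how few increments were recorded, which is the property the text refers to. Swapping the roles of rows and columns (S S^T instead of S^T S) would give the analogous indirect-dimension map.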
In STOCSY, peaks which originate from the same compound should correlate perfectly, but overlapping peaks from several metabolites in crowded regions of 1H spectra will have a negative impact on the correlation. This may preclude the detection of important resonances from key metabolites. In a recent paper by Rudd et al., a STOCSY-like correlation method using 2D HSQC instead of 1D 1H NMR data was presented. This method, termed HSQCcos, was used to extract structural information from different compositions of the heterogeneous polysaccharide heparin. Concurrently with Rudd et al., we have worked on correlating HSQC spectra from post-prostatic palpation urine. The aim of this paper is to demonstrate that the method can be used for unambiguous metabolite identification in biofluids. With increased use of HSQC data in multivariate analysis, we envision that the HSQCcos method will become a valuable asset for interpretation of multivariate models.
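The idea of a STOCSY-like correlation on 2D data can be illustrated with a short sketch. It is not the published HSQCcos implementation: the function name `correlation_plot`, the array layout (samples × F1 × F2) and the synthetic data are assumptions made for illustration. One data point is chosen as the driver, and the Pearson correlation coefficient between its intensity and that of every other data point is computed across the series of spectra.

```python
import numpy as np

def correlation_plot(spectra, driver):
    """Pearson correlation of every data point with one driver point.

    spectra: array of shape (n_samples, n_f1, n_f2), one 2D spectrum per sample.
    driver:  (f1_index, f2_index) of the selected cross-peak.
    """
    n = spectra.shape[0]
    flat = spectra.reshape(n, -1).astype(float)
    centered = flat - flat.mean(axis=0)              # remove the mean of each data point
    d = centered[:, np.ravel_multi_index(driver, spectra.shape[1:])]
    cov = centered.T @ d
    norm = np.sqrt((centered ** 2).sum(axis=0) * (d ** 2).sum())
    r = np.divide(cov, norm, out=np.zeros_like(cov), where=norm > 0)
    return r.reshape(spectra.shape[1:])

# Synthetic series: compound A gives two cross-peaks, compound B gives one
rng = np.random.default_rng(1)
conc_a = rng.uniform(0.5, 2.0, size=50)
conc_b = rng.uniform(0.5, 2.0, size=50)
spectra = rng.normal(scale=0.01, size=(50, 16, 16))
spectra[:, 2, 3] += conc_a
spectra[:, 10, 12] += conc_a
spectra[:, 5, 5] += conc_b

r = correlation_plot(spectra, driver=(2, 3))
```

In this toy example the second cross-peak of compound A correlates almost perfectly with the driver, while the cross-peak of the independently varying compound B does not, regardless of whether the two peaks of A share a spin system.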
Sample preparation and NMR analyses
The study was approved by The Regional Committee for Medical and Health Research Ethics (Norwegian Health Region III) and informed written consent was obtained from all 50 patients.
The 50 frozen (−80°C) urine samples from 50 different patients, collected after transrectal palpation of the prostate (three strokes over each lobe), were thawed at room temperature for 20 minutes. Each sample (1 ml) was spun at 13000 g for 5 min and 540 μl of the supernatant was mixed with 60 μl D2O containing PBS buffer and TSP-d4, resulting in a total volume of 600 μl. The samples were vortexed and transferred to 5 mm NMR tubes (Bruker Biospin, Rheinstetten, Germany) before analysis. The spectra were acquired using a Bruker Avance III 600 MHz spectrometer equipped with a QCI cryoprobe. A Bruker SampleJet and ICON-NMR software (Bruker Biospin) were used to record all spectra automatically. The spectra were obtained at a constant temperature of 300 K using the HSQC (hsqcetgpsisp.2) pulse sequence with 256 increments, 16 transients, a 1 s relaxation delay, sweep widths of 16 and 165 ppm and offsets of 4.7 and 75 ppm for the 1H and 13C dimensions, respectively. The sequence was optimized for direct coupling constants of 145 Hz, which is a common compromise between aliphatic and aromatic signals. Total acquisition time for each experiment was 77.5 minutes. The data were processed with Topspin 3.2 (Bruker Biospin) using a 90° shifted qsine window function to a total of 1024 × 512 data points (F2 × F1), followed by automated baseline and phase correction.
All spectra were calibrated relative to the TSP peak in both dimensions. Most of the metabolites were identified by comparison with reference spectra from the Human Metabolome Database (HMDB).
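For readers who want to relate chemical shifts to positions in the processed data matrix, the stated sweep widths, offsets and matrix size can be turned into approximate ppm axes. This is a sketch under simplifying assumptions: the axes are taken as evenly spaced with both end points included, and the exact point positions in the processing software may differ slightly. The helper names `ppm_axis` and `nearest_index` are hypothetical.

```python
import numpy as np

def ppm_axis(offset_ppm, sweep_width_ppm, n_points):
    """Evenly spaced chemical shift axis running from high to low ppm."""
    high = offset_ppm + sweep_width_ppm / 2
    low = offset_ppm - sweep_width_ppm / 2
    return np.linspace(high, low, n_points)

def nearest_index(axis, ppm):
    """Index of the data point closest to a given chemical shift."""
    return int(np.argmin(np.abs(axis - ppm)))

# Values taken from the acquisition and processing parameters above
h_axis = ppm_axis(4.7, 16.0, 1024)    # F2, 1H
c_axis = ppm_axis(75.0, 165.0, 512)   # F1, 13C

# Locate one data point for a cross-peak at 3.26 / 62.0 ppm (1H / 13C)
driver = (nearest_index(c_axis, 62.0), nearest_index(h_axis, 3.26))
```

With these parameters one data point spans roughly 0.016 ppm in the 1H dimension and 0.32 ppm in the 13C dimension, which gives a sense of how much chemical shift variation a single selected data point can tolerate.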
Results and discussion
Identified metabolites from post-prostatic palpation urine
Selected peak (1H / 13C) [ppm]    Number of correlating peaks (found / expected)
9.12 / 148.4                      5 / 5
7.82 / 129.6                      4 / 4
7.69 / 119.9                      4 / 5
7.41 / 131.5                      8 / 8
6.58 / 133.5                      3 / 2
5.45 / 104.0                      9 / 7
4.56 / 66.8                       4 / 4
3.92 / 56.5                       2 / 2
3.80 / 72.0                       4 / 4
3.69 / 74.9                       3 / 3
3.66 / 72.6                       3 / 3
3.56 / 44.2                       1 / 1
3.44 / 38.0                       2 / 2
3.44 / 46.1                       3 / 3
3.36 / 51.6                       1 / 1
3.28 / 30.3                       1 / 1
3.26 / 55.8                       2 / 2
3.26 / 62.0                       1 / 1
3.14 / 44.2                       2 / 2
2.98 / 51.6                       5 / 4
2.72 / 37.4                       1 / 1
2.54 / 48.1                       2 / 2
2.40 / 36.7                       1 / 1
2.14 / 29.0                       3 / 3
1.92 / 26.1                       1 / 1
1.81 / 25.4                       5 / 5
1.70 / 29.0                       5 / 6
1.54 / 28.3                       2 / 2
1.26 / 30.6                       2 / 2
1.20 / 17.6                       4 / 4
Although HSQC experiments are optimized for a direct coupling 1J(1H-13C) of 145 Hz, long-range cross-peaks due to large 2J or 3J couplings can often be seen. These peaks are present in the original spectra at low intensities, but appear clearly in the correlation plots as they are just as highly correlated to the chosen peak as the peaks from 1J(1H-13C) couplings. These peaks resemble what would be expected in a 1H-13C HMBC spectrum and provide additional information that could benefit structural assignment. One example of such a long-range cross-peak is 2.27 / 30.4 ppm (1H / 13C), as noted in Figure 3 for phenylacetylglutamine. Naturally, for metabolites at low concentration, these peaks fall below the detection limit.
Merging all the produced correlation plots gives the combined spectrum shown in Figure 2b. This spectrum also includes peaks from 7 metabolites with only one HSQC cross-peak, namely acetic acid, dimethylamine, glycine, methanol, 1-methyluric acid, succinic acid, and trimethylamine N-oxide (TMAO). These are all expected urine metabolites and their cross-peaks did not correlate with any other peaks (with correlation coefficient >0.8). Correlation plots of each individual metabolite are available in Additional file 1.
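How the individual correlation plots were merged is not detailed here, so the following is only one plausible sketch: each plot is assumed to be an array of correlation coefficients on a common grid, the point-wise maximum is kept, and values not exceeding the 0.8 level mentioned above are set to zero. The function name `merge_correlation_plots` and the small hand-made arrays are hypothetical.

```python
import numpy as np

def merge_correlation_plots(plots, threshold=0.8):
    """Merge correlation plots by point-wise maximum above a threshold.

    Returns the merged plot and, per data point, the index of the plot that
    contributed the value (-1 where nothing exceeds the threshold).
    """
    stacked = np.stack(plots)                 # (n_plots, n_f1, n_f2)
    owner = stacked.argmax(axis=0)
    merged = stacked.max(axis=0)
    keep = merged > threshold
    return np.where(keep, merged, 0.0), np.where(keep, owner, -1)

# Two toy correlation plots on a 4 x 4 grid
plot_a = np.zeros((4, 4))
plot_a[0, 1], plot_a[2, 3], plot_a[1, 1] = 1.0, 0.95, 0.4
plot_b = np.zeros((4, 4))
plot_b[3, 0], plot_b[1, 1] = 1.0, 0.3

merged, owner = merge_correlation_plots([plot_a, plot_b])
```

Keeping track of the contributing plot makes it possible to colour or label each cross-peak in the combined spectrum by metabolite.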
Not all cross-peaks may be accounted for, but the combined spectrum shows clear resemblance to the real HSQC spectrum in Figure 2a. Each HSQC cross-peak is usually defined by more than one data point, meaning that each data point or coordinate is likely to correlate very well with one or more of its neighbors. This explains why some peaks in Figure 2b appear broader than others, including the signal from TMAO at 3.26 / 62.0 ppm (1H / 13C), which is slightly phase distorted in some of the recorded HSQC spectra. Correlation to such clusters of data points can prove beneficial in cases where the number of recorded spectra is low, clearly distinguishing correlation to real cross-peaks from coincidentally correlating data points (e.g. regions with many overlapping signals).
In biofluids, and especially in urine samples, chemical shift variation can be substantial due to differences in ionic strength and pH. Nevertheless, the current results show that spectra from challenging and complex biofluids can be used to create HSQC correlation plots without any peak alignment algorithm. In extreme cases, chemical shift variation will result in low correlation between peaks belonging to the same compound, and peak alignment tools like icoshift, adapted to HSQC spectra, might remedy this. For small deviations in chemical shift, however, the method is robust, as demonstrated here by the use of non-aligned spectra.
Selecting only one data point within each peak to create correlation plots proved very satisfactory. However, the method could be further expanded by selecting multiple data points for each cross-peak (e.g. all points within predefined 1H and 13C NMR chemical shift ranges), generating multiple correlation plots that could be merged into one. For this merged correlation plot we should expect more clusters of actually correlated cross-peaks, distinguishing them from coincidentally correlating data points.
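The proposed extension can be sketched as follows. This is an untested idea rather than part of the reported analysis, and several choices are assumptions: the window is given as index ranges rather than ppm ranges, the individual correlation plots are merged by simple averaging, and the data are synthetic. The function name `multi_driver_plot` is hypothetical.

```python
import numpy as np

def multi_driver_plot(spectra, f1_range, f2_range):
    """Average correlation plot over all driver points in an index window.

    spectra: (n_samples, n_f1, n_f2) stack of 2D spectra.
    f1_range, f2_range: (start, stop) index ranges, stop exclusive.
    """
    n, n_f1, n_f2 = spectra.shape
    flat = spectra.reshape(n, -1).astype(float)
    centered = flat - flat.mean(axis=0)
    scale = np.sqrt((centered ** 2).sum(axis=0))
    # Unit-norm columns, so dot products between columns are Pearson coefficients
    z = np.divide(centered, scale, out=np.zeros_like(centered), where=scale > 0)
    rows, cols = np.meshgrid(np.arange(*f1_range), np.arange(*f2_range), indexing="ij")
    drivers = np.ravel_multi_index((rows.ravel(), cols.ravel()), (n_f1, n_f2))
    r = z.T @ z[:, drivers]                   # one correlation plot per driver, as columns
    return r.mean(axis=1).reshape(n_f1, n_f2)

# Synthetic series: compound A has two cross-peaks of 2 x 2 points, compound B one point
rng = np.random.default_rng(2)
conc_a = rng.uniform(0.5, 2.0, size=50)
conc_b = rng.uniform(0.5, 2.0, size=50)
spectra = rng.normal(scale=0.01, size=(50, 16, 16))
spectra[:, 2:4, 3:5] += conc_a[:, None, None]
spectra[:, 10:12, 12:14] += conc_a[:, None, None]
spectra[:, 6, 6] += conc_b

merged = multi_driver_plot(spectra, f1_range=(2, 4), f2_range=(3, 5))
```

Averaging over several drivers dampens coincidental correlations to any single data point, while data points that belong to the same compound remain highly correlated with every driver in the window.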
Structure elucidation by an HSQC spectrum alone is a difficult task since it lacks the long-range couplings needed to identify extended spin systems. Regardless, HSQC spectra of individual metabolites represent useful fingerprints for structure confirmation, especially with more reference spectra like those from HMDB becoming available. When real reference spectra are not available, the HSQC correlation plots may be compared to calculated spectra from quantum mechanically based NMR prediction software. In principle, similar correlation plots could be produced from other 2D NMR spectra like COSY, TOCSY or HMBC. If sample integrity is preserved during acquisition, metabolite variation should be identical within each type of 2D spectrum. This implies that a selected HSQC cross-peak not only correlates with other HSQC cross-peaks belonging to the same compound, but also with e.g. the corresponding COSY cross-peaks. Combining 2D NMR spectra this way constitutes a powerful tool for the elucidation of novel compounds without tedious and often difficult chromatographic separation.
In this paper, we have shown how covariance analysis of 2D 1H-13C HSQC spectra can be used to create sub-spectra from individual metabolites in complex human biofluids. These sub-spectra are derived from the variation in metabolic composition within a series of spectra and do not depend on long-range magnetization transfer between spins. As a result, HSQC cross-peaks from isolated spin systems, separated by magnetically silent regions, are effectively displayed in the same plot. From the post-prostatic palpation urine spectra, 23 metabolites were easily identified by their sub-spectra. The results demonstrate that HSQCcos in general is a useful tool for identifying key metabolites in biofluids, producing HSQC spectra resembling pure compounds without chromatographic separation. These spectra provide useful fingerprints for database queries. If combined with similar analyses of additional 2D NMR datasets such as COSY and/or TOCSY, complete structure elucidation could be achieved without isolating the individual components.
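Correlating across different types of 2D spectra can be sketched in the same way as within one type. The example below is illustrative only and was not part of the reported analysis: it assumes that an HSQC and a COSY spectrum were recorded for each sample in the same sample order, and it uses synthetic data. The function name `cross_dataset_correlation` is hypothetical.

```python
import numpy as np

def cross_dataset_correlation(driver_values, other_spectra):
    """Pearson correlation between one driver intensity per sample and
    every data point of a second series of 2D spectra."""
    n = other_spectra.shape[0]
    flat = other_spectra.reshape(n, -1).astype(float)
    centered = flat - flat.mean(axis=0)
    d = np.asarray(driver_values, dtype=float)
    d = d - d.mean()
    norm = np.sqrt((centered ** 2).sum(axis=0) * (d ** 2).sum())
    r = np.divide(centered.T @ d, norm, out=np.zeros(flat.shape[1]), where=norm > 0)
    return r.reshape(other_spectra.shape[1:])

# Synthetic series: one compound seen in both HSQC and COSY, a second one only in COSY
rng = np.random.default_rng(3)
conc = rng.uniform(0.5, 2.0, size=50)
other = rng.uniform(0.5, 2.0, size=50)
hsqc = rng.normal(scale=0.01, size=(50, 8, 8))
hsqc[:, 1, 2] += conc
cosy = rng.normal(scale=0.01, size=(50, 8, 8))
cosy[:, 4, 5] += conc
cosy[:, 6, 6] += other

r = cross_dataset_correlation(hsqc[:, 1, 2], cosy)
```

Here the HSQC driver picks out the COSY cross-peak of the same compound and ignores the cross-peak of the independently varying one, which is the behaviour the paragraph above anticipates.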
Availability of supporting data
The data set supporting the results of this article is included within the article and its additional files.
Acknowledgements
The NMR acquisitions were performed at the MR Core Facility, Norwegian University of Science and Technology (NTNU). We also acknowledge the Clinical Research Facility, St. Olav University Hospital for sample collection, and The Regional Biobank of Central Norway, St. Olav University Hospital for safe storage and database facilities. This study made use of the “NMR for Life” infrastructure, which is supported by the Knut and Alice Wallenberg foundation, the University of Gothenburg and Umeå University. The authors wish to thank Dr. Henrik Antti for helpful discussions about statistical analysis of spectroscopic data.
References
- Lindon JC, Holmes E, Nicholson JK: Toxicological applications of magnetic resonance. Prog Nucl Magn Reson Spectrosc. 2004, 45 (1–2): 109-143. 10.1016/j.pnmrs.2004.05.001.
- Mavel S, Nadal-Desbarats L, Blasco H, Bonnet-Brilhault F, Barthélémy C, Montigny F, Sarda P, Laumonnier F, Vourc'h P, Andres CR, Emond P: 1H–13C NMR-based urine metabolic profiling in autism spectrum disorders. Talanta. 2013, 114: 95-102. 10.1016/j.talanta.2013.03.064.
- Rai RK, Sinha N: Fast and Accurate Quantitative Metabolic Profiling of Body Fluids by Nonlinear Sampling of 1H–13C Two-Dimensional Nuclear Magnetic Resonance Spectroscopy. Anal Chem. 2012, 84 (22): 10005-10011. 10.1021/ac302457s.
- Nicholson JK, Foxall PJ, Spraul M, Farrant RD, Lindon JC: 750 MHz 1H and 1H-13C NMR spectroscopy of human blood plasma. Anal Chem. 1995, 67 (5): 793-811. 10.1021/ac00101a004.
- Weljie AM, Newton J, Mercier P, Carlson E, Slupsky CM: Targeted profiling: Quantitative analysis of H-1 NMR metabolomics data. Anal Chem. 2006, 78 (13): 4430-4442. 10.1021/ac060209g.
- Cloarec O, Dumas M-E, Craig A, Barton RH, Trygg J, Hudson J, Blancher C, Gauguier D, Lindon JC, Holmes E, Nicholson J: Statistical Total Correlation Spectroscopy: An Exploratory Approach for Latent Biomarker Identification from Metabolic 1H NMR Data Sets. Anal Chem. 2005, 77 (5): 1282-1289. 10.1021/ac048630x.
- Holmes E, Cloarec O, Nicholson J: Probing latent biomarker signatures and in vivo pathway activity in experimental disease states via statistical total correlation spectroscopy (STOCSY) of biofluids: application to HgCl2 toxicity. J Proteome Res. 2006, 5 (6): 1313-1320. 10.1021/pr050399w.
- Jackson JE: A Users Guide to Principal Components. 1991, John Wiley, New York.
- Wold S, Esbensen K, Geladi P: Principal component analysis. Chemometr Intell Lab. 1987, 2 (1): 37-52. 10.1016/0169-7439(87)80084-9.
- Trygg J, Wold S: Orthogonal projections to latent structures (O‐PLS). J Chemometr. 2002, 16 (3): 119-128. 10.1002/cem.695.
- Robinette SL, Lindon JC, Nicholson JK: Statistical Spectroscopic Tools for Biomarker Discovery and Systems Medicine. Anal Chem. 2013, 85 (11): 5297-5303. 10.1021/ac4007254.
- Brüschweiler R, Zhang F: Covariance nuclear magnetic resonance spectroscopy. J Chem Phys. 2004, 120: 5253. 10.1063/1.1647054.
- Zhang F, Brüschweiler R: Indirect covariance NMR spectroscopy. J Am Chem Soc. 2004, 126 (41): 13180-13181. 10.1021/ja047241h.
- Bingol K, Salinas RK, Brüschweiler R: Higher-rank correlation NMR spectra with spectral moment filtering. J Phys Chem Lett. 2010, 1 (7): 1086-1089. 10.1021/jz100264g.
- Zhang F, Bruschweiler-Li L, Brüschweiler R: Simultaneous de novo identification of molecules in chemical mixtures by doubly indirect covariance NMR spectroscopy. J Am Chem Soc. 2010, 132 (47): 16922-16927. 10.1021/ja106781r.
- Bingol K, Brüschweiler R: Deconvolution of Chemical Mixtures with High Complexity by NMR Consensus Trace Clustering. Anal Chem. 2011, 83 (19): 7412-7417. 10.1021/ac201464y.
- Rudd TR, Macchi E, Muzi L, Ferro M, Gaudesi D, Torri G, Casu B, Guerrini M, Yates EA: Unravelling Structural Information from Complex Mixtures Utilizing Correlation Spectroscopy Applied to HSQC Spectra. Anal Chem. 2013, 85 (15): 7487-7493. 10.1021/ac4014379.
- Wishart DS, Jewison T, Guo AC, Wilson M, Knox C, Liu Y, Djoumbou Y, Mandal R, Aziat F, Dong E, Bouatra S, Sinelnikov I, Arndt D, Xia J, Liu P, Yallou F, Bjorndahl T, Perez-Pineiro R, Eisner R, Allen F, Neveu V, Greiner R, Schalbert A: HMDB 3.0-The Human Metabolome Database in 2013. Nucleic Acids Res. 2013, 41 (D1): D801-D807. 10.1093/nar/gks1065.
- Posada-Ayala M, Zubiri I, Martin-Lorenzo M, Sanz-Maroto A, Molero D, Gonzalez-Calero L, Fernandez-Fernandez B, de la Cuesta F, Laborde CM, Barderas MG: Identification of a urine metabolomic signature in patients with advanced-stage chronic kidney disease. Kidney Int. 2013, 85: 103-111. 10.1038/ki.2013.328.
- Jewison T, Knox C, Neveu V, Djoumbou Y, Guo AC, Lee J, Liu P, Mandal R, Krishnamurthy R, Sinelnikov I: YMDB: the yeast metabolome database. Nucleic Acids Res. 2012, 40 (D1): D815-D820. 10.1093/nar/gkr916.
- Kang S-M, Park J-C, Shin M-J, Lee H, Oh J, Ryu DH, Hwang G-S, Chung JH: 1H nuclear magnetic resonance based metabolic urinary profiling of patients with ischemic heart failure. Clin Biochem. 2011, 44 (4): 293-299. 10.1016/j.clinbiochem.2010.11.010.
- Matsumoto M, Zhang C, Kosugi C, Matsumoto I: Gas chromatography–mass spectrometric studies of canine urinary metabolism. J Vet Med Sci. 1995, 57 (2): 205-211. 10.1292/jvms.57.205.
- Savorani F, Tomasi G, Engelsen SB: icoshift: A versatile tool for the rapid alignment of 1D NMR spectra. J Magn Reson. 2010, 202 (2): 190-202. 10.1016/j.jmr.2009.11.012.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.