High-throughput peptide quantification using mTRAQ reagent triplex
© Yoon et al; licensee BioMed Central Ltd. 2011
- Published: 15 February 2011
Protein quantification is an essential step in many proteomics experiments. A number of labeling approaches have been proposed and adopted for mass spectrometry (MS) based relative quantification. mTRAQ, one of the stable isotope labeling reagents, is amine-specific and available in triplex format, so sample throughput can be doubled compared with duplex reagents.
Methods and results
Here we propose a novel data analysis algorithm for peptide quantification in triplex mTRAQ experiments. It improves the accuracy of quantification in two ways. First, it identifies and separates the triplex isotopic clusters of a peptide in each full MS scan. We designed a schematic model of overlapping triplex isotopic clusters and separated the clusters by solving cubic equations derived from that model. Second, it automatically determines the elution areas of peptides. Some peptides have similar masses and elution times, so their elution areas can overlap. Our algorithm successfully identified these overlaps and found accurate elution areas. We validated our algorithm using standard protein mixture experiments.
We showed that our algorithm was able to accurately quantify peptides in triplex mTRAQ experiments. Its software implementation is compatible with the Trans-Proteomic Pipeline (TPP), and thus enables high-throughput analysis of proteomics data.
- Isotopic Peak
- Stable Isotope Label
- Isotopic Cluster
- Data Analysis Algorithm
- Peptide Quantification
The introduction of mass spectrometry (MS) has provided a wealth of biological information on proteins for both qualitative and quantitative analysis. Recently, quantitative analyses have become of particular interest in proteomics research. To determine differences in protein expression across samples representing different physiological or disease states, various experimental approaches have been developed: spectral counting, stable isotope labeling, and label-free quantification.
Stable isotope labeling is one of the popular methods for protein quantification. Peptides from two or more samples are labeled with different stable isotopes to introduce mass shifts. They are then analyzed in a single LC/MS run, so sample throughput is multiplied compared with label-free quantification. There are various labeling techniques, including ICAT, SILAC, ¹⁸O labeling, iTRAQ, and mTRAQ. Numerous computational tools for stable isotope labeling have also been developed, including XPRESS, ASAPRatio, STEM, ZoomQuant, MSInspect, Multi-Q, Q3, VIPER, MaxQuant, Census, and IEMM.
In this paper, we focus on the isotope label mTRAQ, a nonisobaric variant of iTRAQ that was originally designed for multiple reaction monitoring (MRM). The mTRAQ labels were first designed in two versions. The heavy label is identical to the iTRAQ 117 label, and its mass is 145 Da. The light label is chemically identical to the heavy label but contains no ¹³C or ¹⁵N, so its mass is 141 Da. The labels attach to lysine residues and the peptide N-terminus. We have verified that mTRAQ is a powerful isotope label for MS-based relative quantification, and developed a new algorithm to improve the accuracy of peptide quantification in MS experiments based on mTRAQ labeling. Recently, mTRAQ has become available in triplex format, in which a label of 149 Da is added.
One of the major obstacles to accurate peptide quantification is the overlap of isotopic clusters. There are two types of overlap problem: overlap between differently labeled forms of the same peptide, and overlap between chemically different peptides. The former can happen when the mass difference between labels is very small. In mTRAQ experiments, the mass difference between differently labeled peptides is only 4 Da if the original peptide has no lysine, so it is important to separate their isotopic clusters correctly. The latter can occur in all kinds of MS-based experiments. In peptide quantification, we are usually interested in the relative quantification of peptides whose amino acid sequences are known. When the sequences of the peptides of interest are known, there is a better chance of recognizing overlaps caused by differential labeling by comparing the observed clusters with the theoretical isotopic distributions.
In this manuscript, we present a new data analysis algorithm for peptide quantification in triplex mTRAQ experiments. It is an extension of the algorithm for duplex mTRAQ experiments. We identify the isotopic clusters of triplex-labeled peptides and, when they overlap, separate their intensities using cubic equation modeling. We also designed an algorithm that automatically determines the elution area of a peptide and can recognize overlap between chemically different peptides. We demonstrate the performance of our algorithm using standard protein mixture experiments.
Preparation of standard samples
[Table: Standard protein mixtures. Only the entry "cytochrome c (CYCS)" is recoverable from the table body.]
The standard protein mixtures were labeled with mTRAQ™ reagent (AB Sciex, Foster City, CA, USA) as described previously. Proteins were reduced with 50 mM tris(2-carboxyethyl)phosphine (Thermo Fisher Scientific, Rockford, IL, USA) for 1 hr at 60 °C, treated with 200 mM methyl methanethiosulfonate (MMTS; Tokyo Chemical Industry, Tokyo, Japan) for 10 min at 25 °C, diluted 10-fold with 50 mM Tris (pH 8.0), and digested with sequencing-grade trypsin (Promega, Madison, WI, USA) at 37 °C overnight at a protein:trypsin molar ratio of 40:1. Tryptic digests were desalted with a C18 solid-phase extraction cartridge and dried in vacuo. The dried samples were reconstituted in 500 mM triethylammonium bicarbonate (Sigma-Aldrich, St. Louis, MO, USA) and incubated with the appropriate mTRAQ reagents at 25 °C for 1 hr. For the Set1 experiment, Std1 was labeled with mTRAQ® ∆0 (Light), Std2 with mTRAQ® ∆4 (Medium), and Std3 with mTRAQ® ∆8 (Heavy). For the Set2 experiment, Std1 was labeled with the Heavy reagent, Std2 with Medium, and Std3 with Light (Table 1). After the labeling reaction, samples were dried in vacuo, redissolved in 0.1% trifluoroacetic acid, mixed in equal amounts, desalted with a mixed-mode strong cation-exchange (MCX) cartridge, and dried again.
Mass spectrometric analyses of mTRAQ labeled samples
Labeled sample mixtures were reconstituted in 0.4% acetic acid, and an aliquot (~1 μg) was injected onto a reversed-phase Magic C18aq column (15 cm × 75 μm) on an Eksigent multi-dimensional liquid chromatography (MDLC) system at a flow rate of 300 nL/min. The column was equilibrated with 95% buffer A (0.1% formic acid in H2O) + 5% buffer B (0.1% formic acid in acetonitrile) prior to use. The peptides were eluted with a linear gradient of 10 to 40% buffer B over 40 min.
The high performance liquid chromatography (HPLC) system was coupled to a linear trap quadrupole (LTQ) XL-Orbitrap mass spectrometer (Thermo Scientific, San Jose, CA, USA). The spray voltage was set to 1.9 kV, and the temperature of the heated capillary was set to 250 °C. Survey full-scan MS spectra (m/z 300–2,000) were acquired in the Orbitrap with 1 microscan and a resolution of 100,000, with preview mode enabled for precursor selection and charge-state determination. MS/MS spectra of the five most intense ions from the preview survey scan were acquired in the ion trap concurrently with full-scan acquisition in the Orbitrap, with the following options: isolation width, ±10 ppm; normalized collision energy, 35%; dynamic exclusion duration, 30 sec. Precursors with unassigned or single charge states were discarded during data-dependent acquisition. Data were acquired using the Xcalibur software v2.0.7 (Thermo Scientific).
Database searching of MS/MS data for peptide identification
The data files collected on the mass spectrometer (.raw) were converted to MGF format using the Trans-Proteomic Pipeline (TPP, version 4.3 JETSTREAM rev 1, http://www.proteomecenter.org), an open source proteomics analysis tool. The data were then searched using MASCOT (version 2.2.06) against a compound database consisting of the International Protein Index (IPI, European Bioinformatics Institute, http://www.ebi.ac.uk/IPI) bovine database (version 3.42) and the IPI human database (version 3.57), totaling 107,511 protein entries. The search options were trypsin digestion, ±0.5 Da mass tolerance for fragment ions, ±15 ppm mass tolerance for precursor ions, and variable modifications of mTRAQ Light (+140.095 Da), mTRAQ Medium (+144.1021 Da), and mTRAQ Heavy (+148.1092 Da) on the peptide N-terminus and Lys residues. A fixed modification of MMTS (+45.9877 Da) on Cys residues and a variable modification of Met oxidation (+15.9949 Da) were also allowed. TPP was used to validate the database search results. Peptides with a TPP peptide probability greater than 0.9 and a MASCOT E-value less than 0.01 were used for further quantification analysis.
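The modification masses listed above determine how far apart the light, medium, and heavy forms of a peptide appear in a full MS scan. The following is a minimal sketch, assuming complete labeling (one label on the N-terminus and one on every lysine); the function names `label_count` and `mz_shift` are illustrative and are not part of the published software.

```python
# Modification masses (Da) as given in the database search settings above.
MTRAQ_MASS = {"light": 140.095, "medium": 144.1021, "heavy": 148.1092}


def label_count(sequence: str) -> int:
    """Number of mTRAQ labels, assuming complete labeling:
    one on the peptide N-terminus plus one per lysine (K)."""
    return 1 + sequence.upper().count("K")


def mz_shift(sequence: str, charge: int, label: str) -> float:
    """m/z offset of the given labeled form relative to the light form."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    delta = MTRAQ_MASS[label] - MTRAQ_MASS["light"]
    return label_count(sequence) * delta / charge


# A lysine-free peptide at charge 1: medium is about 4 Da above light.
print(round(mz_shift("AAAR", 1, "medium"), 4))
# One lysine, charge 2: two labels, heavy form about 8 m/z above light.
print(round(mz_shift("AKR", 2, "heavy"), 4))
```

The lysine-free case gives the smallest separation between labeled forms, which is the situation in which isotopic clusters are most likely to overlap.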
Overview of the algorithm
Our algorithm is designed to be executed within TPP. For each LC/MS experiment, TPP generates a pepXML file that contains a list of peptides with their sequences, tandem scans, charges, and modifications. Our algorithm calculates the medium to light (M/L) and heavy to light (H/L) ratios of the peptides in pepXML files and produces new pepXML files that can be used for further analysis. For each peptide, our algorithm first determines its elution area. It then identifies the triplex isotopic clusters and calculates M/L and H/L ratios for each MS scan contained in the elution area. Finally, the per-scan M/L ratios and the per-scan H/L ratios are each integrated into a single ratio by linear regression.
Model of overlapping isotopic clusters
In this model, $n$ is the number of peaks in the isotopic distribution of a peptide, and $L_k$, $M_k$, and $H_k$ are the intensities of the $k$th peaks of the isotopic distributions of the light-, medium-, and heavy-labeled peptides, respectively.
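As a hedged illustration of what such an overlap model can look like, consider the lysine-free case described in the introduction, in which the three labeled forms are offset from one another by four isotopic peak positions. Writing $I_k$ for the observed intensity at peak position $k$ (a symbol introduced here, not taken from the original text), the observed spectrum is the sum of the three shifted clusters. This is a sketch under that assumption and may not match the authors' exact formulation.

```latex
I_k = L_k + M_{k-4} + H_{k-8}, \qquad 1 \le k \le n + 8,
\qquad \text{with } L_j = M_j = H_j = 0 \text{ for } j < 1 \text{ or } j > n .
```

Under this reading, only the first four observed peaks are pure light-label signal, and every later position mixes two or three clusters, which is why the intensities have to be separated before ratios are computed.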
In the error value, $T_k$ is the intensity of the $k$th peak of the theoretical isotopic distribution of the peptide. (The EMASS algorithm was used to calculate the $T_k$ values.) The error value should be very small for the correct ratio pair because $L_{k+4}/L_k$, $M_{k+4}/M_k$, and $H_{k+4}/H_k$ are theoretically the same as $T_{k+4}/T_k$. Therefore, we calculate the error value for each candidate pair and select the pair with the lowest error value. After the pairs for all $1 \le k \le 4$ are selected, we can calculate the M/L ratio and the H/L ratio.
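The selection step can be sketched as follows. The exact error function is not recoverable from the text, so this example assumes a simple sum of squared deviations between each cluster's $(k+4)$-to-$k$ intensity ratio and the theoretical ratio $T_{k+4}/T_k$. The data layout and the names `ratio_error` and `select_candidate` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# For one candidate solution: label -> (intensity at peak k, intensity at peak k+4)
Candidate = Dict[str, Tuple[float, float]]


def ratio_error(candidate: Candidate, t_k: float, t_k4: float) -> float:
    """Assumed error: squared deviation of each cluster's (k+4)/k ratio
    from the theoretical ratio T_{k+4}/T_k, summed over the clusters."""
    expected = t_k4 / t_k
    error = 0.0
    for at_k, at_k4 in candidate.values():
        if at_k <= 0.0:
            return math.inf  # a non-positive intensity cannot be a valid split
        error += (at_k4 / at_k - expected) ** 2
    return error


def select_candidate(candidates: List[Candidate], t_k: float, t_k4: float) -> Candidate:
    """Return the candidate whose clusters best match the theoretical ratio."""
    return min(candidates, key=lambda c: ratio_error(c, t_k, t_k4))


# Hypothetical candidates, e.g. from different roots of the cubic equation.
good = {"L": (100.0, 5.0), "M": (50.0, 2.5), "H": (20.0, 1.0)}
bad = {"L": (100.0, 20.0), "M": (50.0, 1.0), "H": (20.0, 4.0)}
best = select_candidate([bad, good], t_k=1.0, t_k4=0.05)
print(best is good)
```

Comparing ratios rather than absolute intensities keeps the check independent of the abundance of each labeled form.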
Determination of the elution areas of peptides
In most LC/MS experiments, tandem MS scans are acquired using dynamic exclusion (DE). For each MS/MS scan, therefore, we know only one MS scan in which the identified peptide is eluted. Because the peptide elutes over a period of time, we need to determine its elution area. However, some peptides have similar masses and elution times, so their elution areas can overlap. A naive approach, such as using a fixed range (e.g. within ±30 s of the tandem scan of the peptide), risks including MS scans in which other peptides overlap. Therefore, determining accurate elution areas is very important for accurate relative quantification.
We assume that the distribution of peptide elution time can be approximated by a normal distribution. Because of noise and overlap between peptides, MS scans with low intensities at both ends of the elution area may not be trustworthy. If we use only MS scans with high total ion current when modeling the elution profile as a normal distribution, the mean $\mu$ can be approximated, but the variance $\sigma^2$ cannot. Instead, we use the full width at half maximum (FWHM) to derive $\sigma^2$. From the probability density function of the normal distribution, we obtain $\sigma^2 = \mathrm{FWHM}^2 / (8 \ln 2)$.
Our algorithm calculates M/L and H/L ratios for all MS scans in the elution area. The set of M/L ratios and the set of H/L ratios are then each integrated by linear regression of the form "y = cx". The peak intensities are split into the intensities of the light-, medium-, and heavy-labeled peptides. We estimate c using the intensities of the light-labeled peptide as the $x_i$ values, and the intensities of the medium- or heavy-labeled peptide as the $y_i$ values for the M/L or H/L ratio, respectively.
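The relation between FWHM and variance follows from requiring the normal density at $\mu \pm \mathrm{FWHM}/2$ to equal half its peak value, that is $\exp(-\mathrm{FWHM}^2 / (8\sigma^2)) = 1/2$. The sketch below computes $\sigma$ from a measured FWHM and derives a candidate elution window. The window half-width `n_sigma` is a hypothetical parameter chosen for illustration; the text does not state how many standard deviations the algorithm uses.

```python
import math
from typing import Tuple


def sigma_from_fwhm(fwhm: float) -> float:
    """Standard deviation of a normal profile with the given FWHM,
    from sigma^2 = FWHM^2 / (8 ln 2)."""
    if fwhm <= 0:
        raise ValueError("FWHM must be positive")
    return fwhm / math.sqrt(8.0 * math.log(2.0))


def elution_window(mu: float, fwhm: float, n_sigma: float = 2.0) -> Tuple[float, float]:
    """Illustrative elution window mu +/- n_sigma * sigma.
    n_sigma is an assumption, not a value taken from the paper."""
    sigma = sigma_from_fwhm(fwhm)
    return mu - n_sigma * sigma, mu + n_sigma * sigma


# Example: apex at 1200 s with a 12 s wide (FWHM) elution peak.
sigma = sigma_from_fwhm(12.0)
start, end = elution_window(1200.0, 12.0)
print(round(sigma, 3), round(start, 1), round(end, 1))
```

Because the FWHM depends only on the scans near the apex, this estimate avoids relying on the low-intensity scans at the edges of the profile.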
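For a line constrained to pass through the origin, the ordinary least-squares estimate of the slope has a closed form, $c = \sum x_i y_i / \sum x_i^2$. The sketch below applies it to per-scan intensities. It assumes ordinary least squares, since the text does not state the fitting criterion, and the intensity values are made up for illustration.

```python
from typing import Sequence


def slope_through_origin(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope c of y = c * x (no intercept term)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0.0:
        raise ValueError("x must contain at least one non-zero intensity")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical per-scan intensities across one elution area.
light = [10.0, 20.0, 30.0]
medium = [20.0, 40.0, 60.0]
heavy = [5.0, 10.0, 15.0]
print(slope_through_origin(light, medium))  # M/L estimate
print(slope_through_origin(light, heavy))   # H/L estimate
```

Compared with averaging the per-scan ratios, this form gives more weight to scans with higher intensity, where the ratio is less affected by noise.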
Identification and validation of triplex isotopic clusters
After identifying the triplex isotopic clusters of a target peptide, we check them and discard the current MS scan if they are doubtful according to the following criteria. First, we check whether the overall shape of each isotopic cluster resembles that of the theoretical isotopic distribution. At a minimum, the LSQ of the most abundant isotopic cluster must be below a threshold (e.g. 0.2). The LSQ of each of the other clusters should also be below the threshold, unless the cluster's summed intensity is lower than half that of the most abundant isotopic cluster. (If an isotopic cluster has low abundance, its shape could be abnormal because it may be affected by chemical noise and other peptides.) Second, we check whether the identified isotopic cluster overlaps with that of another peptide. Four types of overlap are shown in Figure 3. There is no problem if no isotopic peak is shared by the two isotopic clusters (Figure 3a). If the overlapping isotopic cluster has a different charge, the LSQ of the identified isotopic cluster should be markedly high, so we can discard the current MS scan (Figure 3b). If the overlapping isotopic cluster has the same charge and a higher mass, the shared isotopic peaks are not assigned to the isotopic cluster of the target peptide, because doing so would increase its LSQ (Figure 3c). Only the case in which the overlapping isotopic cluster has the same charge and a lower mass needs additional filtering (Figure 3d). We can easily detect these overlaps by considering the preceding peaks, but we cannot separate the overlapping isotopic clusters in this case because they look like a single isotopic cluster. Therefore, we discard the current MS scan if at least one isotopic cluster of the target peptide could be part of an isotopic cluster with the same charge and a lower mass.
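The first criterion can be sketched as a small decision function. The LSQ values are taken as precomputed inputs, since how they are calculated is not shown in this section. The function name, the dictionary layout, and the example numbers are hypothetical; the threshold of 0.2 is the example value given in the text.

```python
from typing import Dict


def clusters_pass_shape_check(
    lsq: Dict[str, float],
    total_intensity: Dict[str, float],
    threshold: float = 0.2,
) -> bool:
    """Return True if the scan passes the shape criterion described above.

    lsq             -- precomputed LSQ per label ("L", "M", "H")
    total_intensity -- summed peak intensity per label
    """
    top = max(total_intensity, key=total_intensity.get)
    # The most abundant cluster must always match the theoretical shape.
    if lsq[top] >= threshold:
        return False
    for label, value in lsq.items():
        if label == top:
            continue
        # Low-abundance clusters are exempt: below half of the top cluster.
        exempt = total_intensity[label] < 0.5 * total_intensity[top]
        if not exempt and value >= threshold:
            return False
    return True


# Hypothetical scan: heavy cluster is weak and distorted, so it is exempt.
print(clusters_pass_shape_check(
    {"L": 0.05, "M": 0.10, "H": 0.45},
    {"L": 1000.0, "M": 800.0, "H": 300.0},
))
```

The overlap checks of Figure 3 would be applied as a second, separate filter and are not covered by this sketch.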
Application to 7-standard protein data mixed with known ratios
[Table: Expected ratios and computed ratios for seven proteins in standard mixtures, with panels (a) Set1 experiment and (b) Set2 experiment, each reporting the number of MS/MS spectra. Table values are not available here.]
Separation of overlapping triplex isotopic clusters
Cause of low abundance of heavy-labeled peptides
Std1 and Std3 were labeled with the light and heavy mTRAQ labels, respectively, in the Set1 experiment, and vice versa in the Set2 experiment. The calculated H/L ratios were lower than the expected values in both cases, which excludes the possibility that some of the standard mixtures were under-digested compared with the others. If under-digestion were the cause, we would expect the H/L ratios to be reversed between the two experimental sets. This becomes even more evident from the MS/MS search results, in which only one of 168 validated peptides was identified as partially labeled.
[Table: Comparison between approximated H/L ratios and computed H/L ratios.]
The low abundance can also be explained, at least in part, by isotopic impurity of the heavy label. On closer inspection of the MS spectra of the identified peptides, a peak 1 Da below the monoisotopic peak of the heavy-labeled peptide was frequently found (Supplementary Figure S2 in Additional file 1). It has been reported that iTRAQ reagents contain trace levels of isotopic impurities. Since mTRAQ shares the same chemical structure as iTRAQ, we expect the same problem to occur in mTRAQ data analysis.
In real experiments where a complex proteome must be quantified, one can add a known standard at a ratio of 1:1:1 and use the calculated ratio of the standard as a correction factor. For example, if the calculated ratio of LALBA in the current study is used as a correction factor, the ratios of the other proteins become closer to the expected ratios.
We have developed a new data analysis algorithm for peptide quantification in triplex mTRAQ experiments. It can calculate peptide ratios accurately by separating overlapping triplex isotopic clusters based on arithmetic models of isotope overlap and by automatically determining the elution area of each peptide. When used within TPP, it can easily analyze high-throughput proteomics data.
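The correction can be sketched as a simple normalization. This example assumes the correction is applied by dividing every computed ratio by the ratio computed for the 1:1:1 standard, which is one natural reading of the text. The protein name `ProteinX` and all numeric values are hypothetical and are not results from the study.

```python
from typing import Dict


def correct_ratios(computed: Dict[str, float], standard: str) -> Dict[str, float]:
    """Divide every computed ratio by the computed ratio of the spiked-in
    standard, whose expected ratio is 1."""
    factor = computed[standard]
    if factor <= 0.0:
        raise ValueError("the standard's computed ratio must be positive")
    return {name: ratio / factor for name, ratio in computed.items()}


# Hypothetical H/L ratios: the standard reads 0.8 instead of the expected 1.0.
computed_hl = {"LALBA": 0.8, "ProteinX": 1.6}
corrected = correct_ratios(computed_hl, standard="LALBA")
print(corrected)
```

The same function would be applied separately to the M/L ratios, using the standard's M/L ratio as the factor.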
This work was supported by 21C Frontier Functional Proteomics Project from Korean Ministry of Education, Science & Technology (FPR08-A1-020, FPR08-A1-021, FPR08-A1-090).
This article has been published as part of BMC Bioinformatics Volume 12 Supplement 1, 2011: Selected articles from the Ninth Asia Pacific Bioinformatics Conference (APBC 2011). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/12?issue=S1.
- Aebersold R, Mann M: Mass spectrometry-based proteomics. Nature 2003, 422(6928):198–207. doi:10.1038/nature01511
- MacCoss MJ, Matthews DE: Quantitative MS for proteomics: teaching a new dog old tricks. Anal Chem 2005, 77(15):294A-302A. doi:10.1021/ac053431e
- Mueller LN, Brusniak MY, Mani DR, Aebersold R: An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. Journal of Proteome Research 2008, 7(1):51–61. doi:10.1021/pr700758r
- Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R: Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nature Biotechnology 1999, 17(10):994–999. doi:10.1038/13690
- Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M: Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics 2002, 1(5):376–386. doi:10.1074/mcp.M200025-MCP200
- Yao XD, Freas A, Ramirez J, Demirev PA, Fenselau C: Proteolytic O-18 labeling for comparative proteomics: Model studies with two serotypes of adenovirus. Analytical Chemistry 2001, 73(13):2836–2842. doi:10.1021/ac001404c
- Aggarwal K, Choe LH, Lee KH: Shotgun proteomics using the iTRAQ isobaric tags. Brief Funct Genomic Proteomic 2006, 5(2):112–120. doi:10.1093/bfgp/ell018
- Kang UB, Yeom J, Kim H, Lee C: Quantitative Analysis of mTRAQ-Labeled Proteome Using Full MS Scans. Journal of Proteome Research 2010, 9(7):3750–3758. doi:10.1021/pr9011014
- Han DK, Eng J, Zhou HL, Aebersold R: Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nature Biotechnology 2001, 19(10):946–951. doi:10.1038/nbt1001-946
- Li XJ, Zhang H, Ranish JA, Aebersold R: Automated statistical analysis of protein abundance ratios from data generated by stable-isotope dilution and tandem mass spectrometry. Analytical Chemistry 2003, 75(23):6648–6657. doi:10.1021/ac034633i
- Shinkawa T, Taoka M, Yamauchi Y, Ichimura T, Kaji H, Takahashi N, Isobe T: STEM: A software tool for large-scale proteomic data analyses. Journal of Proteome Research 2005, 4(5):1826–1831. doi:10.1021/pr050167x
- Halligan BD, Slyper RY, Twigger SN, Hicks W, Olivier M, Greene AS: ZoomQuant: An application for the quantitation of stable isotope labeled peptides. Journal of the American Society for Mass Spectrometry 2005, 16(3):302–306. doi:10.1016/j.jasms.2004.11.014
- Bellew M, Coram M, Fitzgibbon M, Igra M, Randolph T, Wang P, May D, Eng J, Fang RH, Lin CW, et al.: A suite of algorithms for the comprehensive analysis of complex protein mixtures using high-resolution LC-MS. Bioinformatics 2006, 22(15):1902–1909. doi:10.1093/bioinformatics/btl276
- Lin WT, Hung WN, Yian YH, Wu KP, Han CL, Chen YR, Chen YJ, Sung TY, Hsu WL: Multi-Q: a fully automated tool for multiplexed protein quantitation. J Proteome Res 2006, 5(9):2328–2338. doi:10.1021/pr060132c
- Faca V, Coram M, Phanstiel D, Glukhova V, Zhang Q, Fitzgibbon M, McIntosh M, Hanash S: Quantitative analysis of acrylamide labeled serum proteins by LC-MS/MS. Journal of Proteome Research 2006, 5(8):2009–2018. doi:10.1021/pr060102+
- Monroe ME, Tolic N, Jaitly N, Shaw JL, Adkins JN, Smith RD: VIPER: an advanced software package to support high-throughput LC-MS peptide identification. Bioinformatics 2007, 23(15):2021–2023. doi:10.1093/bioinformatics/btm281
- Cox J, Mann M: MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology 2008, 26(12):1367–1372. doi:10.1038/nbt.1511
- Park SK, Venable JD, Xu T, Yates JR: A quantitative analysis software tool for mass spectrometry-based proteomics. Nature Methods 2008, 5(4):319–322.
- Dasari S, Wilmarth PA, Reddy AP, Robertson LJG, Nagalla SR, David LL: Quantification of Isotopically Overlapping Deamidated and O-18-Labeled Peptides Using Isotopic Envelope Mixture Modeling. Journal of Proteome Research 2009, 8(3):1263–1270. doi:10.1021/pr801054w
- DeSouza LV, Taylor AM, Li W, Minkoff MS, Romaschin AD, Colgan TJ, Siu KWM: Multiple reaction monitoring of mTRAQ-labeled peptides enables absolute quantification of endogenous levels of a potential cancer marker in cancerous and normal endometrial tissues. Journal of Proteome Research 2008, 7(8):3525–3534. doi:10.1021/pr800312m
- Yoon JY, Lim KY, Lee S, Park K, Paek E, Kang UB, Yeom J, Lee C: Improved Quantitative Analysis of Mass Spectrometry using Quadratic Equations. Journal of Proteome Research 2010, 9(5):2775–2785. doi:10.1021/pr100183t
- Perkins DN, Pappin DJC, Creasy DM, Cottrell JS: Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20(18):3551–3567. doi:10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2
- Senko MW, Beu SC, Mclafferty FW: Determination of Monoisotopic Masses and Ion Populations for Large Biomolecules from Resolved Isotopic Distributions. J Am Soc Mass Spectr 1995, 6(4):229–233. doi:10.1016/1044-0305(95)00017-8
- Rockwood AL, Haimi P: Efficient calculation of accurate masses of isotopic peaks. Journal of the American Society for Mass Spectrometry 2006, 17(3):415–419. doi:10.1016/j.jasms.2005.12.001
- Ow SY, Salim M, Noirel J, Evans C, Rehman I, Wright PC: iTRAQ Underestimation in Simple and Complex Mixtures: "The Good, the Bad and the Ugly". Journal of Proteome Research 2009, 8(11):5347–5355. doi:10.1021/pr900634c
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.