Isotope pattern deconvolution for peptide mass spectrometry by nonnegative least squares/least absolute deviation template matching
 Martin Slawski^{1}Email author,
 Rene Hussong^{2, 3},
 Andreas Tholey^{4},
 Thomas Jakoby^{4},
 Barbara Gregorius^{2, 4},
 Andreas Hildebrandt^{2, 5} and
 Matthias Hein^{1}
DOI: 10.1186/1471210513291
© Slawski et al.; licensee BioMed Central Ltd. 2012
Received: 6 March 2012
Accepted: 27 October 2012
Published: 8 November 2012
Abstract
Background
The robust identification of isotope patterns originating from peptides being analyzed through mass spectrometry (MS) is often significantly hampered by noise artifacts and the interference of overlapping patterns arising e.g. from posttranslational modifications. As the classification of the recorded data points into either ‘noise’ or ‘signal’ lies at the very root of essentially every proteomic application, the quality of the automated processing of mass spectra can significantly influence the way the data might be interpreted within a given biological context.
Results
We propose nonnegative least squares/nonnegative least absolute deviation regression to fit a raw spectrum by templates imitating isotope patterns. In a carefully designed validation scheme, we show that the method exhibits excellent performance in pattern picking. It is demonstrated that the method is able to disentangle complicated overlaps of patterns.
Conclusions
We find that regularization is not necessary to prevent overfitting and that thresholding is an effective and userfriendly way to perform feature selection. The proposed method avoids problems inherent in regularizationbased approaches, comes with a set of wellinterpretable parameters whose default configuration is shown to generalize well without the need for finetuning, and is applicable to spectra of different platforms. The R package IPPD implements the method and is available from the Bioconductor platform (http://bioconductor.fhcrc.org/help/biocviews/devel/bioc/html/IPPD.html).
Background
Mass spectrometry (MS), often in conjunction with high performance liquid chromatography (HPLC), is the defacto standard analytical tool to derive important biological knowledge about the protein content of whole cells, organelles, or biomedical samples like tumour or blood plasma. Within a typical experimental setup, purified proteins of the sample under study are digested by an enzyme. Before entering the mass spectrometer, peptides are separated chromatographically according to their physicochemical properties in order to avoid a massive overlapping of peptide signals within a single scan. Nevertheless, due to the sheer number of peptides present in a sample, interfering patterns still occur frequently, not least because of posttranslational modifications such as the deamidation of asparagines or glutamine residues. In order to obtain an unambiguous assignment of the signals, and in particular their isotope patterns, which is a prerequisite for a proper identification and quantification, every data point in m/zdimension is classified either as ‘signal’ or as ‘noise’ during the socalled feature detection phase. As this processing lies at the very root of every proteomic application, the quality of feature detection can have dramatic impact on the finally derived results and conclusions. In view of the large amount of data even a single MS experiment can produce, automated analysis is indispensable. However, due to various artifacts arising from electric and chemical noise and baseline trends, the identification of isotope patterns is errorprone and time consuming. In addition, severe overlaps of peptide signals within the same mass spectrometric scan can hamper a straightforward analysis furthermore. In recent years, numerous procedures have been developed to process this data (cf., e.g., [1–8]). Within this paper, we propose a novel method that is demonstrated to perform especially well in challenging situations, characterized e.g. by strong local variations in noise and intensity levels or the presence of isotope patterns of different charges exhibiting overlap, which in many cases may be difficult to resolve even for a human expert by visual inspection. Existing software typically depends on a large set of parameters requiring careful finetuning, often being rather sensitive to changes in the measurement process like the change of the platform, which makes a proper parameter choice a laboursome task. In contrast, the proposed method has been designed to depend on a comparatively small set of wellinterpretable parameters whose default configuration is shown to be robust, yielding mostly excellent, but at least competitive performance on spectra of different platforms. In a nutshell, our method uses nonnegative least squares or nonnegative least absolute deviation regression to fit a spectrum s by a large dictionary of templates mimicking isotope patterns; since true positions and charges of isotope patterns in the spectrum are unknown in advance, regions where the signal exceeds a local measure of noise are identified and then a vast set of templates is placed in those regions. In the spirit of sparse recovery, a small subset of the templates, which reasonably explains the observed signal, is selected by applying hard thresholding with a locally adaptive choice of the threshold to the regression coefficients obtained previously. Our method is related to a formerly proposed templatebased approach (NITPICK, [3]). As opposed to the present work, NITPICK uses ℓ_{1}regularized nonnegative least squares. Without nonnegativity constraints, this procedure is known as the lasso [9]. Reference [10] contains the first application of the lasso to the problem studied in the present paper. Given a dramatic increase in occurrence of highdimensional datasets in recent years and the resulting need for feature selection, the lasso, due to computationally and theoretically appealing properties, has meanwhile become so popular that it can be regarded as a standard tool of modern data analysis [11]. In this respect, NITPICK follows the usual paradigm suggesting that ℓ_{1}regularization is the method of choice. In the present paper, we argue for a deviation from that paradigm mainly in view of the following two aspects. First, a major benefit of our fitting+thresholding approach is that parameter choice is more userfriendly, since the threshold can be interpreted in terms of a signaltonoise ratio. This is unlike the regularization parameter of the lasso, which can in general not be related directly to the signal. In the presence of heterogeneous noise and model misspecifications, the ‘right amount’ of regularization is notoriously difficult to choose. Second, there is a substantial body of work showing that nonnegativity constraints alone may suffice to recover a sparse target. Nonnegative least squares + thresholding is analyzed in [12], where it is shown that it can significantly outperform the usual ℓ_{1}approach with respect to sparse recovery. See Section “Sparse recovery with nonnegativity constraints: nonnegative least squares + thresholding vs. the nonnegative lasso” for a detailed discussion.
Methods
A spectrum is understood as a sequence of pairs ${\left\{\right({x}_{i},{y}_{i}\left)\right\}}_{i=1}^{n}$, where x_{ i }=m_{ i }/z_{ i } is a mass (m_{ i }, measured in Dalton Da) to charge (z_{ i }), and y_{ i }is the intensity, i.e. the abundance of a particular mass (modulo charge state), observed at x_{ i }, i=1,…,n, which are assumed to be ordered increasingly.
Template model
Parameter estimation
The parameters θ_{c,j}=(α_{c,j}σ_{c,j}μ_{c,j})^{⊤}of the peaks (3) are unknown in practice. Following a central paradigm of our framework, which is to relieve the user of performing laboursome finetuning of parameters, we have developed a systematic procedure automatically providing estimates of these parameters, which is considerably more efficient and flexible than a grid search. For instance, the parameters may additionally depend on the m/zposition. Our framework for parameter estimation extends a conceptually similar approach in [18] designed for a Gaussian peak shape.
We refrain from using least squares regression to determine the parameters in (5) due to its sensitivity to possible outliers, which arise from poorly resolved, wiggly or overlapping isotope patterns, which may affect the quality of the estimates ${\hat{\ud703}}_{r}$. Therefore, the linear model is fitted in a robust way by using least absolute deviation regression. Given the resulting estimates of the parameters {ν_{l,m}}, m/zspecific estimates for the parameters in (3) are obtained by evaluating (5).
Template fitting
The optimization problem (7) is a quadratic (q=2) or linear (q=1) program and is solved using interior point methods (e.g. [20]). The details are relegated to Appendix “Fitting with nonnegativity constraints” section. As far as the choice of q is concerned, we point out that q=1 yields a robust fit that can deal better with deviations from model assumptions, i.e. deviations from the averagine model or from the peak model. However, in general, we are unable to provide any recommendation about how to choose q. Therefore, in our validation, both are evaluated.
Comparison with pepex
In prior work [21], subsequently referred to as ‘pepex’, nonnegative least squares fitting is used as well. An important difference to our approach is that the matrix Φ is not constructed from the convolution of isotope distributions and peak shapes as described in Section “Template model”. Instead, peak detection is applied first to reduce the raw intensity data to peak clusters, a step that is usually referred to as centroiding. At the second stage, called deisotoping, peak clusters are fitted by a design matrix containing isotope distributions themselves, not convolved versions. While the approach is computationally more attractive and avoids estimation of peak shape parameters (cf. Section “Parameter estimation”), the division into centroiding and deisotoping may lead to poor performance for low resolution and noisy data, or in the presence of overlapping patterns. In these cases, peak detection is little reliable. In our templatebased approach, there is no separation of centroiding and deisotoping. It performs much better in the aforementioned cases, since it operates directly on the data and is hence less affected if single peaks of a pattern are difficult to detect. This reasoning is supported by our evaluation in Section “Results and discussion” as well as that in [3]. At the same time, our approach can in principle be applied to centroided spectra as well. In this case, the columns of the matrix Φ directly represent isotope distributions instead of isotopic patterns.
Postprocessing and thresholding
 1.
Separately for each c, divide the sets ${\hat{\mathcal{\mathcal{M}}}}_{c}$ into groups ${\mathcal{G}}_{c,1},\dots ,{\mathcal{G}}_{c,{G}_{c}}$ of ‘adjacent’ positions. Positions are said to be adjacent if their distance on the m/z scale is below a certain tolerance ppm specified in partspermillion, cf. Section “Finding a set of default parameters”. In the context of ‘peak splitting’, the templates at locations sharing the same group are assumed to fit precisely one true underlying peak.
 2.With the notation of Eq. (2), for each c=1,…,C, and for g=1,…,G _{ c }, we solve the following optimization problem.$\begin{array}{c}({\stackrel{~}{m}}_{c,g},{\stackrel{~}{\beta}}_{c,g})=\underset{{\beta}_{c,g}}{\underset{{m}_{\text{c,g}}}{\text{argmin}}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\underset{\infty}{\overset{\infty}{\int}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}{\left(\sum _{{m}_{c,j}\in {\mathcal{G}}_{c,g}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}{\hat{\beta}}_{c,j}{\psi}_{{m}_{c,j}}\left(x\right){\beta}_{c,g}{\psi}_{{m}_{c,g}}\left(x\right)\right)}^{2}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\mathit{dx},\end{array}$(8)
 3.
One ends up with sets ${\stackrel{~}{\mathcal{\mathcal{M}}}}_{c}={\left\{{\stackrel{~}{m}}_{c,g}\right\}}_{g=1}^{{G}_{c}}$ and coefficients ${\left\{{\stackrel{~}{\beta}}_{c,g}\right\}}_{g=1}^{{G}_{c}},\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{2.77695pt}{0ex}}c=1,\dots ,C$.
The additional benefit of solving (8) in step two as compared to the selection of the template with the largest coefficient within each group as proposed in [3] is that we are able to determine the location of the pattern even more accurately as predetermined by a limited sampling rate, since in (8) we optimize the location over a continuum. The optimization problem (8) can be solved fast and accurately by sampling the integrand on a fine grid of points and then solving a nonlinear least squares problem with optimization variables m_{c,g} and β_{c,g}.
The idea underlying this procedure is that in noise regions, the fit to the data will be poor, and consequently, the size of the residuals is expected to be large relative to the signal, hence leading to a low goodnessoffit statistic. The truncation at 0.5 limits the influence of this correction. A final list is generated by checking whether the signaltonoise ratios (9) exceed a ‘significance threshold’ t specified by the user. We do not give a general guideline for choosing t, because a reasonable choice is very specific to experimental conditions, e.g. the platform used and the composition of the spectrum. It is important to note that while t itself is constant, we take into account that the noise level is heterogeneous, since thresholding is based on the ratios (9), where the local noise level enters.
Finding a set of default parameters
Apart from the signaltonoise threshold t, we have introduced the parameters window, i.e. the width h of the sliding window required for the computation of the local noise level (6), the template placement parameter factor.place and the partspermillion tolerance ppm within which peaks are considered to be merged by the postprocessing procedure. With the exception of the threshold t, we have fixed all parameters to a default setting which we expect to give reasonable (albeit potentially suboptimal) results on spectra different from the ones analyzed here, without the need of manual tuning. In order to find such a default setting, we performed a grid search using only one selected spectrum of those described in Section “Datasets” below. While our default setting, which can be found in the HTML manual of the R package IPPD, already performs well, we recommend to do such a calibration to optimize the performance of our method.
Sparse recovery with nonnegativity constraints: nonnegative least squares + thresholding vs. the nonnegative lasso
We believe that our preference for the first alternative is a major methodological contribution that has potential to impact related problems where nonnegativity problems come into play. In the present section, we provide, at a high level, a series of arguments rooting in the statistics and signal processing literature that clarify our contribution and support our preference.
Linear models and usual paradigms in statistics
which corresponds to model (1), where ‘≈’ is used instead of ‘=’ to account for stochastic noise or model misspecifications. Linear models of the form (10) have been and continue to be objects of central interest in statistical modelling.

Classical work in statistics shows that under mild conditions if the number of sample n grows at a faster rate than the number of features p, the ordinary least squares estimator ${\hat{\beta}}^{\text{ols}}\to {\beta}^{\ast}$ (in probability) as n→∞.

Since many contemporary datasets, like the MS datasets of the present paper, are characterized by a large p, which is of the same order as n or even larger, the first bullet has considerably lost relevance. Translated to MS datasets, it provides a statement about the case where the resolution tends to infinity. Therefore, modern statistical theory studies regimes in which p is allowed to grow at a faster rate than n, with a focus on results that hold for finite sample sizes. These results hinge on some sort of sparsity assumption on β^{∗}, the simplest being that β^{∗}is zero except for some index set (support) of small cardinality. In this context, a multitude of results has been proved (see e.g. [23] for an overview) indicating that the lasso estimate ${\hat{\beta}}^{\text{lasso}}$ is a statistically optimal procedure in the sense that if the regularization parameter is chosen in the right way, the squared Euclidean distance $\parallel {\hat{\beta}}^{\text{lasso}}{\beta}^{\ast}{\parallel}_{2}^{2}$ is nearly of the same order as that of an estimator one could construct if the nonzeroes of β^{∗}were known.
The second bullet provides quite some justification for NITPICK, which is based on the lasso. However, as detailed below, the italicized part can be critical. On the other hand, there are several results that support our approach.
The power of nonnegativity constraints

It turns out that the nonnegativity constraint β≥0imposed in nonnegative least squares (NNLS) may lead to a drastically better performance than that of the ordinary least squares estimator in ‘large p’ situations provided Φsatisfies additional conditions. Roughly speaking, it is shown in [12] that if Φhas nonnegative entries, which is fulfilled for the template matching problem of Section “Template model”, the NNLS estimator $\hat{\beta}$ does not overfit and is unique even in the singular case (p>n). These results indicate that NNLS may behave surprisingly well in a highdimensional setup, without using ℓ_{1}regularization, which is often propagated in the literature as basically the only option ([24], Section 16.2.2).

There are several recent papers [25–27] in the sparse recovery literature in which it is shown that a sparse, nonnegative vector can be recovered from few linear measurements n≪p. In [12], these results are extended to a noisy setup. More specifically, it is shown that NNLS + thresholding can consistently recover β^{∗}and its support. Very recently, using similar conditions as in [12], Meinshausen [28] has established several guarantees of NNLS in a highdimensional setup.
One should bear in mind that the nonnegativity constraints are essential for our approach. Thresholding the unconstrained ordinary least squares estimator ${\hat{\beta}}^{\text{ols}}$ in general leads to poor results in the ‘large p’ situation.
Shortcomings of ℓ_{1}regularization in theory
In [12], it is not only shown that NNLS + thresholding is a sound strategy to perform sparse recovery of a nonnegative target, but also examples are given where the nonnegative lasso is outperformed even if its regularization parameter is set to match theoretical results and regardless of whether subsequent thresholding as advocated in [29, 30] is used or not. In particular, inferiority of the lasso arises in the presence of small, yet significantly nonzero entries in β^{∗}. These are specifically affected by the nonnegligible bias of ℓ_{1}regularization [31]. It is important to note that the comparison in [12] does not contradict prior comparisons of the lasso (aka soft thresholding) and (hard) thresholding for orthonormal designs (Φ^{⊤}Φ=I) in [32, 33], where both approaches perform similarly well and nonnegativity constraints are not particularly important. Orthonormal designs, which lead to greatly simplified estimation problem are not of interest in the context of the paper, since the template matrix Φis far from being orthonormal.
Shortcomings of ℓ_{1}regularization in practice
The study in [12] is of more theoretical nature, since all constants of the problem, in particular the noise level, are known, so that the regularization parameter can be set in an optimal fashion. This can realistically not be accomplished in practice. Likewise, the informationtheoretic criterion employed in [3] as well as the datasplitting approach of [34] rely on knowledge of the noise level, or a consistent estimate thereof, which is hard to obtain in the ‘large p’ situation [35]. In any case, the regularization parameter remains a quantity that is hard to grasp and hence hard to set for a practitioner, since it cannot be related directly to the signal. In contrast, the threshold t admits a straightforward interpretation.
Moreover, when using ℓ_{1}regularization, data fitting and model selection are coupled. While this is often regarded as advantage, since model selection is performed automatically, we think that it is preferable to have a clear separation between data fitting and model selection, which is a feature of our approach. Prior to thresholding, the output of our fitting approach gives rise to a ranking which we obtain without the necessity to specify any parameter. Selection is completely based on a single fit simply by letting the the threshold vary. On the contrary, if one wants to reduce the number of features selected by the lasso, one resets the regularization parameter and solves a new optimization problem. Note that it is in general not possible to compute the entire solution path of the lasso [22] for the MS datasets used for the present paper, where the dimension of Φ is in the ten thousands so that the active set algorithm of [22] is prohibitively slow. In this regard, model selection by thresholding is computationally more attractive.
Results and discussion
For the assessment of the pattern picking performance, in total eight spectra generated by two different ionization methods, matrix assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI), respectively, form the basis of the evaluation. While MALDI has been coupled to a timeofflight (TOF) mass analyzer, ESI MS spectra have been recorded on both a linear ion trap (LTQ) and an Orbitrap mass analyzer. In addition, a series of spectra were prepared with the aim of investigating in detail the method’s performance in the presence of overlapping peptides.
Datasets
For MALDI mass spectra (Additional file 1), time of flight mass analysis was performed; spectra were recorded on an ABI MALDITOF/TOF 4800 instrument in positive ion mode using αcyano4hydroxycinnamic acid (CHCA) as matrix. Nanospray ESI spectra (Additional file 2) were measured in positive ion mode on a Thermo LTQ Orbitrap Velos MS; both high resolution measurements using the Orbitrap mass analyzer (referred to as ‘Orbitrap’) and, alternatively, low resolution linear ion trap (IT) measurements were performed with this setup. This experiment has been chosen in order to demonstrate the utility of our method at different concentration levels, that it is robust with respect to changes in the datagenerating process and that the method is capable of handling singly charged ions, the main form generated by MALDI MS, as well as higher charged ions formed in ESI MS. Tryptic digests (performed in 40 mM ammonium bicarbonate) of model proteins were used as analytes: bovine myoglobin and chicken egg lysozyme (10 and 500 fmol each) for MALDITOF experiments, and lysozyme (250 and 1000 fmol) for ESI experiments. Disulfide bonds were reduced with dithiothreitol (DTT) prior to alkylation, free cysteine residues were alkylated by iodacetamide. No further sample pretreatment was performed prior to MS analysis. When referring to these spectra, we omit that tryptic digests are given: e.g., the term ‘MALDITOF myoglobin spectrum (500 fmol)’ means the respective tryptic digest.
To demonstrate explicitly the method’s ability to separate strongly overlapping patterns even in the case of badly resolved signals, 22 additional spectra have been generated in positive ion mode on a Bruker Daltonics HCT Ultra Ion Trap MS with an electrospray ion source. Three synthetic peptides (cf. Section “Unmixing of overlaps” for details) with sequences corresponding to tryptic peptides from bovine serum albumin (BSA) were used as analytes. In each measurement two out of three peptides were mixed in different ratios to get overlapping peptide signals, also with different charge states. Two different concentrations (500 fmol/μ l and 1000 fmol/μ l) were injected into the mass spectrometer via a ColeParmer syringe pump.
Validation strategy
Validation of pattern picking is notoriously difficult, because a gold standard which is satisfactory from both statistical and biological points of view is missing. In this context, a major problem one has to account for is that spectra frequently contain patterns whose shape is not distinguishable from those of peptides, but which are in fact various artifacts resulting e.g. from impurities during sample preparation and measurement. These artifacts do not constitute biologically relevant information and are, in this sense, ‘false positives’. An important instance are signals derived from the matrix (or from matrixclusters) frequently observed in MALDI MS. The pattern of these signals is similar to that of peptides; nevertheless, due to their molecular composition, which differs significantly from that of an average peptide, the exact masses can be used to exclude these signals from the data analysis. On the other hand, from a statistical perspective which judges a method according to how well it is able to detect specific patterns in a given dataset, a qualification as ‘true positive’ is justified. With the aim to unify these aspects, we have worked out a dual validation scheme. In order to reduce the number of artifacts, all automatically generated lists of candidates for peptide masses as well as the lists of a human expert (see below) are postprocessed by a peptide mass filter [36]: only peptides whose monoisotopic mass deviated less than 200 ppm from the closest peptide mass center^{a} are used for subsequent evaluation.
Comparison with manual annotation
The first part investigates how well a method is able to support a human expert who annotates the spectra manually. More specifically, the automatically generated lists are matched to the manual annotation such that an entry of the list (potential peptide mass) is declared ‘true positive’ whenever there is a corresponding mass in the manual annotation deviating by no more than Δ ppm. Otherwise, it is declared ‘false positive’. In order to adapt Δppm to the resolution of the different mass lines, we used the following strategy: assuming that most of the peptides will have a mass larger than 700 Da, we determined the spacing Δ_{m/z} between neighboring data points in m/zdirection for each mass spectrum in the lower mass range. If we further assume that a simple manual annotation by visual inspection can result in a mass deviation from the ‘correct’ mass position of at most Δ_{m/z}, we can derive the following tolerance values: Δ=100 ppm for ion trap recordings, Δ=50 ppm in the case of MALDITOF recordings^{b} and Δ=20ppm for Orbitrap data.
As the performance of our as well as those of all competing methods depends on a thresholdlike parameter governing, crudely speaking, the tradeoff between precision and recall, we explore the performance for a range of reasonable parameter values, instead of fixing an (arbitrary) value, which we believe to be little meaningful. The results are then visualized as ROC curve, in which each point in the (Recall, Precision)plane corresponds to a specific choice of the parameter. Formally, we introduce binary variables {B_{ i }(t)} for each mass i contained in the list of cardinality $\hat{L}\left(t\right)$ when setting the threshold equal to t, where B_{ i }(t)equals 1 if the mass is matched and 0 otherwise, and denote by L the number of masses of the manual annotation. The true positive rate (recall, R), and the positive predictive value (precision, P) associated with threshold t are then defined by $R\left(t\right)=\frac{\sum _{i}{B}_{i}\left(t\right)}{L}$, $P\left(t\right)=\frac{\sum _{i}{B}_{i}\left(t\right)}{\hat{L}\left(t\right)}$. An ROC curve results from a sequence of pairs {R(t),P(t)}for varying t.
Database query
The second part evaluates the lists in terms of a query to the Mascot search engine [37], version 2.2.04. In particular, we account for a major problem of a manual annotation, namely that peptides yielding weak MS signals might easily be overlooked, but might be detected by methods designed to extract those weak signals. Since we are especially interested in demonstrating the method’s ability to separate overlapping patterns, we adapted the standard search parameters of Mascot’s peptide mass fingerprint routine to allow two missed cleavage sites and to incorporate the following (variable) posttranslational modifications: ‘Oxidation (M)’, ‘Carbamidomethyl (C)’, ‘Amidated (Protein Cterm)’, ‘Deamidated (NQ)’. In particular, the latter two modifications will frequently trigger MS signals interleaving with the pattern of their unmodified counterpart: in the case of a deamidation the modified ion shows a mass of approx. 0.98 Da more compared to the amidated peptide. The same mass tolerances as for the manual annotation are used. As for the comparison with the manual annotation, we evaluate several lists corresponding to different choices of the threshold. Instead of an ROC curve, which turned out to be visually unpleasant, we display the statistics (score, coverage and fraction of hits) of two lists per method, namely of those achieving the best score and the best coverage, respectively. The complete set of results as well as further details of our evaluation like the manual annotation are contained in Additional file 3.
Competing methods
We compare our method in its two variants depending on the choice of the fitting criterion (cf. Eq. (7)), labelled l_{1}(q=1) and l_{2}(q=2), respectively, with the following competing methods.
Lasso
where W is a diagonal matrix with entries w_{c,j}=LNL_{+} (m_{c,j}), j=1,…,p_{ c }, c=1,…,C, whose purpose is to rescale the amount of ℓ_{1}regularization according to the local noise level. The columns of the template matrix Φ in (8)). The parameter λhere plays the role of the threshold t, cf. Section “Validation strategy”.
Pepex
As discussed in Section “Template fitting”, pepex performs centroiding and deisotoping separately. Deisotoping is based on nonnegative least squares. Since pepex is limited to detect patterns of charge state one, its performance is only assessed for MALDITOF spectra. Accordingly, when comparing the ouptput of pepex with the manual annotation, the few patterns of charge state two are excluded. The parameters nm, pft, mincd, maxcd and nsam were set to optimize performance with respect to manual annotation. The ROC curves are based on peaklists resulting from ten different choices of the signaltonoise parameter snr.
Isotope wavelet
As opposed to our method, this approach is not able to handle overlaps. On the other hand, it typically shows strong performance in noisy and low intensity regions or on datasets with extremely low concentrations [39, 40]. While the isotope wavelet is not limited to charge one, it is run in charge one only mode for the MALDITOF spectra, to achieve more competitive performance. For the sake of fair of comparison, the result of the isotope wavelet on the MALDITOF spectra are evaluated in the same way as those of pepex.
Vendor
The parameter setting for the ABI MALDITOF/TOF MS software was as follows: Local Noise Width (m/z) 250, Min Peak Width at FWHM 2.9. The Cluster Area Optimization S/N threshold has been dynamically adapted to about three times the S/N threshold as suggested by the ABI documentation. Since the vendor software is limited to charge one, its outputs are evaluated in the same way as those of pepex. Given the disproportionally high effort needed to find an optimal parameter setting of the vendor software for ESI spectra, its performance is not assessed.
Results
Manual annotation vs. database query
Mascot results
MALDI Myo 500 fmol  score  cvrg  hits  score  cvrg  hits 

l _{1}  211.0  0.85  0.94  96.8  0.96  0.04 
l _{2}  211.0  0.85  0.94  49.6  0.96  0.04 
lasso  207.0  0.85  1.00  142.0  0.91  0.37 
pepex  223.0  0.85  1.00  142.0  0.90  0.17 
vendor  223.0  0.85  0.94  174.0  0.90  0.29 
wavelet  207.0  0.85  1.00  156.0  0.90  0.14 
MALDI Lys 500 fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  167.0  0.81  0.57  133.0  0.83  0.37 
l _{2}  168.0  0.80  0.64  144.0  0.83  0.34 
lasso  151.0  0.64  0.77  112.0  0.83  0.37 
pepex  172.0  0.80  0.63  135.0  0.83  0.25 
vendor  146.0  0.64  0.75  91.4  0.83  0.20 
wavelet  127.0  0.58  0.75  113.0  0.81  0.20 
MALDI Myo 10 fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  211.0  0.85  0.94  82.2  0.95  0.04 
l _{2}  207.0  0.74  1.00  109.0  0.90  0.14 
lasso  195.0  0.77  0.87  146.0  0.85  0.46 
pepex  97.8  0.80  0.22  97.8  0.80  0.22 
vendor  123.0  0.62  0.62  123.0  0.62  0.62 
wavelet  131.0  0.85  0.13  131.0  0.85  0.13 
MALDI Lys 10 fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  89.0  0.35  1.00  73.7  0.54  0.23 
l _{2}  89.0  0.35  1.00  35.4  0.72  0.09 
lasso  81.9  0.46  0.70  46.0  0.74  0.10 
pepex  47.1  0.17  1.00  31.2  0.53  0.12 
vendor  62.7  0.23  1.00  43.2  0.34  0.16 
wavelet  55.4  0.23  0.45  43.8  0.82  0.10 
Orbi Lys 1000 fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  149.0  0.70  0.78  138.0  0.80  0.53 
l _{2}  139.0  0.80  0.50  139.0  0.80  0.50 
lasso  159.0  0.63  0.87  120.0  0.81  0.29 
wavelet  105.0  0.69  0.44  95.1  0.80  0.23 
IT Lys 1000 fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  78.7  0.63  0.28  70.9  0.74  0.17 
l _{2}  82.1  0.72  0.36  35.4  0.85  0.13 
lasso  103.0  0.84  0.33  76.8  0.99  0.21 
wavelet  107.0  0.79  0.63  69.8  0.99  0.11 
Orbi Lys 250 fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  107.0  0.63  0.50  100.0  0.80  0.31 
l _{2}  103.0  0.63  0.52  66.9  0.81  0.14 
lasso  108.0  0.63  0.77  107.0  0.80  0.27 
wavelet  80.6  0.70  0.22  80.6  0.70  0.22 
IT Lys 250fmol  score  cvrg  hits  score  cvrg  hits 
l _{1}  59.4  0.46  0.16  59.4  0.46  0.16 
l _{2}  37.0  0.59  0.14  37.0  0.59  0.14 
lasso  66.3  0.84  0.20  66.3  0.84  0.20 
wavelet  56.3  0.59  0.36  21.3  0.75  0.12 
Comparison
Figure 4 and Table 1 reveal an excellent performance of our methods (l_{1} and l_{2}) throughout all MALDITOF spectra under consideration. For the myoglobin spectra high sequence coverages are attained that clearly stand above those of competing methods. For the spectra at 10 fmol, only the performance of lasso is competetive with that of our methods in terms of the Mascot score; all other competitors, including the vendor software which has been tailored to process these spectra, are significantly weaker. In particular, the strikingly high proportion of ‘hits’ (≥94%) indicates that even at moderate concentration levels, our methods still distinguish well between signal and noise. This observation is strongly supported by the ROC curves in Figure 4, where the precision drops comparatively slowly with increasing recall. In this regard, our methodology clearly contrasts with approaches like the isotope wavelet that aim at achieving high protein sequence coverage. The latter often requires the selection of extremely lowly abundant peptide signals hidden in noise at the expense of reduced specificity.
For MALDITOF spectra at high concentration levels, pepex achieves the best scores and is competitive with respect to sequence coverage. However, the performance of pepex degrades dramatically at lower concentration levels, as it is unambiguously shown by both parts of the evaluation. In particular, the database scores are the worst among all methods compared. This provides some support for our reasoning at the end of Section “Template fitting”.
For the ESI spectra, our methods in total fall a bit short of the lasso (particularly for the ion trap spectra), but perform convincingly as well, thereby demonstrating that they can deal well with multiple charge states. This is an important finding, since the presence of multiple charges makes the sparse recovery problem as formulated in model (1) much more challenging, because the number of parameters to be estimated as well as the correlations across templates are increased. In spite of these difficulties, Figure 5 and Table 1 suggest that the performance of our pure fitting approach (7) does not appear to be affected. Using a more difficult set of spectra, the capability to process ESI data with impressive success is additionally shown in the next section.
Additional remarks

In Figure 4, the area under the curve (AUC) of our methods attained for myoglobin is higher for lower concentration. At first glance, this may seem contradictory since an increase in concentration should lead to a simplified problem. However, a direct comparison of the AUCs is problematic, since the number of true positives (17 at 10 fmol, 106 at 500 fmol) is rather different. For instance, there are choices of the threshold that yield 18 true positives and not a single false positive for both of our methods at 500 fmol, yet the AUC is lower.

The fact that some of the ROCs start in the lower left corner results from outputs containing only false positives.
Unmixing of overlaps
Motivation
One of the main advantages of our method over more simplistic pattern picking methods is the ability to disentangle isotope patterns of overlapping peptide signals, whose presence may lead to a significantly more challening pattern picking problem as e.g. discussed in [41] in the slightly different context of intact protein mass spectra. Therefore, a potential application for our approach will be the analysis of a certain class of posttranslational modifications, the deamidation of amino acid residues containing a carboxamide side chain functionality. The deamidation of asparagine (Asn) or glutamine (Gln) residues, yielding aspartic acid (Asp) or glutamic acid (Glu) residues, respectively, is an important posttranslational modification, which can have immense effects on the structure of peptides [42] and is of great relevance in a number of pathophysiological events [43]. During the deamidation, the side chain carboxamide is hydrolysed, which is accompanied by a mass increase of 0.98 Da. Thus, in a spectrum of a mixture of the amidated and deamidated form, a direct overlap of both signals can be observed. It has to be noted that additionally to the amidated/deamidated forms, in case of Asn deamidation, a second product containing an isopeptide bond is formed, too, which has the same molecular behaviour; these two forms can be identified solely by their differential MS/MS behavior.
Results
Peptides mixed together
sequence  sequence residueno. in BSA  monoisotopic mass(protonated) / charge 

GACLLPK  198204  351.20437 / +2 
CCTKPESER  460468  351.48816 / +3 
VLASSAR  212218  352.20850 / +2 
Unmixing of overlaps
peptides  351.2(2)/352.2(2)  351.4(3)/352.2(2)  351.2(2)/351.4(3)  

proportion  1:1  1:5  5:1  1:1  1:5  5:1  10:1  1:1  1:5  5:1  1:10  
fmol  
500  x  x  −  −  x  −  x  −  −  −  x  
all  1000  x  x  x  x  x  x  x  x  x  −  − 
500  −  x  −  −  x  −  −  −  −  −  −  
default  1000  x  x  −  x  −  x  x  −  x  −  − 
Conclusion
We have proposed a template matching approach for feature extraction in proteomic mass spectra. The main methodological innovation is a framework for sparse recovery in which sparsity is not promoted explicitly by a regularization term, as it is usually done and was done in previous work. We fully exploit the strength of nonnegativity constraints, which permits us to circumvent the delicate choice of a ‘proper’ amount of regularization, an everlasting problem in statistics, and to work with thresholding instead. The latter is not only computationally attractive, because one does not have to repeatedly solve the same optimization problem for different choices of the regularization parameter, but also increases userfriendliness, since the threshold is directly related to the signaltonoise ratio, the quantity domain experts are interested in. The replacement of a regularization parameter by a threshold is a cornerstone in our conceptual design guided by the principle to relieve the user from laboursome fine tuning of parameters. We believe that a small set of wellinterpretable parameters with suitable defaults additionally improves robustness and reproducibility of results. In this context, we would like to emphasize again that apart from the threshold, the user does not have to specify any parameters before running our software.
In a comprehensive experimental study involving instruments of varying resolution and spectra of varying concentration levels, where we comparatively assess the performance of our approach on the basis of an elaborate dual validation scheme, it is demonstrated that the performance for pattern picking is excellent for MALDITOF spectra and outstands due to its specificity in selecting signal and only little noise. A major strength of the method is its ability to unmix overlapping peptide signals as shown for a series of ESI spectra. In total, we demonstrate that our approach is broadly applicable to a variety of spectra. While our approach is guided by a concrete application in proteomics, the framework is general enough to be of much of use for related deconvolution problems emerging in other fields − only the templates have to be adjusted according to the specific application.
While in this paper, we have focused on single spectra, the approach can be extended to process whole LCMS runs, as it has already been implemented in our R package IPPD. More precisely, the sweep line scheme of [44] is used to agglomerate the results from single spectra. To apply our methods on a routine basis, an improved implementation, notably parallelization, is required, since e.g. processing a single spectrum of the Maxquant datasets [2] takes 10s on average on a Unix system equipped with an Intel(R) Core(TM)2 Duo CPU T9400 (2.53GHz) and 4 GB main memory. There is much room for an improvement, since our implementation is based on interpreted R code.
Concerning future directions of research, a question we have not yet answered in a satisfactory way is the choice of the fitting criterion. While both criteria (least squares and least absolute deviation) employed in this paper perform well, their implicit assumption of additive noise might be questionable [45]. It is worth investigating whether a multiplicative noise model could even yield better results. Second, one might ask whether the performance could be further improved when it is used jointly with the isotope wavelet, which is affected by overlaps, but has the potential to achieve higher protein sequence coverage.
Endnotes
^{a}Monoisotopic peptide mass centers are modelled by: 1.000485·m_{ n } + 0.029, where m_{ n } denotes the nominal mass.^{b}For the MALDITOF lysozyme datasets an extended search tolerance of 100ppm was applied due to experimental miscalibration of the MS.
Appendix
Fitting with nonnegativity constraints
In the following, we provide the details concerning optimization problem (7). In view of the special structure of Φ, (7) is computationally tractable even if n and the number of templates are in the ten thousands. We exploit the sparsity of the problem arising from templates which are highly localized, i.e. the domain on which they are numerically different from zero covers only a small part of the whole m/z range of the spectrum. As a consequence both Φand the Gram matrix Φ^{⊤}Φ, which is crucial in the computation, can conveniently be handled by using software for sparse matrices. For R, such software is available in the Matrix package [46].
Nonnegative least squares
Solution of linear systems of this structure constitutes the main computational effort to be made. Fast solutions are obtained by using CHOLMOD[47], which offers an efficient implementation for computing the Cholesky factorization of sparse symmetric, positive definite matrices. Since the diagonal of ${\nabla}_{\beta}^{2}$ changes from one Newton iteration to the next, one Cholesky factorization has to be performed per Newton step. Once we have solved (14) for a specific γ, we solve a new problem of the type (14) for γ·M, M>1. This is repeated until γexceeds a predefined maximum value. For a thorough account on the logbarrier method, we refer to [20].
Complexity analysis of nonnegative least squares
We here provide the order of magnitude of floating points operations (flops) required per update (i.e. per Newton step) for the specific nonnegative least squares problems considered for this paper. In our implementation, we exploit that the templates contained in the matrix Φare highly localized. As a result, after a suitable column permutation, the matrix Φ^{⊤}Φ is roughly a band matrix with bandwidth k no larger than only few hundreds. The dominant operation is solving the linear system ${\nabla}_{\beta}^{2}{\mathit{d}}_{\beta}={\nabla}_{\beta}$ with the help of the Cholesky factorization, which can be done in O(p k^{2})flops (e.g. [20], p.670). Our algorithm terminates after usually no more than one hundred Newton steps.
Nonnegative least absolute deviation
In order to solve the linear system, we proceed as for nonnegative least squares. The computational cost of this operation is roughly the same, since the sparse structure of Φ^{⊤}Φcan still be exploited. For nonnegative least squares, recomputation of the Hessian ${\nabla}_{\beta}^{2}$ only involves a diagonal update, an operation of negligible computational cost. However, for nonnegative least absolute deviation, computation ${\nabla}_{\beta}^{2}$ involves the matrix multiplication (Φ^{⊤}([Ξ^{+} ]^{−2} + [Ξ^{−}]^{−2})Φ), i.e. essentially a repeated computation of a scaled Gram matrix. In spite of the special structure of Φ^{⊤}Φ, the computational cost is of the same order as the solution of the linear system even when using a selfwritten routine for matrix multiplication tailored to the specific structure.
Declarations
Acknowledgements
The authors would like to thank Markus Martin for setting up the Bruker Daltonics HCT Ultra Ion Trap MS and Bart van den Berg for measuring the LCMS datasets used in the vignette of the R package IPPD. We thank Fredrik Levander and Thorsteinn Rognvaldsson for providing us the pepex implementation. We thank the reviewers and an associate editor whose comments and suggestions led to a substantial improvement over previous drafts.
Funding
Clusters of Excellence ‘Multimodal Computing and Interaction’ (to M.S., R.H. and B.G.), ‘Inflammation@Interfaces’ (to A.T. and T.J.) within the Excellence Initiative of the German Federal Government; DFG (grants BIZ4:14 to R.H. and A.H.).
Authors’ Affiliations
References
 Mo F, Mo Q, Chen Y, Goodlett DR, Hood L, Omenn GS, Li S, Lin B: WaveletQuant, an improved quantification software based on wavelet signal threshold denoising for labeled quantitative proteomic analysis. BMC Bioinformatics 2010, 11: 219. 10.1186/1471210511219PubMed CentralView ArticlePubMedGoogle Scholar
 Cox J, Mann M: MaxQuant enables high peptide identification rates, individualized p.p.b.range mass accuracies and proteomewide protein quantification. Nat Biotechnol 2008, 26: 1367–1372. 10.1038/nbt.1511View ArticlePubMedGoogle Scholar
 Renard B, Kirchner M, Steen H, Steen J, Hamprecht F: NITPICK: peak identification for mass spectrometry data. BMC Bioinformatics 2008, 9: 355. 10.1186/147121059355PubMed CentralView ArticlePubMedGoogle Scholar
 Hoopmann MR, Finney GL, MacCoss MJ: Highspeed data reduction, feature detection, and MS/MS spectrum quality assessment of shotgun proteomics data sets using highresolution mass spectrometry. Anal Chem 2007, 79: 5620–5632. 10.1021/ac0700833PubMed CentralView ArticlePubMedGoogle Scholar
 Gambin A, Dutkowski J, Karczmarski J, Kluge B, Kowalczyk K, Ostrowski J, Poznanski J, Tiuryn J, Bakun M, Dadlez M: Automated reduction and interpretation of multidimensional mass spectra for analysis of complex peptide mixtures. Int J Mass Spectrom 2007, 260: 20–30. 10.1016/j.ijms.2006.06.011View ArticleGoogle Scholar
 Mantini D, Petrucci F, Pieragostino D, Del Boccio P, Di Nicola M, Di Ilio C, Federici G, Sacchetta P, Comani S, Urbani A: LIMPIC: a computational method for the separation of protein MALDITOFMS signals from noise. BMC Bioinformatics 2007, 8: 101. 10.1186/147121058101PubMed CentralView ArticlePubMedGoogle Scholar
 Noy K, Fasulo D: Improved modelbased, platformindependent feature extraction for mass spectrometry. Bioinformatics 2007, 23: 2528–2535. 10.1093/bioinformatics/btm385View ArticlePubMedGoogle Scholar
 Kaur P, O’Connor PB: Algorithms for automatic interpretation of high resolution mass spectra. J Am Soc Mass Spectrom 2006, 17: 459–468. 10.1016/j.jasms.2005.11.024View ArticlePubMedGoogle Scholar
 Tibshirani R: Regression shrinkage and variable selection via the lasso. J R Stat Soc Ser B 1996, 58: 671–686.Google Scholar
 Du P, Angeletti R: Automatic deconvolution of isotope resolved mass spectra using variable Selection and quantized peptide mass distribution. Anal Chem 2006, 78: 3385–3392. 10.1021/ac052212qView ArticlePubMedGoogle Scholar
 Tibshirani R: Regression shrinkage and selection via the lasso: a retrospective (with discussion). J R Stat Soc Ser B 2011, 73: 273–282. 10.1111/j.14679868.2011.00771.xView ArticleGoogle Scholar
 Slawski M, Hein M: Sparse recovery by thresholded nonnegative least squares. In Advances in Neural Information Processing Systems 24. MIT press, Cambridge, Massachusetts; 2011:1926–1934.Google Scholar
 Lange E, Gropl C, Reinert K, Kohlbacher O, Hildebrandt A: Highaccuracy peak picking of proteomics data using wavelet techniques. Pac Symp Biocomput 2006, 11: 243–254.Google Scholar
 SchulzTrieglaff O, Hussong R, Gröpl C, Hildebrandt A, Reinert K: A fast and accurate algorithm for the quantification of peptides from mass spectrometry data. In Proceedings of the Eleventh Annual International Conference on Research in Computational Molecular Biology (RECOMB 2007), Volume 11,. Springer, Berlin; 2007:437–487.Google Scholar
 Zubarev R: Accurate monoisotopic mass measurements of peptides: possibilities and limitations of high resolution timeofflight particle desorption mass spectrometry. Rapid Commun Mass Spectrom 1996, 10(11):1386–1392. 10.1002/(SICI)10970231(199608)10:11<1386::AIDRCM652>3.0.CO;2TView ArticleGoogle Scholar
 Senko M, Beu S, McLafferty F: Determination of monoisotopic masses and ion populations for large biomolecules from resolved isotopic distributions. J Am Soc Mass Spectrom 1995, 6: 229–233. 10.1016/10440305(95)000178View ArticlePubMedGoogle Scholar
 Horn DM, Zubarev RA, McLafferty FW: Automated reduction and interpretation of high resolution electrospray mass spectra of large molecules. J Am Soc Mass Spectrom 2000, 11: 320–332. 10.1016/S10440305(99)001579View ArticlePubMedGoogle Scholar
 Lou X, Renard B, Kirchner M, Koethe U, Graf C, Lee C, Steen J, Steen H, Mayer M, Hamprecht F: Deuteration distribution estimation with improved sequence coverage for HDX/MS experiments. Bioinformatics 2010, 26: 1535–1541. 10.1093/bioinformatics/btq165View ArticlePubMedGoogle Scholar
 Suits F, Hoekman B, Rosenling T, Bischoff R, Horvatovich P: Thresholdavoiding proteomics pipeline. Anal Chem 2011, 83: 7786–7794. 10.1021/ac201332jView ArticlePubMedGoogle Scholar
 Boyd S, Vandenberghe L: Convex Optimization. Cambridge University Press, New York; 2004.View ArticleGoogle Scholar
 Samuelsson J, Dalevi D, Levander F, Rognvaldsson T: Modular, scriptable and automated analysis tools for highthroughput peptide mass fingerprinting. Bioinformatics 2004, 20: 3628–3635. 10.1093/bioinformatics/bth460View ArticlePubMedGoogle Scholar
 Efron B, Hastie T, Johnstone I, Tibshirani R: Least Angle Regression (with discussion). Ann Stat 2004, 32: 407–499. 10.1214/009053604000000067View ArticleGoogle Scholar
 van de Geer S, Bühlmann P: On the conditions used to prove oracle results for the Lasso. Electron J Stat 2009, 3: 1360–1392. 10.1214/09EJS506View ArticleGoogle Scholar
 Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning, 2nd Edition. Springer, New York; 2008.Google Scholar
 Bruckstein A, Elad M, Zibulevsky M: On the uniqueness of nonnegative sparse solutions to underdetermined systems of equations. IEEE Trans Inf Theory 2008, 54: 4813–4820.View ArticleGoogle Scholar
 Wang M, Tang A: Conditions for a unique nonnegative solution to an underdetermined system. In Proceedings of Allerton Conference on Communication, Control, and Computing, Volume 49,. IEEE Press, Piscataway, New Jersey; 2009:301–307.Google Scholar
 Donoho D, Tanner J: Counting the faces of randomlyprojected hypercubes and orthants, with applications. Discrete Comput Geometry 2010, 43: 522–541. 10.1007/s004540099221zView ArticleGoogle Scholar
 Meinshausen N: Signconstrained least squares estimation for highdimensional regression. Tech. rep.. Department of Statistics, Oxford University; 2012.Google Scholar
 Meinshausen N, Yu B: Lassotype recovery of sparse representations for highdimensional data. Ann Stat 2009, 37: 246–270. 10.1214/07AOS582View ArticleGoogle Scholar
 Zhou S: Thresholding Procedures for high dimensional variable selection and statistical estimation. In Advances in Neural Information Processing Systems 22. MIT press, Cambridge, Massachusetts; 2009:2304–2312.Google Scholar
 Zhang T: Some sharp performance bounds for least squares regression with L1 regularization. Ann Stat 2009, 37: 2109–2144. 10.1214/08AOS659View ArticleGoogle Scholar
 Donoho D, Johnstone I: Ideal spatial adaption by Wavelet shrinkage. Biometrika 1994, 81: 425–455. 10.1093/biomet/81.3.425View ArticleGoogle Scholar
 Fan J, Li R: Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Association 2001, 97: 210–221.Google Scholar
 Wasserman L, Roeder K: Highdimensional variable selection. Ann Stat 2009, 37: 2178–2201. 10.1214/08AOS646PubMed CentralView ArticlePubMedGoogle Scholar
 Fan J, Guo S, Hao N: Variance estimation using refitted crossvalidation in ultrahigh dimensional regression. J R Stat Soc Ser B 2012, 74: 37–65. 10.1111/j.14679868.2011.01005.xView ArticleGoogle Scholar
 Wolski WE, Farrow M, Emde AK, Lehrach H, Lalowski M, Reinert K: Analytical model of peptide mass cluster centres with applications. Proteome Sci 2006, 4: 18. 10.1186/14775956418PubMed CentralView ArticlePubMedGoogle Scholar
 Perkins DN, Pappin DJ, Creasy DM, Cottrell JS: Probabilitybased protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20: 3551–3567. 10.1002/(SICI)15222683(19991201)20:18<3551::AIDELPS3551>3.0.CO;22View ArticlePubMedGoogle Scholar
 Friedman J, Hastie T, Tibshirani R: Regularized paths for generalized linear models via coordinate descent. J Stat Software 2010, 33: 1–22.View ArticleGoogle Scholar
 Hussong R, Tholey A, Hildebrandt A: Efficient Analysis of Mass Spectrometry Data Using the Isotope Wavelet. In COMPLIFE 2007: The Third International Symposium on Computational Life Science, Volume 940(1),. Edited by: Siebes APJM, Berthold MR, Glen RC, Feelders AJ. AIP, Melville; 2007:139–149.Google Scholar
 Hussong R, Gregorius B, Tholey A, Hildebrandt A: Highly accelerated feature detection in proteomics data sets using modern graphics processing units. Bioinformatics 2009, 25: 1937–1943. 10.1093/bioinformatics/btp294View ArticlePubMedGoogle Scholar
 Liu X, Inbar Y, Dorrestein P, Wyne C, Edwards N, Souda P, Whitelegge J, Bafna V, Pevzner P: Decovolution and database search of complex tandem mass spectra of intact proteins. Mol Cell Proteomics 2010, 9: 2772–2782. 10.1074/mcp.M110.002766PubMed CentralView ArticlePubMedGoogle Scholar
 Tholey A, Pipkorn R, Bossemeyer D, Kinzel V, Reed J: Influence of myristoylation, phosphorylation, and deamidation on the structural behavior of the NTerminus of the Catalytic subunit of CAMPDependent protein kinase. Biochemistry 2001, 40: 225–231. 10.1021/bi0021277View ArticlePubMedGoogle Scholar
 Reissner K, Aswad D: Deamidation and isoaspartate formation in proteins: unwanted alterations or surreptitious signals? Cell Mol Life Sci 2003, 60: 1281–1295. 10.1007/s0001800322875View ArticlePubMedGoogle Scholar
 SchulzTrieglaff O, Hussong R, Gröpl C, Leinenbach A, Hildebrandt A, Huber C, Reinert K: Computational quantification of peptides from LCMS data. J Comput Biol 2008, 15: 685–704. 10.1089/cmb.2007.0117View ArticlePubMedGoogle Scholar
 Du P, Stolovitzky G, Horvatovich P, Bischoff R, Lim J, Suits F: A noise model for mass spectrometry based proteomics. Bioinformatics 2008, 24: 1070–1077. 10.1093/bioinformatics/btn078View ArticlePubMedGoogle Scholar
 Bates D, Maechler M: Matrix: Sparse and Dense Matrix Classes and Methods. 2009. [R package version 0.999375–21] [R package version 0.99937521]Google Scholar
 Davis T: CHOLMOD: sparse supernodal Cholesky factorization and update/downdate. 2005.Google Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.