Calibration of mass spectrometric peptide mass fingerprint data without specific external or internal calibrants

Background Peptide Mass Fingerprinting (PMF) is a widely used mass spectrometry (MS) method of analysis of proteins and peptides. It relies on the comparison between experimentally determined and theoretical mass spectra. The PMF process requires calibration, usually performed with external or internal calibrants of known molecular masses. Results We have introduced two novel MS calibration methods. The first method utilises the local similarity of peptide maps generated after separation of complex protein samples by two-dimensional gel electrophoresis. It computes a multiple peak-list alignment of the data set using a modified Minimum Spanning Tree (MST) algorithm. The second method exploits the idea that hundreds of MS samples are measured in parallel on one sample support. It improves the calibration coefficients by applying a two-dimensional Thin Plate Splines (TPS) smoothing algorithm. We studied the novel calibration methods utilising data generated by three different MALDI-TOF-MS instruments. We demonstrate that a PMF data set can be calibrated without resorting to external or relying on widely occurring internal calibrants. The methods developed here were implemented in R and are part of the BioConductor package mscalib available from . Conclusion The MST calibration algorithm is well suited to calibrate MS spectra of protein samples resulting from two-dimensional gel electrophoretic separation. The TPS based calibration algorithm might be used to correct systematic mass measurement errors observed for large MS sample supports. As compared to other methods, our combined MS spectra calibration strategy increases the peptide/protein identification rate by an additional 5 – 15%.


Background
Proteomics inter-alia focuses on the identification of peptides/proteins in complex biological samples [1]. Before the identification of the complex constituents, several sep-aration steps are required to reduce the sample complexity. The classical separation method is the twodimensional gel electrophoresis [2][3][4][5], followed by excision of the detected spots from the gel, digestion with sequence specific proteases and extraction of the cleaved proteins [6,7]. Mass Spectrometric (MS) analysis [8][9][10][11][12][13] of the resulting mixture of peptides yields a peptide mass fingerprint (PMF): a set of measured molecular masses of the proteolytic peptides derived from the analysed protein [14][15][16].
PMF commonly requires matrix assisted laser desorption/ ionisation (MALDI) time of flight (TOF) instruments, capable of high throughput analysis of complex samples with minimal pre-cleanup, high femtomolar range sensitivity and accuracy of peptide molecular mass determination up to 5 -10 parts per million (ppm) [17][18][19][20]. Due to the high ion transmission of the TOF mass analyzer, this technique is more sensitive compared with other MS techniques. In relation to Electrospray ionisation (ESI) MS [21], MALDI-MS is more tolerant to sample contamination resulting from salts and detergents often present in protein samples due to the separation method. MALDI-MS and ESI-MS have become the standard high throughput proteome analysis techniques in many research laboratories.
The experimental peptide mass lists are generated by the analysis of TOF spectra [22]. Ideally, the TOF is proportional to the square root of mass over charge . Thus, in order to transform the spectrum from TOF into m/z, two calibration constants A and B are necessary. These can be derived by measuring the flight times t of at least two different ions with known masses and fitting them such that . After the transformation from time into m/z, the mono-isotopic peptide signals in the spectrum are identified and their intensity is determined by computational methods [23][24][25][26]. The lists of the first mono-isotopic peptide peaks -further called peak-lists -are used to identify the protein of interest. In order to assign the PMF to a protein in a sequence database, database search algorithms use the match (within a given measurement accuracy) of theoretical peptide masses computed from protein sequence databases [27] with observed MS masses [15,16].
Usually the scoring schemes model the mass frequencies of the proteins and peptides in the sequence databases [24,[28][29][30]. Other properties to be considered include the different sensitivity of detection for individual peptides, known protein modifications, and/or possible mutations [23,[31][32][33], although generally, all popular search scores depend on the precise assignment of experimental to theoretical peptide masses.

Two novel calibration methods
In a high throughput setting [34,35], where the samples are placed on a moving sample support, the calibration coefficients for transforming the TOF into m/z differ depending on sample position. This is due to deviations in plate flatness, sample topography changing the size of the acceleration region [34,36], and alterations in the strength of the electric field on the sample support borders which influences the drift velocity of the ions [22]. Thus, when calibration constants determined from one position on the sample support are used to calibrate TOF spectra acquired on other positions (a procedure known as external calibration), the determined m/z values have errors of up to 500 ppm.
Calibration is usually performed using external [36][37][38] or internal calibrants [39,40], which rely on known masses to calibrate the spectra to common co-ordinates. It must be stressed, that in some cases the signal of a reference compounds might be suppressed by the analyte molecules, thus precluding internal calibration. In other cases, the reference signal may partially overlap with an analyte signal, resulting in an erroneous assignment. A third category of calibration methods is based on the peptide mass rule [23,24]. A major advantage of the latter method is that no internal calibrants are required to calibrate the peak-lists. The limitation of this method is it's sensitivity to the presence of non-peptide peaks in the spectra, and that it completely fails if the number of peptide peaks in peak-lists are small [23,24,39]. Therefore, in practice this method usually is used only to pre-calibrate [24] or to support the results of internal calibration [26,39].
We have developed two novel calibration methods for PMF data. Both calibration methods exploit similarities of peak-lists due to closeness in the origin of the analysed samples. The first method combines the computation of dissimilarities [41] between peak-lists with internal calibration. The second method employs spatial statistical methods [42] to model systematic changes of the calibration-model over the MALDI sample support. The major advantage of the presented methods originates from the fact that the MS calibration derives from samples without internal standards or external calibrants positioned on each sample support.

Evaluating the methods
To demonstrate the accuracy of our methods, we studied one sample set of 380 mass spectra, consisting of a part of the Arabidopsis thaliana proteome study [43]. For this purpose, a MALDI MS sample support in pre-structured [35] (384-well) microtitre plate format was used. The measurements were performed using the Autoflex MALDI-TOF MS [44] instrument.
To compare the performance of calibration methods described here with those already published [26,39], we used two different data sets. The first set consisted of 1193 spectra deposited on four pre-structured sample supports and measured on a Reflex MALDI-TOF MS [44] instrument (Reflex data set). Spectra were generated via mass spectrometric analysis of the Rhodopirellula baltica proteome (unpublished data). The second set was generated in connection with a proteome study of Mus musculus and consisted of 1882 spectra deposited on five pre-structured sample supports and measured on an Ultraflex MALDI-TOF MS [44] instrument (Ultraflex data set).
During MS sample preparation of the Ultraflex data set, standard peptides of known masses (human Angiotensin I -1, 296.6853Da, human ACTH (18-39) 2, 465.1989Da) were added before the measurement to the MS matrix. This was done because the data sets were optimised for the calibration methods, which required the internal calibrants. We examined if the standard peaks could be observed in more than 33% of spectra and if so, we removed the peaks matching these masses from the data set. This procedure was applied in order to simulate a data set not optimised for internal calibration.

Internal calibration using a pre-calibrated list of calibration masses
Internal calibration is a widely used method in mass spectrometry. This method fails however, either if no peaks matching known masses are present or if MS peak assignment is false. A detailed description of the application of internal calibration in a high throughput-MS setting, addressing the two points is given by e.g. Chamrad et al. [39], Levander et al. [40] and Samuelson et al. [26]. In order to avoid the lack of MS peaks matching the known calibration masses the authors used a pre-compiled list, e.g. trypsin autolysis peaks and unidentified, frequently observed masses [47].
Chamrad et al. [39] initiated the calibration procedure with searches for matching masses using a relatively large search window and iterated it with an increased accuracy. In this scheme, a large search window allows false assignments for calibration masses to occur more frequently. If a false assignment occurs in the first iteration, then the determined calibration constants are false and the entire calibration would be wrong. In the next round of calibra-tion, where a search for matching masses is performed with a higher mass accuracy, the calibration would also fail. To prevent this, the authors [26,39] checked the obtained calibration coefficients against the peptide mass rule (PM-rule) [24,48] and stopped further calibration attempts where they disagreed substantially.
Levander et al. [40] introduced an adaptive method to eliminate low-sensitivity auto-proteolysis trypsin peaks from the calibration mass list if no high-sensitivity trypsin peaks e.g. (842.5099Da, 1045.5642Da, 2211.1046Da) were found to decrease the chance of false matches. Unfortunately, this method could only be applied for "tryptic" calibration peaks.
Figures 1A &1B demonstrate the limitations of a calibration list compiled from ubiquitous masses of the whole data set. One can recognise that out of three abundant masses (in red, Figure 1A), only two can be practically used for calibration. Specifically, the first and the third abundant mass in the list of ubiquitous masses ( Figure  1A) match simultaneously two peaks in peak-list 3, 4 and 5 ( Figure 1B). Thus, out of five peak-lists only three could be calibrated. The second calibration mass is also of no use, since it is the only calibration mass in the peak-lists 1 and 2 (although these peak-lists do contain other shared masses). This illustrates that the usage of a global calibration list may fail to calibrate a set of peak-lists.
It is therefore feasible to address the following questions: How can one obtain a short calibration list to avoid spurious matches while at the same time it matching a sufficient number of peaks in every peak-list of the set? In addition, how can one minimise the initial search window to avoid false matches?

Finding the optimal multiple peak-list alignment using a modified Minimum Spanning Tree (MST) algorithm
In order to bypass the limitations imposed by global calibration we used an observation made by Schmidt et al. [49]. They noticed that protein samples excised from high-resolution 2D-gels are usually not ideally separated and therefore exhibit local similarities. Compiling a calibration list of abundant masses from a whole data set obtained from a 2D-gel does not differentiate local spectra similarities. For example peak-lists 1, 2 and 3 ( Figure 1B) share peaks, which were not recognised as ubiquitous masses and hence not used further for calibration using a global calibration list. The peak-list pairs (2,3) and (1,3) shared more than one peak, thus allowing an easy calibration.
We explored the property of local pairwise peak-list similarities for calibration of data sets. To achieve it, we used a modified minimum spanning tree MST [50] algorithm on the complete weighted graph G(V, E, d), where the vertex set V corresponds to the individual peak-lists and the edges E are weighted by a dissimilarity measure d. We denned the measure between two peak-lists p 1 and p 2 as d(p 1 , p 2 ) = -s(p 1 , p 2 ), where s represented a similarity measure denned in Equation 10. This measure not only counts the number of matching peaks, but also weights the mass range enclosed by them. Hence, it also considers that if the matching masses lie very close to each other, the calibration model describes a small mass range only, and can result in a large error when aligning masses that are out of this range. Using the dissimilarities one can compute a MST ( Figure 1D). The algorithm to compute the MST of the peak-list data set starts by choosing a peak-list (named s), which belongs to the peak-list pair of smallest dissimilarity, for example peak-list 2 or 3 in Figure 1. This peaklist is the root of the growing tree T (Figure 8 line 1). Next, a peak-list v was chosen, which easily could be aligned to peak-list u where v is a part of the growing tree i.e. u T ( Figure 8 line 5), for example peak-list v = 2 can easily be aligned to peak-list u = 3. Using linear regression, we computed the coefficients c(v, u) = (c 0 , c 1 ) of the affine function, modelling the absolute mass differences of the peaks matching in the peak-list pair (v, u). Having these coefficients one can compute the calibration coefficients c(v, s) using the update rule in Equation 11, which described the mass measurement error (MME) between the peak-list v and the starting peak-list s. The calibration is not terminated until the whole tree is built. We then added peak-list v to the tree T and have iterated the procedure until all peaklists were appended to the tree, for example by adding peak-list 4, then 5 and finally 1 to T ( Figure 1D).
In the MST algorithm, the vertices are joined by edges of smallest dissimilarity. Consequently, the MST algorithm connects all peak-lists in the data set in the way that the length of the path from the peak-list of origin (root of the tree: peak-list 3 in Figure 1D) to any peak-list in the data set is minimal. The algorithm for computing the agglomerative clustering using the single linkage method [51,52] works similarly like the MST algorithm and therefore the dendrogram ( Figure 1C) provides (as read from bottom to top) the order, by which the peak-list pairs were chosen. The horizontal lines joining two dendrogram tree branches were drawn at the height of the value of the minimal dissimilarity of two peak-lists in either branch.
Finally, the algorithm returns a list of coefficients and a measure of confidence for all peak-lists equalling the smallest similarity in the path from s to v. Figure 2A demonstrates how the samples on the target are connected by the edges. Green dots (brighter) represent leaves, while blue dots (darker) denote interior vertices. The peak-list of origin s is marked with a red cross-hairs (sample position D15). Note that long peak-lists (brighter squares) are interior vertices of the MST.
The strip-charts of mass ranges including peaks of the trypsin autolysis products 842.508 and 2, 211.100 are presented in Figure 2C 1 and C 2 . One can observe that the MST-method works robustly on raw data with a mass measurement error of up to ± 0.7Da (black crosses), even if the search for matching peaks when computing the similarities and calibration coefficients was performed within a much smaller window of ± 0.45Da. Notably, if the maximal error among two peak-lists is much larger than the search window, it is still possible to find a path, thus allowing alignment of two extreme peak-lists.
Due to the fact that all peak-lists were aligned to the peaklist of origin s, which did not necessarily match to the theoretical trypsin autolysis masses, a final correction was required to calibrate the whole tree to the theoretical coordinate system before database searches (not shown).

Determining the calibration model of the sample support using Thin-Plate Spline interpolation (TPS)
Because a large part of the MME is of systematic origin and depends on the sample support position, the mapping of the calibration coefficients across the entire MALDI plate was introduced by Gobom et al. [36] and Moskovets et al. [38]. The calibration coefficients were determined using a standard mixture of peptides with known masses. Subsequently, the calibration coefficients were used during MS analysis in order to correct for the masses measured afterwards on the same plate.
We introduced here a method that derives the calibration model from calibration coefficients acquired from samples, which do not necessarily contain internal standards. Instead of refining the MST calibration model, we chose the peptide mass rule based approach, namely Linear Regression on Peptide Rule (cf. Methods), to obtain the calibration coefficients. The methods based on the peptide mass rule do not rely on the specification of an initial search window or on internal calibrant masses. The peptide rule based calibration method calibrates the peak-lists into the theoretical co-ordinate system and increases the mass accuracy to approximately 0.1Da, but fails if the peak-list is too short, which indeed could be observed for several samples ( Figure 3A and 3C). Figure 3A provides the color scheme coded slope coefficient c 1 as determined by the peptide rule based calibration method in dependence of the target location. One can observe that some erroneous predictions occur ( Figure 3C; black crosses marked by magenta triangles).
However, it is unbiased to assume a smooth transition between adjacent positions of the sample support. For example, Figure 2B demonstrates that the slope coefficient of the sample calibration-model obtained by the MST calibration methods increases for samples close to the support border. This change is due to alterations in the electric field E (Equation 1) influencing the flight velocity given by where s a is the size of the acceleration region, z is the ion charge and m is the mass of the ion. We determined the A: Colour scheme coded peak-list lengths in dependence of the sample support position systematic change of the slope using the Thin-Plate Spline (TPS) interpolation method [42,53]. At first, we computed the TPS with a degree of smoothing λ = 5·10 -2 (see Equation 15). Calibration models with slope coefficient c 1 that varies more than ± 1·10 -4 or with intercept coefficient c 0 varying more than 0.2Da from the one predicted by the TPS were discarded. Using the remaining calibration models, the TPS was recomputed with smaller degree of smoothing λ = 1·10 -3 . Figure 3B, demonstrates the Colour scheme coded slope coefficient c 1 , as estimated by the refined TPS. This model resembles the one generated by the MST method ( Figure 2B). We corrected the peak-lists masses (black cross hairs, Figure 3C), using the TPS values as estimates of the slope coefficients, and as intercept estimate we used the average intercept of all coefficients of the refined calibration models to obtain the calibrated masses (red circles).
The TPS method reduced the MME of a peak-list compared to any other peak-list in the data set (vertical red, dashed line in Figure 3C) down to 0.3Da, as compared to 1.5Da for raw data. This is approximately a 5-fold increase of a mass measurement accuracy. This decrease of the MME enabled us to utilise the MST-algorithm with an accuracy of ± 0.15Da, reducing further the probability of false assignments of calibration masses. In addition, the histogram of dissimilarities computed for all peak-list pairs ( Figure 4A) shows for TPS calibrated data lower values of dissimilarity (in red) as compared to the raw data (in grey), even if the first dissimilarities were computed with a search window of 0.15Da and the second ones with a search window of 0.45Da. A subsequent calibration using the MST method decreased further the MME ( Figure  4B).

The mass measurement error
Prior to the calibration, the main error source is due to different drift velocities of the ions causing an increase of the absolute MME, proportional to mass and best described by the slope coefficient c 1 ≠ 0 and measured as relative error using parts per million ppm (Table 1, row 1 and 2). After removal of this error using calibration methods, for example the TPS calibration (Table 1, row 3,4) or TPS with subsequent MST calibration (Table 1 row 5,6), the main contribution to the MME was due to peak detection performance. We were aware, however, of systematic changes A: Colour scheme coded slope coefficients c1 of the MME determined by the peptide rule based calibration method of the MME, which can be described using higher order polynomials [37,54]. We have removed higher order terms of the MME, by applying external calibration before to other calibration procedures (cf. Methods : External Calibration). The change of peak-detection quality was negligible in the range of 500 -4000Da. Figure 5, as well as Table 1, illustrates that after calibration the absolute MME was smaller for the peak with higher mass (2211.1) than that of the peak with a lower mass (842.508) if the peak intensity and consequently the Signal to noise ratio remained sufficiently high. Therefore, we performed the database searches by specifying the search window in Da instead of ppm.
The optimal size of the search window Figure 5 and Table 1 demonstrate that it is possible to reduce the mass measurement error to approximately ± 10 ppm for most of the peak-lists in a dataset consisting of 380 spectra, by applying the TPS-MST calibration sequence. Nevertheless, in this dataset one can observe peak-lists that do not exhibit such high mass measurement accuracy. Consequently, if the database searches were performed with a search window of 10 ppm, these PLs would not be identified.
The optimal size of the search window was determined by searching of four internally calibrated data sets with five different search window sizes, namely 0.5, 0.2, 0.1, 0.05 and 0.02Da using the Mascot [55] search algorithm. The search window of 0.2Da generated the highest identification rate. Figure 6 shows the relative identification rate (identification rate / max(identification rate)·100%).
Allowing the search window to be larger e.g. 0.5Da, decreases the identification rate by increasing the rate of false negatives, while a smaller window e.g. ± 0.05Da decreases it by rejecting true matches [55]. Because the identification rate for a search window of 0.1Da is only slightly worse than one of 0.2Da, and since it minimizes A: Histogram of pairwise peak-list similarities the risk of false positive matches, we further compared the practical performance of the calibration methods with a search window of 0.1Da.
Prior to the database searches we removed all masses that occur in more than 8% of spectra, as it significantly increased the identification rate [39,40] (cf. Methods -Filtering of ubiquitous masses prior to database search). The sequence data base search was performed using the Mascot [55] search software version 1.8.1. We interfaced the search server from within R using the in-house developed R package msmascot [56].

Combining different calibration methods and their comparison
All parameters were fitted to a data set optimised for internal calibration, measured on an Autoflex MALDI-TOF MS [44] instrument. We applied the calibration methods introduced (MST and TPS based calibration) without changing the parameters to two sample sets obtained using two different instruments, namely a Reflex MALDI-TOF MS and a Ultraflex MALDI-TOF MS instrument. This was executed to illustrate that our methods are robust with respect to different instruments even if the parameters were not optimised for the respective machines.
Stick spectrum of the merged data set of 380 peak-lists Figure 5 Stick spectrum of the merged data set of 380 peak-lists. The black vertical lines represent peaks calibrated using the TPS and MST method. Their height equals their intensity. Green line -average mass of all peaks in the region 842 -843Da (A) and 2210.5 -2211.6Da (B). The orange vertical lines represent the average mass ±, the standard deviation of the peak masses in each region. Magenta line -density of peak-masses.    We combined the different pre-calibration and calibration methods resulting in six different calibration sequences (summarised in Table 2). We compared the performance of the MST and TPS calibration sequence to the internal calibration (IC), and the peptide rule based calibration methods (LR/PR). Furthermore, we investigated if the identification rate of the TPS based method could be improved further by subsequent internal (TPS-IC) or MST calibration (TPS-MST). The R [57] scripts implementing each sequence can be found in the samples directory of the mscalib BioConductor [58] package.
The only calibration method for which parameters were optimised with respect to the instrument was the standard internal calibration (IC) method, which employs a precompiled calibration list of theoretical trypsin autolysis peaks and a calibrated set of ubiquitous masses (cf. Methods -Standard internal calibration). In case of the peptide rule based calibration (LR/PR) method we applied an additional filtering of the calibration-models. Only models with an intercept coefficient c 0 satisfying -0.4Da <c 0 < 0.4Da and slope coefficients c 1 with -5·10 -3 <c 1 < 5·10 -3 were kept. In order to avoid falsely calibrated peak-lists we performed the filtering.
The identification rates were defined as the number of identified samples by at least one of the calibration sequences divided by the number of samples submitted for searches where CS i indicates the set of identified samples by one of the calibration sequences (Table 2), and #{A} denotes the number of elements in a set A. The identification rates were 74%, 87%, 79%, 85% for the Pirellula (Reflex) data set, with an overall identification rate of 82%, whereas for the Mus musculus (Ultraflex) data set they were 51%, 72%, 35%, 51%, 27%, with an overall identification rate of 58%. The lower identification rate of the Mus musculus data set can possibly be explained by the fact that it was matched with a larger database. Therefore, more matching peaks are required to make significant assignments to a data base entry.
In order to directly compare the identification rates for both data sets and each calibration sequence, we computed the relative identification rate. It was defined as the ratio of the number of identified samples calibrated by a sequence (numerator) and of the number of identified samples, which could be identified by at least one method (denominator): The relative identification rate is indicated by the dots, joined by continuous lines for readability purposes only, in Figure 7. The dashed lines denote the average of the sequence coverage of all identified samples. Figure 7A The optimal search window Figure 6 The optimal search window. Comparison of the relative identification rates of internally calibrated data (Y-axis) given a search window size of 0.5Da, 0.2Da, 0.1Da, 0.05Da and 0.02Da, respectively (X-axis). Red -Two Reflex (Pirellula) dataset, Black -Two Ultraflex (Mus Musculus) datasets.    Figure 7B shows the results of five Mus musculus data sets.
Only in one case of one data set was a single calibration sequence TPS-MST (see Table 2) able to identify all peaklists (100% identification rate) and therefore it completely dominated over the other methods (black line, Figure 7A). In the case of the Ultraflex data set ( Figure 7B) we observed that the TPS-MST method had the highest identification rate, while in Reflex data set ( Figure 7A) it achieved the highest performance for approximately half of the data sets. Figure 7C illustrates the averaged relative identification rate of the calibration methods for the Ultraflex and Autoflex data sets. In addition, it demonstrates that the ordering of the calibration methods according to the relative identification rate does not depend on the value of the Probability Based Mowse Score [55] (PBMS) used as identification threshold. The dashed lines ( Figure 5) indicate the identification rates obtained for a PBMS 5 units higher than the one used to identify the samples with a 0.5% significance level (continuous lines).
Interestingly, the TPS smoothing method resulted in an overall higher identification rate than the other methods tested on raw data (peptide rule based calibration, internal calibration, MST-calibration), except for one case of the Ultraflex data set. Furthermore, a combination of the internal calibration with TPS calibration (TPS-IC) did not increase either the sequence coverage (dashed lines) or the identification rate of the TPS method applied alone.
In two out of the four Reflex data sets, the MST method applied on TPS-processed data (P-TPS Figure 7A, dashed lines) slightly decreased the sequence coverage indicating a reduction of calibration accuracy. For the Ultraflex data sets, the sequence coverage correlated well with the identification rate and the TPS-MST-method accomplished the highest performance.
Moreover, if similar identification rates of the peptide rule based calibration and the internal calibration were observed, the peptide rule based calibration method provided higher sequence coverage ( Figure 7B). This could be explained by the fact that the peptide rule based method calibrated well the peak-lists possessing many peptide peaks. Such peak-lists potentially contain the higher sequence coverage.

The BioConductor package mscalib
All of the calibration methods are part of the mscalib programme, which is available as a BioConductor [59] package. The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics [58]. The scripts carrying out the calibration sequences tested, can be found in the subdirectory/samples of the package. Furthermore, in the same directory and in the directory/doc there are two vignettes [60] with detailed descriptions of two selected calibration sequences.

Conclusion
While the methods described in this study significantly improve the calibration of raw data, they do not perform better than other published calibration routines which reduce the MME to 10 ppm or below. The real advantage of the methods described here is that they are not dependent on the presence of internal or external calibrants, required to correct for the affine component of the MME. Furthermore, the calibration methods described in this study allow a larger fraction of peak-lists in the datasets to be calibrated than the reference internal calibration method would do.
The TPS method deals with systematic detrimental calibration effects that are due to imperfections in the geometry of the electric field over the MALDI sample plates. Usage of TPS calibration results in up to 10% higher identification rates, at least for the Bruker mass spectrometers, than the internal calibration. The TPS calibration procedure enables, for most of the samples deposited on the sample support, to obtain mass accuracy in the range of ± 0.1Da. Moreover, the TPS method does not require the presence of internal calibrants since it relies on calibration coefficients acquired from a calibration method based on the peptide mass rule.
The MST method is able to increase the identification rates obtained by the TPS-method for protein samples separated by a 2D-Gel electrophoretic procedure. Furthermore, the parameters optimised for one instrument (Autoflex) can be directly utilised for other instruments (Reflex, Ultraflex).
In this work, we have only examined a version of the MST algorithm that builds a single tree for all peak-lists. This is adequate if the data are a set of peak-lists with smooth transitions in the similarity values. If this is not the case, it might be more appropriate to compute a forest of several MSTs. We have examined, however, only a single peak-list Modified Dijkstra-Prim MST algorithm Figure 8 Modified Dijkstra-Prim MST algorithm. The algorithm starts with vertex s (peak-list) belonging to the peak-list pair with smallest distance (line 1) (the standard algorithm starts with an arbitrary pair). In addition to computing the MST T, the algorithm computes the calibration constants C(v, s) (line 8) and the connection weight W(u) (line 9).
Input: A graph G with m edges; each edge e has a given length l(e).

Initialise:
1 Pick a vertex s, which is incident to the edge with smallest distance D(e).
2 Set U := s and let T be a tree with one vertex, namely s. 6 Add uv to T and remove it from G by setting D(uv) = ∞.
7 Add v to U .

Output:
10 T -which is a maximum spanning tree.
11 C -which is the calibration list to align all peak-lists (vertices) to the starting peak-list (vertex) s.
12 W -which are the weights of the path from s → v ∈ F .
similarity measure (Equation 10) for peak-lists calibration. It is possible that better similarity measures can still be generated and subsequently applied for peak-lists calibration.
Complete utilisation of microtitre plates and sample supports is not only rational with respect to increased accuracy of the TPS method, but also with respect to the idea of high throughput experiments -maximal utilisation of energy and resources. Dense excision of spots from 2Dgels not only increases the performance of the MST method, but also identifies novel proteins. Hence, the main contribution of this manuscript is to present two calibration methods, compatible with the principle of high throughput sample processing and aims to identify a maximum of the proteins resolved on 2D-gels.
However, no single "best-calibration" method exists. Each of the methods utilises different properties of the peaklists. Consequently, applying these methods in parallel and determining the total (union) of the identified samples provides the highest identification rate.

Data sets
In this study, we used three data sets generated in different proteome analyses: All PMF MS spectra derive from tryptic protein digests of individually excised protein spots. For this purpose, the whole tissue/cell protein extracts of the former mentioned organisms were separated by two-dimensional (2D) gel electrophoresis [4] and visualised with MS compatible Coomassie brilliant blue G250 [43]. The MALDI-TOF MS analysis was performed using delayed ion extraction and by employing the MALDI AnchorChip ™targets (Bruker Daltonics, Bremen, Germany). Positively charged ions in the range of 700 -4, 500 m/z were recorded. Subsequently, the SNAP algorithm of the XTOF spectrum analysis software (Bruker Daltonics, Bremen, Germany) detected the monoisotopic masses of the measured peptides. The sum of the detected monoisotopic masses constitutes the raw peak-list. Before affine mass calibration, mass measurement errors which can be described by higher order polynomials and determined using external calibration (cf. Methods: External Calibration), were removed. Processed peak-lists were then used for the protein database searches with the Mascot search software (Version 1.8.1) [55], employing a mass accuracy of ± 0.1Da. Methionine oxidation was set as a variable and carbamidomethylation of cysteine residues as fixed modification. We allowed only one missed proteolytic cleavage site in the analysis. depending on whether the relative or absolute error was used, to obtain corrected masses m corr .

Affine MME model
In the first approximation, the MME can be described by an affine function , where m i is the mass of the matching peaks. The intercept and slope coefficients of this function can be determined using linear regression.
If only one matching peak was found or the mass range enclosed by the matching masses was small (e.g. less than 200Da), as a remedy one can fix: •  To correct the experimental masses m exp we used Equation 5 for the absolute differences ∆ A of matching peaks and Equation 4 in case of relative differences ∆ R .
The difference between theoretical and measured masses is called a mass measurement error MME, while the alignment of m exp on m theo an internal calibration [23,54,61].

Determining ubiquitous masses and their filtering
To determine the abundant masses we computed two histograms for each data set. The origin in the first histogram is where l = (max(M) -x 0 ) mod h. Formally the histogram of counts f is given by [62] where n represented the number of masses in M. If a bin had more counts than a given threshold, the average mass of all peaks in the bin was computed. In the case of two adjacent or overlapping bins B 1 , B 2 with a significant number of counts c, we first computed a weighted average of the bin midpoints using the number of counts in each bin as weight where m 1 and m 2 are the bin midpoints. Afterwards, the average mass of all peaks in the range m ± h/2 was computed. All peaks with mass m ∈ [ ± h/2] were subsequently removed from the data set. Using two overlapping histograms allows the detection of clusters that are scattered over two adjacent bins in one of the histograms. Different ways to determine ubiquitous masses were used and reported by Levender et al. [40] and Kreitler [63].

Standard internal calibration -Alignment to a precompiled list of calibration masses
Instead of using a predefined list of calibration masses, we chose the calibration masses adaptively. The calibration list consisted of ubiquitous masses determined for the data set (cf. Determining ubiquitous masses). Some of the peaks in the list of ubiquitous masses could be assigned to tryptic autolysis products.
These matches were used to calibrate the abundant masses. The peak-lists in the data set were then aligned to the calibrated list of ubiquitous masses.

Filtering of ubiquitous masses prior to database search
We removed ubiquitous masses that occurred in more than 7.7% of peak-lists [39,40]. Filtering of ubiquitous masses was performed on a calibrated set of peak-lists. As a result, we could use a small bandwidth of h = 0.2Da (Equation 6) to determine ubiquitous masses. Next, we checked which of them can be assigned with a significant Probability Based Mascot Score (PBMS) to a sequence database entry and subsequently removed these masses from the filtering list. Abundant masses assigned to a database entry usually result from proteins multiply detected on a 2D-gel. The multiple identification is due to different localisation of the protein on the 2D-gel caused by: protein modifications (phosphorylation, glycosylation), different splice variants or by partial protein degradation. Finally, we removed all peaks within the range ± 0.1Da around the ubiquitous masses. Wolski et al. (publication in preparation) defined the distance measure which computes given λ DB (the average peptide cluster distance for a sequence database DB against which the search is performed, e.g. λ DB = 1.000495) the deviation of a peptide mass difference |m i -m j | from the closest monoisotopic mass predicted by the PM-rule [48]. If there was a linear dependence between |m i -m j | and d λ (m i , m j ), then it was caused by the slope of the MME. If we computed all differences |m j -m i | and d λ (m i , m j ) for peak pairs m i , m j with |m i , m j | < 1400, we could determine the slope coefficient c 1 using linear regression, while fixing the intercept to zero [64]. In order to make the prediction robust against e.g. non-peptide peaks, we used a robust linear regression [65]. We removed the slope by multiplying each mass m i in the peak-list by (1 -c 1 ). Next, we identified the intercept, which was the average of the distance d λ (m i , 0), and corrected for it.

External calibration
In order to model higher order systematic changes of mass dependent differences ∆ of experimental m exp and reference masses m theo , the measurements must be evenly distributed over the whole measurement range [37,66]. To model the dependence ∆ ∝ m we used a cubic smoothing spline function [67,68], given by ∆ = f(m) + ε i , where f is a smooth function, and ε i ~ N(0, σ 2 ). In our study, we used an implementation of the smoothing spline function, provided by B.D. Ripley and Martin Mächler (based on Fortran code of T. Hastie and R. Tibshirani) as part of the R-stats package. Other non-parametric regression methods like local polynomial regression [69] generated similar results for all types of instruments used in this study.
To obtain equidistantly spaced measurements of known masses, External calibration was employed. Some sample spots on the sample support are dedicated to calibration only. Calibration samples, of polymer mixtures [36], which yield equidistant peaks were used to precisely estimate the mass-dependent difference function.

Similarity/quality measures for internal calibration
Peak-lists can be easily aligned if they contain many matching peaks and the masses of these peaks span a wide mass range. The alignment of a peak-list pair (X, Y) fails if no matching peaks are found. We described these properties mathematically by the following similarity measure: where n represented the number of matches, while m i and m j were the masses of matching peaks. This measure computed the sum of all mass differences of the matching peaks. The power p could be used to weight the large differences stronger.

Alignment of a set of peak-list using a Minimum Spanning Tree
To align a whole data-set to a single peak-list and to align the peak-lists with the highest similarity given by Equation 10, we computed for all peak-lists pairs a distance matrix D by casting the similarities into dissimilarities. This distance matrix can be represented by a complete, weighted graph G, where the vertices V correspond to peak-lists and the edges are weighted with the pairwise dissimilarity. To connect all vertices in the graph G with edges e of maximal similarity, the Dijkstra-Prim algorithm for finding the Minimum Spanning Tree(MST) [50] was implemented. We present here a modified version of this algorithm (see Figure 8). The algorithm was modified with respect to the starting conditions. As a starting-vertex s we chose a vertex incident to an edge of smallest distance. In addition to the MST tree T, the algorithm returns also a list of calibration coefficients C, which align all peak-lists V in the data set to the starting vertex (peak-list) s, and a list with connection weights W. We applied the calibration constants in C w to align all peak-lists to the peak-list s.
• TOF -Time of Flight.