Identifying biological concepts from a protein-related corpus with a probabilistic topic model

Background Biomedical literature, e.g., MEDLINE, contains a wealth of knowledge regarding functions of proteins. Major recurring biological concepts within such text corpora represent the domains of this body of knowledge. The goal of this research is to identify the major biological topics/concepts from a corpus of protein-related MEDLINE© titles and abstracts by applying a probabilistic topic model. Results The latent Dirichlet allocation (LDA) model was applied to the corpus. Based on the Bayesian model selection, 300 major topics were extracted from the corpus. The majority of identified topics/concepts was found to be semantically coherent and most represented biological objects or concepts. The identified topics/concepts were further mapped to the controlled vocabulary of the Gene Ontology (GO) terms based on mutual information. Conclusion The major and recurring biological concepts within a collection of MEDLINE documents can be extracted by the LDA model. The identified topics/concepts provide parsimonious and semantically-enriched representation of the texts in a semantic space with reduced dimensionality and can be used to index text.


Background
The microarray technology provides a way of simultane-ously measuring transcript abundances of 10 3 -10 5 genes from one or more cell or tissue samples. A microarray, also known as a gene chip, has well defined regions that each consists of immobilized sequences of DNA, which each is unique to a specific gene. These regions are referred to as probes [1]. When fluorophore labeled cDNA, referred to as targets, obtained by reverse transcription of mRNA extracted from the samples of interest is let to hybridize to the probes for a few hours, each region on the microarray will specifically bind a certain amount of hybridized DNA unique to the corresponding gene. Depending on if a twochannel or single-channel microarray platform is used, either several and differentially labeled targets are hybridized to the same array, or different targets are each hybridized to separate arrays using identical labels. Next, the array is scanned at different wavelengths to excite the fluorescent molecules using a light source, for instance a laser. Shortly after the fluorophores have been excited they emit photons, which are registered and quantified in each position by the scanner, which results in a high-resolution digitized image for each channel. Using image analysis methods, the pixels that belong to the regions that contain the probes are identified and averaged, and an estimate of the transcript abundance for each gene is obtained.
Since these estimates are obtained from a complex measurement process of several steps, it is likely that the observed signals contain not only measurement noise, but also systematic variations of different kinds [2].
In this report, we show the existence of a channel-specific bias introduced by the scanner and most likely its detector parts. Our results indicate that the image analysis may also contribute with a small bias. The effects channel-specific biases have on the downstream microarray analysis are many [2,3]. We suggest a scan protocol and a model that will allow us to estimate the biases and calibrate the observed signals accordingly. The result will be that the intensity dependent effects are removed, but also that the effective dynamical range of the scanner is increased several times.

General model
Consider a microarray experiment involving genes i = 1 ,..., I from RNA extracts c = 1 ,..., C. In single-channel microarrays each array measures the gene expression levels in one RNA extract, whereas in two-color microarrays each array measures two RNA extracts, one in each channel. We will refer to each set of signals from each RNA extract as channels. Let µ c,i be the true gene expression (transcription) level of gene i in channel c. Ideally, statistical analysis can then be done on these quantities. For instance, by comparing the relative abundances in two channels, that is r i = µ 1,i /µ 2,i for all genes i, it is possible to identify genes that are significantly differentially expressed (r i ≠ 1). However, in reality we do not observe the true expression levels, but only the quantified spot intensities y c,i . The general relation between the observed and the true expression levels can be written as (1) where f c (·) is an unknown channel-specific function, which we refer to as the measurement function, that includes all steps in the microarray process. Moreover, we assume independent intensity dependent error terms ε c,i such that E[ε c,i ] = 0. Because we want to do inference based on µ c,i , it must be possible to find the inverse of f c (·), which (at least in theory) is possible if it is strictly increasing.
To be able to find the form of f c (·), high quality calibration data from several stages along the microarray process is required. Here we will consider a simpler case. Split the overall measurement function into two parts. The first part x c,i = g c (µ c,i ) models, in channel c, the amount of light from spot i that enters the photomultiplier tube (PMT) [4] as a function of the transcription level of clone i. The second part, which is studied in this report, is y c,i = h c (x c,i ) and models the observed signal as a function of the amount of photons in channel c and spot i that enters the PMT. That is, it captures the characteristics of the scanner's light detector, but also of the image analysis methods. We want to emphasize that the light from one spot does not necessarily originate solely from the fluorescent molecules that are attached to the hybridized target DNA. Light from other sources such as cross-hybridized target, intrinsic fluorescence from spot buffer, and scatter light may also contribute with photons of similar wavelengths.
Next we will show that h c (·) is almost perfectly affine. This measurement function also depends on the scanner settings, especially the scanner sensitivity, which is indicated below with the super index (k). In other words, where for each fix scanner setting k, and are channel-specific bias and scale parameters, respectively. Assume that x c,i is fix for all PMT voltages.
Note that the above relationship is not necessarily linear.
Here we refer to linear in the strict sense where a c = 0 so that the output is proportional to the input. Lack of linearity has important implications on downstream analysis. For instance, when spotted as well as in-situ synthesized microarrays are used it is common to do statistical analysis on the log-ratios M i = log 2 (y R,i /y G,i ) and the log-intensi- for all genes i [5], where we for convenience have denoted the two channels to be compared by R and G although such comparisons are not limited to within-array measurements. One of the rationales for this bijective transform is that under ideal conditions the main measure of interest, the fold change, is contained in one variable only. However, a channel specific bias introduced by f c (·) will introduce a so called intensity dependent bias in the observed log-ratios. Commonly observed intensity dependent effects in the log-ratios [6] can partly be explained by the fact that the logarithm is taken on affinely transformed signals [2,3,7].

Constrained model
The model in equation (2) is not identifiable. We address this problem as follows. Consider the case where the same array has been scanned at K different PMT settings. Let be the vector of the K quantified signals for gene i and channel c. In the noise-free case it follows from (2) that lies on the line L(a c , b c ) in K , which has direction and goes through the point . The 2K parameters of a c and b c are not identifiable, since L has only 2K -2 degrees of freedom. In fact, any transformation b c ← k·b c and a c ← a c + l·b c , where k and l are scalars, will leave L intact. In this paper, we make a c and b c identifiable by choosing k and l so that = 1 and a c is the point on L closest to the (diagonal) line L' = {e c (1,..., 1); e c ∈ }. The choice of a c can be motivated by looking at observed data.
By inspection, we observe that the bias in Model (2) is not varying much when the PMT gain is changed. To demonstrate this, have been plotted for each of the six possible PMT pairs in Figure 1. First, the close fit of the lines to data is evidence that the scanner is linear in its dynamical range. Second, all lines go through approximately the same point, lets call it (e c , e c ), suggesting that there is a common PMT-independent bias e c . More precisely, we split the bias term into two parts, one dependent and one independent on the PMT gain according to Affine relationship between quantified fluorescent intensities and concentration of fluorophores Figure 1 Affine relationship between quantified fluorescent intensities and concentration of fluorophores. Scanning the arrays at different PMT gains indicates that there is an affine relationship between quantified fluorescent intensities and concentration of fluorophores. Left: Observed signals in the green channel for PMT pairs (800, 500), (700, 500), (800, 600), (600, 500), (700, 600), and (800, 700) are shown in green, red, cyan, black, blue and magenta, respectively. An affine model is estimated for each pair and displayed as a line. Data points at the very top are saturated and ignored. Right: A zoom-in of the same data. For each pair the estimated , which corresponds to the true origin, has been highlighted with a circle. All lines seem to intersect at the same point on the line. Shown are signals from the green channel on Array A quantified by GenePix from Axon scanner images.  (4) is the same as finding the shortest distance between the line L and the diagonal line, which is also the distance between the two points a c and e c . From this geometrical interpretation it is also clear that in order for the parameters to be uniquely identifiable the line L must not be parallel to the diagonal line, that is, must be different from for some k.
A robust estimate of L was proposed in [2], using iteratively re-weighted principal component analysis (IWPCA). This estimate of L, together with the above parametrization of a c and b c , give us estimates and of all 2K -2 parameters of a c and b c , as well as an estimate of e c .
Let us illustrate the parametrization and estimation procedure for K = 2. Since two (non-parallel) lines will always intersect, constraint (4) degenerates to the assumption that d c = 0 or, equivalently, that = e c In the noise-free case the line L is described by

Results
This analysis was done with eight arrays (A-H) that were scanned on two different scanners at four different PMT settings. Two different image analysis applications were used. See Methods for details.

Parameter estimates
For every combination of array, scanner and signal quantification method (image analysis or raw pixel intensities), we estimated the parameters a c (including e c and d c ), and b c in Model (2)-(4) for both channels (see Methods). To get a better understanding of the properties of the estimates, we used a non-parametric bootstrap approach to obtain not only bias corrected estimates, but also their standard deviations. Data was resampled over gene index in a way such that the same bootstrap data sets were used whenever pairwise comparison was done, e.g. when comparing bias estimates in red and green channels. For Gene-Pix and Spot quantified signals a bootstrap sample of size 100 was used.
For the estimates based on the raw pixel intensities a different approach was taken. Because the number of pixels for one scan is about 10 7 (per channel) and we had four scans, our computer system limited us to estimate the model based on a subset of 10 6 pixel intensities. This was done for 100 random subsets and the mean and standard deviation of the parameter estimates were calculated,  Table 4.

Comparison of arrays
The bias estimates for all bootstrap replicates in Tables 2  &3 have

Comparison of scanners
For the two scanners, we found that the estimated bias based on signals obtained by the Agilent scanner are consequently higher than the estimates from the Axon scanner. The box plots of their differences in the common bias e c (for each bootstrap sample) between the Agilent and the Axon scanner in Figure 3 confirm this divergence. See also Table 4. This significant difference could be an effect of scan order, that is, all arrays were first scanned on the Agi-lent scanner and then on the Axon scanner. The arrays in hand were part of a much bigger project based solely on Agilent scanned data. To keep a consistent scan protocol and to minimize bleaching, we could not balance the experimental design by letting some arrays be scanned in the reverse order. Instead, to test for scan order trends we scanned one array first on Agilent (H-1), then on Axon (H-1) and then again on Agilent (H-2). No apparent trend was found.

Comparison of image analysis methods
Estimates of the common bias e c based on GenePix quantified signals are consistently greater than the corresponding ones based on Spot signals, cf. Tables 2,3,4. The box plots in Figure 4 show differences in estimates of the common bias (for each bootstrap sample) between Gene-Pix and Spot. The difference may be explained by the fact that the two applications use different spot segmentation algorithms [8,9]. Because the concentration of fluorophores is not homogeneous across a spot, the result is that the distribution of pixel intensities will vary with the segmentation method. This effect can be more profound for spots with strong donut effects. Robust estimates such as the median pixel value will to some extent protect against this variance, but not completely. It has been suggested [10] that the median of (pixel) ratios is a better estimate of the ratio of hybridized cDNA than the ratio of median (pixels). However, the former requires that the images are perfectly aligned with respect to shift, rotation, shear and so on. Also, it applies exclusively to two sample comparisons. Because of this, we do not believe that pixel-ratio signals are useful in practice.

Pixel-based estimates
To better understand the underlying reasons for the observed channel biases, the proposed affine model was also applied to pixel intensities (instead of spot signals). The estimated biases for the two channels for different arrays using IWPCA based on pixel values are shown in Tables 2 &3. Except for the green channel in the second scan round of Array H, the pixel-based estimates are consistently higher than the estimates based on GenePix and the Spot foreground signals. As noted above, pixel-based estimates are very sensitive to image distortions. This is  especially a concern for the Agilent scanner since it reloads the arrays between subsequent scans. To investigate the effect of image distortion, we did a test where a person with experience in microarray analysis was asked to subjectively rank how badly aligned the four images in the red channel with different PMT gains from the Agilent scanner Bias estimated for each array and channel on both scanners

Axon4000A and Spot
were for each of the (unlabeled) nine arrays. The person rated the images from Arrays A, B, D, and H-1 to be "extremely" misaligned. The images from Array E were considered to be "quite" misaligned, and the images from Array C to be "slightly" misaligned. For the rest of the arrays the images were considered to be aligned (less than a pixel off). This is perfectly in line with the discrepancies in Table 2, which confirms our hypothesis. Another disadvantages with pixel-based methods is that they are extremely memory and time consuming. For instance, estimation with 10 6 pixels is approximately 50 times slower than with 55,488 signals.

Comparison of channels
As Figure 5 shows, the common bias e c is greater in the green channel than in the red channel, especially for GenePix quantified signals, when estimated based on data from the Axon scanner. For the Agilent scanner this trend is less clear although the Spot quantified signals seem to give higher bias in the green channel than in the red channel. Furthermore, the biases in the red and the green channels are stable between arrays, which give further evidence to our hypothesis that the bias originates from the scanner (and/or the image analysis methods).

Deviation in bias estimates between PMT gains
In Figure 6 the distribution of the "bias residuals" are depicted for different scans k and channels c, for each separate array, but also for all arrays together, and for both scanners and both image analysis methods. Most apparent is that the estimates based on signals from the Axon scanner and especially those quantified by the Spot software are greater than for the others, cf. Tables 2 &3. The reason for this difference is not clear to us. For some arrays the estimates from the red and the green channels are strongly correlated, but it is not clear to us when this Figure 3 Comparison of bias estimates between scanners. Differences in the common biases e c between the Agilent and the Axon scanners for each separate array and for all arrays together. The bias is significantly larger for Agilent, regardless of channel (top and bottom) and image analysis method (left and right). occurs. Although not in general, for some combinations of scanner and image analysis method, there is a trend in the PMT order (or possibly scan order). Again, we do not know why. To summarize, we have by means of exploratory data analysis (not shown) tried to understand what sometimes looks like patterns in the :s, but we found no apparent relationships. However, systematic effects indicate that may be modeled further.

Calibration
When data was calibrated according to the backward transformation in (8)-(10) estimates (up to a scale factor) of all x c,i :s were obtained. Since we do not know the true values we can not verify the estimates directly. However, partly we can do it indirectly by looking for remaining sys-tematic effects in the log-ratios, but also by comparing the empirical densities of the calibrated scans. For a detailed study on systematic effects introduced by affine transformations, see [2]. For instance, the amount of intensity-dependent curvature in the log-ratios is related to the bias and the relative scale factor via the product assuming ||d c || = 0. To demonstrate this relationship, we have for different PMT pairs compared the within-channel log-ratios and log-intensities   Figure 7. The corresponding plot for the backward transformed signals is shown to the very right. For each of the six data clouds, the curvature, but also the overall bias, in the log-ratios is removed. To further underline the effect that a channel-specific bias has, we have calculated the logratios for the bias-subtracted signals (no rescaling), which makes Model (2) linear. As seen in the middle scatter plot, the curvature introduced by the bias and the logarithm is removed. The overall bias in the log-ratios which remains is and is removed when the signals are rescaled. It is not correct to shift only the log-ratios towards zero, because then the log-intensities will be incorrect.

Comparison of bias estimates between image analysis methods
The various M versus A plots become very similar and so do the four empirical density functions of the signals as seen in Figure 8. The small bumps at high intensities are due to the saturated signals, cf. Figure 7.

Extended dynamical range
For the Agilent scanner the effective scale parameters / were estimated to be in the order of approximately 1 Comparison of bias estimates between channels Figure 5 Comparison of bias estimates between channels. Differences in the common biases e c between the red and the green channel for each separate array and for all arrays together. At the top is data from the Agilent scanner and at the bottom from the Axon scanner. Data in the left column by GenePix and Data in the right column was quantified by Spot. For data from the Axon scanner the common bias is greater in the green channel than the red channel, especially for GenePix quantified signals. For the Agilent scanner this trend is less clear although the Spot quantified signals clearly seem to have a higher bias in the green than the red channel.

Axon4000A and Spot
Bias residual estimates for all arrays Figure 6 Bias residual estimates for all arrays. Estimated bias residuals for each array and for all arrays together from the Agilent scanner (left column) and the Axon scanner (right column). For each array the distribution of the four of are shown in increasing PMT order. The top four graphs are for the red channel and the bottom four are for the green channel. For each channel the top two are estimates based on signals quantified by GenePix and the bottom two on signals quantified by Spot.  Table 1. Thus, the calibration method extends the effective dynamical range, with preserving linearity, by a factor of 3.5 for signals from the Agilent scanner and a factor of 40 for signals from the Axon scanner.

Sources of the bias
Because bias introduced before the PMT would be amplified differently at different gains, we suspect that the observed bias is due to the scanner and most likely its detector parts such as the analogue-to-digital converter (ADC) after the PMT, but possibly also due to the image analysis method. The observed differences between the channels can be explained by the fact that there is one PMT and one ADC per channel, which may have slightly different properties. Although there are differences in bias between the two scanners, they are still of the same order, which we find remarkable. Another lab with a GenePix scanner reported biases also around 15-20 (personal communication). A possible reason for this is that the scanners consist of similar parts.
Log-ratio and log-intensity plots of raw, translated, and calibrate signals Figure 7 Log-ratio and log-intensity plots of raw, translated, and calibrate signals. The affine transformation gives curvature in the M versus A plots, which is corrected for by the affine normalization method. The three scatter plots show the within-channel log-ratio versus the log-intensity for each of the six PMT pairs based on the same data as in Figure 1

Other estimates
To rule out the obvious situation where all pixel intensities are biased we compared the above estimates with the minimum pixel intensities. For example, for Array A (scanned on Axon and analyzed with GenePix Pro), the minimum pixel intensities in the red channel were 9, 0, 8, and 9 for PMT 500, 600, 700 and 800 volts, respectively. In the green channel the minimum pixel intensity is 0 for all scans. It is not useful to use the minimum spot signals, , either. For example, for the above scan the average minimum signal across all scans in the red channel is 19.8 (median 19.5, std. dev. 0.96), but in the green channel it is 34.8 (median 28.0, std. dev. 19.6), cf. Table 3.

On background subtraction
If the scanner is the main source for the observed bias, then the background estimates should be affected by this bias as well and subtracting the background from the foreground estimates will therefore not only correct for physical background noise from the array itself, but also for the scanner bias. The strong intensity dependent effects of the log-ratios that are due to the bias, are much less apparent if we apply background subtraction (not shown), giving more evidence to our hypothesis that the observed systematic effects originate from the scanner. Thus doing background correction might correct for the bias, but it will also introduce more noise at any given intensity. Also, for the data set in hand background subtraction results in 4050 (7.3%), 6237 (11%), 7015 (13%) and 7349 (13%) negative values (in either channel), respectively, whereas bias subtraction results in no negative values. If we assume that the noise is additive such that the background is added to the foreground signals, then for probes with few or no fluorescent molecules the true foreground signal should be close or identical to the true background signal. As both are estimates, approximately half of the foreground signals for non-signal spots are less or equal to the corresponding background signals. Thus, about half of such spots results in negative signals. However, the different numbers of negative signals for different PMT voltages suggest that this can not be the full explanation. One reason could be that the background estimates are likely to be biased [9]. An error model that incorporates different noise sources, but also different scan parameters, might give some answers to this problem. Some models in this context have already been suggested [3,7,11], as well as models based on empirical Bayesian methods [12]. Another way to put it is that the background estimate is local and based on individual spots/pixels whereas the bias estimate is global, that is, there is one estimate for the whole array (although local estimation of bias is possible). Therefore, the background subtracted intensity estimates are noisier, resulting in more negative estimates for low intensity spots.
The problem of non-positive estimates, but also high variance close to zero, are limitations of the logarithmic transform and alternatives such as the generalized logarithmic transform etc. have been suggested [7,13,14].

Photo bleaching
We estimated the red dye (Cy5™) to bleach about 2% and the green dye (Cy3™) about 1% in a typical microarray experiment (not shown). Because the amount of bleaching is relatively small, but also because it is a very complex phenomenon, we decided to not try to incorporate it in the above model. Some of the systematic variation seen in the bias estimates for the different PMT settings may be due to bleaching.

Signal density normalization
As the results show, the empirical distributions of signals match each other remarkably well after calibration. It is interesting to compare this method with the quantile normalization methods proposed by [15][16][17]. The latter is based on the "statistical" assumption that the signals in all channels (scans) should be equal whereas the former is based on a "physical" assumption that the signals should be linear in the dynamical range. For a further discussion on this see [2].

Incremental robust estimates
It turned out to be infeasible to estimate the model parameters based on all pixel intensities, which limited us to use only on a 10% subset of data. As argued above, pixel-based estimates are not reliable and therefore not of interest. However, for spot-based estimates the same limitations may apply as larger data sets are made available. We wish to overcome such memory constraints. For this reason, we investigate the possibility to use (approximative) incremental re-weighted PCA methods [18,19].

Related work
Another method that combines multiple scans is the masliner (Microarray Spot LINEar Regression) algorithm [20]. It works by combining one low-PMT scan and one high-PMT scan into a new virtual scan. If a signal in the high-PMT scan is within a specified linear range its value is used, otherwise the corresponding signal from the lowintensity range is used after being transformed affinely to fit the high-PMT scan. To combine three or more scans, the new virtual scan can be combined with another PMT scan and so on. The result is that the effective dynamical range is extended. However, there are several unnecessary drawbacks. First, although several observations of the same spot concentration exist, which all may be within the dynamical range of the scanner, only one observation is used. Statistically, the average (calibrated) scan would be a more precise estimate. Second, since the scans are combined pairwise the estimate of the affine relationship between the scans is less robust. Third, although a sensitivity discussion is carried out in the supplementary materials, masliner fits the affine models in a non-robust fashion (in L 2 ). Also, classical linear regression is used, which assumes no error in the explanatory variable. Since masliner makes the signals from different PMT settings proportional to each other it will indeed remove for instance curvature in within-channel M versus A scatter plots. However, masliner does not model the possibility of a PMT-independent bias and will therefore not correct for it. We believe this is the reason why the authors observe a "curvilinear effect" [ [20], supplementary material]. For these reasons, we believe that the robust multiscan calibration method presented in this paper is superior to the masliner algorithm and should be used instead.

Conclusions
By scanning the same microarray at various PMT settings we have shown that there exists a bias in the measurement of the concentration of fluorescent molecules in the spots on the microarray. Our analysis indicates that this bias is mainly due to the scanner, but also due to the image analysis methods. By using a constrained affine model for the relationship between the obtained fluorescent intensities and fluorophore concentrations in the spots, we have been able to estimate the aforementioned bias. With estimates of the bias and scale parameters in each channel back transformation gave estimates of the amount (up to a scale factor) of photons from each spot that enters the PMT. Although not all photons originate solely from fluorophores in the target DNA, this is still a far better estimate of the amount of hybridized target DNA in each spot than the corresponding signal quantified by the scanner and the image analysis.
Before calibration, our data show a strong intensity dependent effect in the log-ratios, whereas after calibration there is no apparent intensity dependent trend. Furthermore, the distributions of signals from subsequent PMT scans are almost identical after calibration. In addition, the signal-to-noise ratio is increased with multiple scans. Finally, scanning at both low and high PMT settings extends the dynamical range of data, which gives higher resolution at low intensities without having to pay the price of saturated signals.
The proposed method can be applied to other microarray technologies such as single-channel oligonucleotide arrays or nylon arrays, and possibly to other gene expression technologies such as quantitative real-time polymerase chain reaction (QRT-PCR).
To conclude, we suggest that hybridized microarrays are scanned at two (preferably more) PMT gain levels to identify channel dependent bias terms. Knowing the exact PMT settings is not important, but the larger the differences are, the more precise the estimates will be. We recommend that the scans are done in decreasing PMTgain order (although we did not do so here). Given estimates, data can then be calibrated easily. For practical reasons it might, however, be sufficient to estimate bias terms for a specific scanner once and then use estimates for calibration of subsequent microarrays. The small interarray variation observed for channel specific bias in our data suggests that this would be possible. On the other hand, without multiple scans, afore mentioned increase in signal-to-noise and dynamical range will be lost. Also, not investigated within the scope of this study, bias terms for a specific scanner might change over time. For these reasons, we suggest that microarrays are scanned multiple times.
For two-channel microarrays, after calibrating each channel separately, a similar strategy can be applied once more to bring differently labeled channels to the same scale as suggested in [2]. This would rely on the assumption that the amounts of hybridized DNA in all channels are approximately equal for the majority of the spots, which in turn is based on the commonly used assumption that most genes are non-differentially expressed. This also applies to normalization between arrays.
All necessary methods are made available in a free R package named aroma [21]. A typical usage is calibrateMultiscan(rg) where rg is the object containing the red and green signals. In addition, we are currently implementing the methods as a plug-in module for the BASE system [22].

Arrays and hybridization
The analysis was based on eight different hybridizations of spotted oligonucleotide microarrays (A-H  Instruments) at PMT gains 600, 700, 800, and 500 volts (in that order), except for Array A, which was scanned at 700, 800, 500 and 600 volts, and Array H, which was scanned at 600, 400, 500 and 700 volts. Thus, the images obtained by the Axon scanner were bleached more than the preceding ones obtained by the Agilent scanner. For both scanners the power of the 532 nm and the 635 nm lasers was set to 100% and the scan resolution to 10 µm/pixel. Moreover, a one-pass (both channels scanned simultaneously) and one-sample-per-pixel ("lines to average" equals one) procedure was used. The Agilent scanner has a special loading mechanism for microarrays, which allows automatic scanning of subsequent arrays without human intervention. However, due to limitations in the software or the scanner, each batch of arrays can only be scanned at a single PMT gain. To scan at more PMT gains with the Agilent scanner, it was therefore necessary to eject and reload the arrays between different settings, which means that the alignment between the scanned images may not be perfect. Contrary, for the Axon scanner the arrays were put in the scanner one by one, then scanned at all PMT settings without being moved.

Image analysis -spot segmentation and registration
To quantify the foreground and the background signals, the scanned images (65536 gray scales and approximately 2000-by-5600 pixels) were analyzed using both the Axon GenePix Pro v4.1.1 software (Axon Instruments) and the Spot v2 software [8,24]. We first analyzed each image with GenePix. For each of them, the grid and spot positions were manually set and then the alignment was optimized by GenePix. These positions were then re-entered and reoptimized by Spot with visual inspection to verify the correctness. Moreover, for each individual scan the image analysis software was let to find the optimal spot segmentation. Thus, what is defined as a foreground pixel may vary with PMT setting although the images are from the same array. We decided on this schema for various reasons. The first reason was that the Agilent arrays are loaded and unloaded between subsequent scans and therefore require a separate spot segmentation. To be able to compare the results from the Axon and the Agilent scanner we choose the same procedure for the images scanned on the Axon scanner, even though, the optimized segmentation for the strongest image could have been reused. We further believe that this allows us compare Spot and GenePix more fairly.
For both Spot and GenePix the median spot pixel intensity was used as foreground signal. Background estimates were not considered in this analysis. No spot signals were discarded.

Calibration
Given estimates of and data can be calibrated using backward transformation. Let be the backward transformed observed signal and the rescaled error terms, respectively. The affine Model (2)  is independent of PMT setting, that is .
Assuming independent noise terms, the variance of the sample mean decrease with K as In summary, we obtain consistent estimates (up to a multiplicative constant) of all x c,i with increasing I and K.
Finally, signals that are saturated by the scanner have to be excluded before calculating the average. If the quantified signal for a spot happens to be saturated in all scans, then that spot is marked as saturated, which still may be informative when compared to other non-saturated signals.

Data analysis
All further analysis was carried out using R [25,26] and the aroma package (f.k.a. com.braju.sma) [21]. All methods used can be found in the latter.