- Research article
- Open Access
Segmentation and intensity estimation for microarray images with saturated pixels
- Yan Yang^{1}Email author,
- Phillip Stafford^{2} and
- YoonJoo Kim^{3}
https://doi.org/10.1186/1471-2105-12-462
© Yang et al; licensee BioMed Central Ltd. 2011
- Received: 11 August 2011
- Accepted: 30 November 2011
- Published: 30 November 2011
Abstract
Background
Microarray image analysis processes scanned digital images of hybridized arrays to produce the input spot-level data for downstream analysis, so it can have a potentially large impact on those and subsequent analysis. Signal saturation is an optical effect that occurs when some pixel values for highly expressed genes or peptides exceed the upper detection threshold of the scanner software (2^{16} - 1 = 65, 535 for 16-bit images). In practice, spots with a sizable number of saturated pixels are often flagged and discarded. Alternatively, the saturated values are used without adjustments for estimating spot intensities. The resulting expression data tend to be biased downwards and can distort high-level analysis that relies on these data. Hence, it is crucial to effectively correct for signal saturation.
Results
We developed a flexible mixture model-based segmentation and spot intensity estimation procedure that accounts for saturated pixels by incorporating a censored component in the mixture model. As demonstrated with biological data and simulation, our method extends the dynamic range of expression data beyond the saturation threshold and is effective in correcting saturation-induced bias when the lost information is not tremendous. We further illustrate the impact of image processing on downstream classification, showing that the proposed method can increase diagnostic accuracy using data from a lymphoma cancer diagnosis study.
Conclusions
The presented method adjusts for signal saturation at the segmentation stage that identifies a pixel as part of the foreground, background or other. The cluster membership of a pixel can be altered versus treating saturated values as truly observed. Thus, the resulting spot intensity estimates may be more accurate than those obtained from existing methods that correct for saturation based on already segmented data. As a model-based segmentation method, our procedure is able to identify inner holes, fuzzy edges and blank spots that are common in microarray images. The approach is independent of microarray platform and applicable to both single- and dual-channel microarrays.
Keywords
- Gaussian Mixture Model
- Spot Intensity
- Microarray Image
- Signal Saturation
- Saturation Threshold
Background
Microarray technology has been used in many areas of biomedical research and drug development to study the function of thousands of genes in a single experiment. As an important early step in microarray studies, microarray image analysis produces the input spot intensity data to downstream analysis such as classification and identification of differentially regulated genes. Thus, image processing can have profound effects on those and subsequent analysis. Microarray images with saturated hybridization signals are common when the dynamic range of expression of biological quantities is large. To enhance weak signals, a common practice is to increase the photometric gain at scanning. A large gain, however, often causes some pixels for highly expressed genes to exceed the scanner's upper limit of detection. Discarding spots with saturated pixels can fail to detect target genes that are highly and differentially expressed, whereas using saturated values without correction tends to underestimate expression levels and distort high-level analysis [1].
Correcting saturation-induced bias for expression microarrays has triggered much of recent research. Early work involves combining spot intensity data from multiple scans, obtained at different laser power or photomultiplier tube (PMT) settings, into an extended linear range and estimating expression levels beyond the saturation threshold by extrapolation [2–4]. Wit and McClure [5] proposed a maximum likelihood (ML) approach in which censoring was incorporated to account for signal saturation using the mean, median and variance of each spot. Dodd et al. [6] developed a censored Gaussian regression model by exploiting the association between pixel intensities of two channels in dual-dye experiments. Along the line of saturation adjustments by censoring, Ekstrom et al. [7] considered parametric spot shape/profile models for pixel-level data and imputed the values for saturated pixels. A Cauchy distribution was employed in Khondoker et al. [8] to model spot intensity data with outlying observations from multiple scans. To account for saturation, the Cauchy location function was specified to follow the functional form of the mean of a censored Gaussian. Glasbey et al. [9] imputed censored pixel values based on the principal components of uncensored spots. Bayesian hierarchical modeling for handling signal saturation using data from multiple scans was considered by Gupta et al. [10] and Gupta et al. [11].
The use of spot-level data in most previous work is largely motivated by the ready access to such data through standard output files of image processing software. However, signal saturation occurs at individual pixels that form a spot. So using raw pixel values could potentially provide more effective bias adjustments. We base our analysis on pixel-level data in this article. Instead of accounting for saturated pixels in isolation from image segmentation, as has been done in previous work reviewed earlier, we propose to combine model-based segmentation with spot intensity estimation to correct for saturation at the segmentation stage. In mixture model-based clustering of pixels, pixel values are typically assumed to follow a finite mixture of parametric distributions such as Gaussian [12–15]. When a portion of pixel values are saturated, the distribution assumption should be modified to reflect this feature. Consequently, cluster memberships of the pixels may be altered. Thus, accounting for saturation during image segmentation has the potential to improve the accuracy of segmentation, which in turn would lead to more effective spot intensity estimation.
Yang et al. [16] and Li et al. [14] provided excellent reviews on methods for segmenting microarray images without saturation. As a histogram-based segmentation method, mixture model-based clustering of pixels has a few advantages. Not only can it accurately recover irregularly shaped spots, such as commonly seen donut-shaped spots [17], but it can also identify blank spots and spots with fuzzy edges. Fuzzy edges are a bigger concern for saturated spots, since the optical flare caused by extremely bright pixels often distorts the local background estimate. In this article, we propose a model-based image segmentation and spot intensity estimation procedure to correct for signal saturation at the pixel level. A censored Gaussian mixture model (GMM) with no more than three mixture components is developed, in which the number of components is selected based on information criteria. The expectation-maximization (EM) algorithm is carried out for model estimation and implemented in R. Before applying the proposed method, it is necessary to perform automatic gridding (i.e., locating the spots on an array) to provide the data for segmentation. Since the arrays in our data examples were spotted in an orange-crate packing pattern to increase spot density, a hexagonal grid was used. To facilitate high-throughput analysis, we provide Matlab code that extracts the pixel intensity values and the coordinates of the pixels belonging to each spot after automatic gridding is done. Through microarray examples and simulation, we demonstrate that our method extends the dynamic range of measured expression and is effective in correcting saturation-induced bias. We also illustrate the influence of saturation adjustments on modifying clustering results and the impact of image processing on downstream classification. Source code and data are available at http://math.la.asu.edu/~yy/cgmm.html.
The rest of this article is organized as follows. First, the censored GMM is introduced for segmenting microarray images with saturated pixels and estimating spot-level intensities. An EM algorithm is used to estimate the model. Next, we illustrate the proposed method with peptide microarray images from a human Valley Fever diagnosis study and a canine lymphoma diagnosis study, and compare our method with regular GMM-based segmentation and intensity estimation without saturation correction. We conclude with the main findings and comment on future research.
Methods
Microarray image processing involves three main steps: 1) gridding or addressing that finds the exact location of each spot, 2) image segmentation that determines which pixels form the signal and which pixels form the background, and 3) intensity estimation that quantifies the expression level for each spot. The proposed censored GMM focuses on the second and third tasks to correct for signal saturation. The raw data are 16-bit grayscale images stored as TIFF files. The input data for our segmentation and intensity estimation procedure are pixel intensity values belonging to individual spots after automatic gridding is done. In the following we present the microarray studies that produced the images for the data examples, automatic gridding and pixel intensity extraction, and model-based segmentation and spot intensity estimation.
Microarray data
We considered a microarray platform that is still in its infancy but has profound implications for health monitoring and pre-symptomatic disease detection. The immunosignaturing microarray is composed of 10,000 unique, random-sequence peptides that are printed in standard microarray format. The peptide probes detect antibody changes in the serum samples and correlate them with changes in health status. Microarray images were obtained from an experiment on Valley Fever (Coccidioidomycosis) diagnosis in humans and an experiment detecting T-cell and B-cell lymphoma in dogs. Both studies were conducted at the Biodesign Institute of the Arizona State University using the Immunosignaturing Arrays developed by the Center for Innovations in Medicine [18, 19]. Human subjects were consented and de-identified according to IRB Protocol# 0905004024. In the Valley Fever diagnosis study, 60 patients with the disease and 30 healthy controls were examined. The lymphoma diagnosis study examined the serum samples from 21 dogs with either B-cell or T-cell lymphoma and 20 healthy dogs.
The microarrays were spotted in an orange-crate packing pattern using piezo-electric deposition of 10 pL of 1 mg/ml peptide in 20 uM Hepes buffer, 10 uM EDTA, 5 uM TCEP, pH 6.7 (Applied Microarrays, Tempe, AZ). In both experiments, the serum samples were diluted 1:500 in incubation buffer and allowed to incubate on the microarray slide. A secondary antibody pre-labeled with either Alexafluor 647 or Alexafluor 555 (Invitrogen, Carlsbad, CA) was added to the solution to detect the primary antibodies. The slides were washed, dried, and scanned at 645 nm or 550 nm (according to the dye) using an Agilent 'C' laser scanner (Agilent, Santa Clara, CA).
Automatic gridding and pixel intensity extraction
Model-based segmentation and intensity estimation
where 0 ≤ π_{ k } ≤ 1, k = 1, ..., K, are the mixing weights that sum to one, ϕ(·;μ_{ k } , σ_{ k } ) is the normal density function with mean μ_{ k } and standard deviation σ_{ k } , ϕ_{ S } (·; μ_{ K } , σ_{ K } ) is the density function of a right-censored normal at S with mean μ_{ K } and standard deviation σ_{ K } , and θ= (π_{1}, ..., π_{K - 1}, μ_{1}, ..., μ_{ K } , σ_{1}, ..., σ_{ K } ) ^{ T } is the vector containing all parameters. For parameter identification, let μ_{1}< μ_{2} < ... < μ_{ K } . For model-based clustering of pixels and spot intensity estimation, we maximize the log-likelihood function $l\left(\mathit{\theta};\mathbf{y}\right)={\sum}_{i=1}^{n}logf\left({y}_{i};\mathit{\theta}\right)$, where y = (y_{1}, ..., y_{ n } ) ^{ T } .
where ${l}_{0}\left({\pi}_{1},{\pi}_{2};\mathbf{z}\right)={\sum}_{i=1}^{n}\left({z}_{i1}log{\pi}_{1}+{z}_{i2}log{\pi}_{2}+{z}_{i3}log{\pi}_{3}\right)$, ${l}_{1}\left({\mu}_{1},{\sigma}_{1};\mathbf{y},\mathbf{z}\right)={\sum}_{i=1}^{n}{z}_{i1}log\varphi \left({y}_{i};{\mu}_{1},{\sigma}_{1}\right)$, ${l}_{2}\left({\mu}_{2},{\sigma}_{2};\mathbf{y},\mathbf{z}\right)={\sum}_{i=1}^{n}{z}_{i2}log\varphi \left({y}_{i};{\mu}_{2},{\sigma}_{2}\right)$, and ${l}_{3}\left({\mu}_{3},{\sigma}_{3};\mathbf{y},\mathbf{z}\right)={\sum}_{i=1}^{n}{z}_{i3}log\phantom{\rule{0.3em}{0ex}}\left[I\left({y}_{i}<S\right)\varphi \left({y}_{i};{\mu}_{3},{\sigma}_{3}\right)+I\left({y}_{i}=S\right)\left\{1-\Phi \left(\frac{S-{\mu}_{3}}{{\sigma}_{3}}\right)\right\}\right]$; I(A) = 1 if event A occurs and equals 0 otherwise, and Φ(·) is the standard normal cumulative distribution function. It can be seen that the complete-data likelihood is decomposed into four parts: the log-likelihood l_{0} for a multinomial distribution on z, the weighted log-likelihoods l_{1} and l_{2} for two regular normal distributions on y with weights z_{1} and z_{2}, and the weighted log-likelihood l_{3} for a right-censored normal on y with weight z_{3}. The four likelihood parts, each with distinct parameters, can be maximized separately.
For the censored normal component, estimators for μ_{3} and σ_{3} do not have analytical forms and need to be solved iteratively. The Newton-Raphson algorithm [26] for estimating a censored normal model iteratively evaluates the first and second derivatives of the likelihood component l_{3}(μ_{3}, σ_{3}; y, z) with respect to the parameters. The method is implemented in standard software packages for survival analysis, such as the R survreg function in the survival library and the SAS LIFEREG procedure. Our R program implementation of the EM algorithm calls the survreg function in each M step to update the estimates of μ_{3} and σ_{3}.
To start the iterative procedure, we obtain the initial values of the mean and scale parameters using the sample means and standard deviations calculated from appropriate portions of ordered pixel values within a target mask. The initial mixing weights are estimated based on prior knowledge about the typical spot and target mask sizes. In the Valley Fever diagnosis study, the initial value for π_{1} was taken as 0.8 in a two-component mixture and the values for π_{1} and π_{2} were 0.7 and 0.1 in a three-component mixture. Convergence of the algorithm is monitored by evaluating the increase in the likelihood function l(θ; y). At convergence, the ML estimate ${\widehat{\mu}}_{1}$ of μ_{1} is used to quantify the BG noise for a spot. The spot signal is estimated by ${\widehat{\mu}}_{K}$, given that K ≥ 2. Consequently, the BG-corrected spot intensity is given by ${\widehat{\mu}}_{K}-{\widehat{\mu}}_{1}$ for K ≥ 2. For a refined selection of K, one may specify a threshold for the relative difference in BIC below which a simpler model with a smaller K, even if it has a slightly larger BIC value, is preferred over a more complex model. A similar idea was used in Baek et al. [15] to flag spots with low expression levels for K ≤ 2. The threshold value 0.001 for the relative difference in BIC was used in the data examples and the simulation study.
Availability
Source code and sample data are made available at http://math.la.asu.edu/~yy/cgmm.html. Source code includes the R program for implementing the censored GMM-based segmentation and spot intensity estimation, and the Matlab program for extracting pixel intensity values and associated coordinates belonging to individual spots from an automatically gridded image, with either a rectangular grid or a hexagonal grid. Microarray images from the human Valley Fever diagnosis study are accessible through Gene Expression Omnibus Series accession number GSE33899 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE33899).
Results and discussion
In this section we present the results from two data examples and a simulation study. We compare our method with the regular GMM similar to the segmentation method of Li et al. [14] and the fixed circle segmentation implemented in GenePix 6.0 [21]. Microarray images from a Valley Fever diagnosis study in humans were processed to illustrate the capability of the proposed method to enhance the dynamic range of expression data beyond the saturation threshold. A canine lymphoma diagnosis study was used to demonstrate the impact of saturation adjustments at the segmentation stage on downstream classification between healthy and cancer tissue. A simulation study was also conducted to evaluate the selection of K and the performance of the censored GMM in correcting saturation-induced bias.
A human Valley Fever diagnosis study
Microarray images were obtained from a Valley Fever diagnosis study for identifying peptides that more effectively predicted the status of Valley Fever patients than a standard test. Each array consisted of 12 by 4 blocks and each block contained 22 by 22 spots, with 400-500 pixels per target mask. The hybridized arrays were scanned at 100% laser power and 70% PMT voltage, resulting in a few spots saturated at S = 65, 535.
Foreground median and mean intensities for three saturated spots
Spot | Saturated pixels | FG pixels | FG median (mean) | ||||
---|---|---|---|---|---|---|---|
GenePix | CGMM | GMM | GenePix | CGMM | GMM | ||
1 | 34 | 120 | 116 | 114 | 52548 (48565) | 57174 | 55538 |
2 | 60 | 120 | 78 | 60 | 59077 (42119) | 70460 | 65535 |
3 | 18 | 120 | 26 | 140 | 20607 (26909) | 74128 | 24738 |
Background median and mean intensities for three saturated spots
Spot | BG pixels | BG median (mean) | ||||
---|---|---|---|---|---|---|
GenePix | CGMM | GMM | GenePix | CGMM | GMM | |
1 | 556 | 249 | 250 | 5784 (5825) | 5845 | 5857 |
2 | 596 | 284 | 288 | 1619 (1992) | 2458 | 2546 |
3 | 671 | 348 | 349 | 1038 (1107) | 1243 | 1248 |
A canine lymphoma diagnosis study
Peptide microarrays from a canine lymphoma diagnosis study are used to illustrate the impact of saturation correction on downstream class prediction. The original images, obtained by scanning the arrays at 100% laser power and 70% PMT voltage, contained few saturated pixels, because most lymphoma tumors do not elicit a strong antibody response or the response is immunologically repressed. For the purpose of illustration, we censored the original pixel intensity values at artificial saturation thresholds to generate new data sets with desired saturation rates. An additional benefit of artificial censoring was to set a gold standard for comparing classification results, since in this case we know the original, uncensored data. In the analysis we included samples from 14 affected dogs and 7 healthy dogs. Each array consisted of a top sub-array and a bottom sub-array. A sub-array had 16 blocks, each containing 65 by 10 spots with around 400 pixels per target mask.
Median percentage of saturated foreground pixels
Spot | Median % of saturated FG pixels | |
---|---|---|
S _{ 1 } = 1000 | S _{ 2 } = 800 | |
1 | 3.3 | 27.9 |
2 | 4.9 | 50.0 |
3 | 10.9 | 69.4 |
4 | 26.5 | 72.2 |
Misclassification rate based on leave-one-out cross validation
Method | Data | TP | TN | FP | FN | Error rate |
---|---|---|---|---|---|---|
GenePix | Original | 12 | 4 | 3 | 2 | 0.24 |
GMM | Original | 13 | 3 | 4 | 1 | 0.24 |
CGMM | Censored at 1000 | 13 | 3 | 4 | 1 | 0.24 |
GMM | Censored at 1000 | 10 | 2 | 5 | 4 | 0.43 |
CGMM | Censored at 800 | 11 | 3 | 4 | 3 | 0.33 |
GMM | Censored at 800 | 9 | 0 | 7 | 5 | 0.57 |
Simulation
A simulation study was performed to investigate the use of information criteria on mixture component selection and the performance of our optimization algorithm implementation. A K-component GMM, K = 2 or 3 with one component right-censored at S = 65, 535, was used as the true model. At each K, the parameters were chosen to mimic the Valley Fever diagnosis study and allow 10%, 40% or 70% of saturated FG pixels. The number of pixels within a target mask was fixed at 500. 1000 simulated trials were run in R for each of the six settings. Each simulated data set was fitted by both the censored GMM and the regular GMM. The complete, uncensored data were also analyzed by the regular GMM. Akaike information criterion (AIC) and BIC were used to select K in each model.
Percent of times the mixture component was correctly selected
K | % of saturated FG pixels | GMM0 | CGMM | GMM1 |
---|---|---|---|---|
2 | 10 | 100.0 | 100.0 | 99.0 |
40 | 100.0 | 100.0 | 76.3 | |
70 | 100.0 | 100.0 | 82.7 | |
3 | 10 | 100.0 | 100.0 | 100.0 |
40 | 100.0 | 100.0 | 95.0 | |
70 | 100.0 | 99.5 | 68.9 |
Relative bias in the two-component mixture model
Parameter | GMM0 | CGMM | GMM1 | ||||||
---|---|---|---|---|---|---|---|---|---|
10 | 40 | 70 | 10 | 40 | 70 | 10 | 40 | 70 | |
π_{1} = 0.8 | 0.0006 | 0.0007 | 0.0006 | 0.0006 | 0.0006 | 0.0006 | 0.0026 | 0.0075 | 0.0035 |
μ_{1} = 8, 000 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0007 | 0.0006 | 0.0014 |
σ_{1} = 2, 000 | -0.0043 | -0.0043 | -0.0044 | -0.0044 | -0.0044 | -0.0045 | -0.0010 | 0.0032 | 0.0073 |
μ _{2} | -0.0002 | 0.0000 | -0.0001 | -0.0001 | 0.0004 | -0.0019 | -0.0104 | -0.0867 | -0.2220 |
σ _{2} | -0.0079 | -0.0080 | -0.0075 | -0.0070 | -0.0068 | -0.0162 | -0.1121 | -0.3784 | -0.6426 |
Relative bias in the three-component mixture model
Parameter | GMM0 | CGMM | GMM1 | ||||||
---|---|---|---|---|---|---|---|---|---|
10 | 40 | 70 | 10 | 40 | 70 | 10 | 40 | 70 | |
π_{1} = 0.7 | 0.0011 | 0.0010 | 0.0011 | 0.0011 | 0.0011 | 0.0012 | 0.0006 | -0.0002 | -0.0088 |
μ_{1} = 2, 000 | 0.0015 | 0.0015 | 0.0015 | 0.0015 | 0.0015 | 0.0015 | 0.0015 | 0.0014 | 0.0033 |
σ_{1} = 1, 000 | -0.0044 | -0.0045 | -0.0045 | -0.0044 | -0.0045 | -0.0044 | -0.0051 | -0.0077 | -0.0117 |
π_{2} = 0.1 | -0.0022 | -0.0017 | -0.0023 | -0.0032 | -0.0032 | -0.0046 | 0.0176 | 0.1637 | 0.5294 |
μ_{2} = 15, 000 | 0.0008 | 0.0005 | 0.0004 | 0.0003 | -0.0001 | 0.0003 | 0.0117 | 0.1679 | 0.7195 |
σ_{2} = 6, 000 | -0.0262 | -0.0249 | -0.0255 | -0.0276 | -0.0271 | -0.0285 | 0.0100 | 0.3923 | 1.8881 |
μ _{3} | -0.0001 | 0.0000 | -0.0002 | -0.0002 | 0.0002 | -0.0010 | -0.0053 | -0.0384 | -0.1352 |
σ _{3} | -0.0071 | -0.0076 | -0.0071 | -0.0049 | -0.0028 | -0.0033 | -0.1161 | -0.4601 | -0.9294 |
Conclusions
In analysis of expression microarrays, the issue of signal saturation has been frequently neglected, causing downward bias in estimating spot expression levels and potentially distorting high-level analysis. Previous work has focused on saturation correction based on already segmented data, at either the spot or pixel level. In this article, we combine model-based segmentation and spot intensity estimation into an integrated procedure that has the potential to recover or partially recover the lost information on expression levels due to saturated pixels. The procedure models saturated pixels as right censored at the saturation threshold and is implemented in R for high-throughput analysis. As demonstrated in microarray examples and simulation, the proposed method extends the dynamic range of expression data at the high end, is effective in correcting saturation-induced bias at the pixel level, better identifies the cluster memberships of pixels, and has the potential to increase the predictive power for downstream class prediction. As a model-based segmentation method, our procedure can identify inner holes, fuzzy edges and blank spots that are common in microarray images. Although illustrated with single-dye peptide microarrays, the approach is independent of microarray platform and applicable to both single- and dual-channel microarrays. Our method does not need multiple scans of an image to achieve saturation correction as required by some of the existing methods.
Possible extensions to the current work include development of model-based image segmentation procedures that assume robust or other distributions for the foreground and background pixels [8, 15] and also account for signal saturation. The spatial information in a microarray image should also be exploited in different means to more effectively guide the model-based clustering of pixels.
Declarations
Acknowledgements
YY's research was supported by NSF grant DRL-0909630. The authors thank Dr. Mingjie Xu for technical help with Matlab, and two referees and the Associate Editor for valuable comments and suggestions.
Authors’ Affiliations
References
- Hsiao LL, Jensen RV, Yoshida T, Clark KE, Blumenstock JE, Gullans SR: Correcting for signal saturation errors in the analysis of microarray data. BioTechniques 2002, 32: 330–336.PubMedGoogle Scholar
- Dudley AM, Aach J, Steffen MA, Church GM: Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proceedings of the National Academy of Sciences 2002, 99: 7554–7559. 10.1073/pnas.112683499View ArticleGoogle Scholar
- Lyng H, Badiee A, Svendsrud DH, Hovig E, Myklebost O, Stokke T: Profound influence of microarray scanner characteristics on gene expression ratios: Analysis and procedure for correction. BMC Genomics 2004, 5: 10. 10.1186/1471-2164-5-10PubMed CentralView ArticlePubMedGoogle Scholar
- Garcia de la Nava J, van Hijum S, Trelles O: Saturation and quantization reduction in microarray experiments using two scans at different sensitivities. Statistical Applications in Genetics and Molecular Biology 2004, 3: Article 11.Google Scholar
- Wit E, McClure J: Statistical adjustment of signal censoring in gene expression experiments. Bioinformatics 2003, 19: 1055–1060. 10.1093/bioinformatics/btg003View ArticlePubMedGoogle Scholar
- Dodd LE, Korn EL, McShane LM, Chandramouli GVR, Chuang EY: Correcting log ratios for signal saturation in cDNA microarrays. Bioinformatics 2004, 20: 2685–2693. 10.1093/bioinformatics/bth309View ArticlePubMedGoogle Scholar
- Ekstrom CT, Bak S, Kristensen C, Rudemo M: Spot shape modelling and data transformations for microarrays. Bioinformatics 2004, 20: 2270–2278. 10.1093/bioinformatics/bth237View ArticlePubMedGoogle Scholar
- Khondoker MR, Glasbey CA, Worton BJ: Statistical estimation of gene expression using multiple laser scans of microarrays. Bioinformatics 2006, 22: 215–219. 10.1093/bioinformatics/bti790View ArticlePubMedGoogle Scholar
- Glasbey CA, Forster T, Ghazal P: Estimation of expression levels in spotted microarrays with saturated pixels. Statistical Applications in Genetics and Molecular Biology 2007, 6: Article 34.View ArticleGoogle Scholar
- Gupta R, Auvinen P, Thomas A, Arjas E: Bayesian hierarchical model for correcting signal saturation in microarrays using pixel intensities. Statistical Applications in Genetics and Molecular Biology 2006, 5: Article 20.View ArticleGoogle Scholar
- Gupta R, Greco D, Auvinen P, Arjas E: Bayesian integrated modeling of expression data: A case study on RhoG. BMC Bioinformatics 2010, 11: 295. 10.1186/1471-2105-11-295PubMed CentralView ArticlePubMedGoogle Scholar
- Chen Y, Dougherty ER, Bittner ML: Ratio-based decisions and the quantitative analysis of cDNA microarray images. Journal of Biomedical Optics 1997, 2: 364–374. 10.1117/12.281504View ArticlePubMedGoogle Scholar
- Glasbey CA, Ghazal P: Combinatorial image analysis of DNA microarray features. Bioinformatics 2003, 19: 194–203. 10.1093/bioinformatics/19.2.194View ArticlePubMedGoogle Scholar
- Li Q, Fraley C, Bumgarner RE, Yeung KY, Raftery AE: Donuts, scratches and blanks: robust model-based segmentation of microarray images. Bioinformatics 2005, 21: 2875–2882. 10.1093/bioinformatics/bti447View ArticlePubMedGoogle Scholar
- Baek J, Son YS, McLachlan GJ: Segmentation and intensity estimation of microarray images using a gamma-t mixture model. Bioinformatics 2007, 23: 458–465. 10.1093/bioinformatics/btl630View ArticlePubMedGoogle Scholar
- Yang YH, Buckley MJ, Dudoit S, Speed TP: Comparison of methods for image analysis on cDNA microarray data. Journal of Computational and Graphical Statistics 2002, 11: 108–136. 10.1198/106186002317375640View ArticleGoogle Scholar
- Pappaert K, Ottevaere H, Thienpont H, van Hummerlen P, Desmet G: Diffusion limitation: A possible source for the occurrence of doughnut patterns on DNA microarrays. BioTechniques 2006, 41: 609–616. 10.2144/000112293View ArticlePubMedGoogle Scholar
- Legutki JB, Magee DM, Stafford P, Johnston SA: A general method for characterization of humoral immunity induced by a vaccine or infection. Vaccine 2010, 28: 4529–4537. 10.1016/j.vaccine.2010.04.061View ArticlePubMedGoogle Scholar
- Halperin R, Stafford P, Johnston SA: Exploring antibody recognition of sequence space through random-sequence peptide microarrays. Molecular and Cellular Proteomics 2010. [First Published on November 9, 2010 as doi:10.1074/mcp.M110.000786] [First Published on November 9, 2010 as doi:10.1074/mcp.M110.000786]Google Scholar
- Drăghici S: Data Analysis Tools for DNA Microarrays. Chapman & Hall/CRC; 2003.View ArticleGoogle Scholar
- Molecular Devices Corp: GenePix Pro 6.0, User's Guide and Tutorial. Molecular Devices, Corp; 2005.Google Scholar
- McLachlan G, Peel D: Finite Mixture Models. John Wiley & Sons; 2000.View ArticleGoogle Scholar
- Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society 1977, B39: 1–38.Google Scholar
- McLachlan G, Krishnan T: The EM Algorithm and Extensions. Chapman & Hall/CRC; 2008.View ArticleGoogle Scholar
- Yang Y, Simpson DG: Unified computational methods for regression analysis of zero-inflated and bound-inflated data. Computational Statistics and Data Analysis 2010, 54: 1525–1534. 10.1016/j.csda.2009.12.012PubMed CentralView ArticlePubMedGoogle Scholar
- Thisted RA: Elements of Statistical Computing: Numerical Computation. Chapman and Hall/CRC; 1988.Google Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.