Data
Psoriasis is thought to be due to an overly active immune system [10, 11]. To study how the immune response of leukocytes isolated from blood can be affected by drugs that may serve to control autoimmune diseases like psoriasis, blood was drawn from five volunteers under a protocol that had been approved by The Rockefeller University Hospital Institutional Review Board [12].
For each subject, peripheral blood mononuclear cells (PBMCs) were isolated and cultured in six Petri dishes. Four cultures were activated with an anti-CD3/CD28 antibody, two of which were pre-treated with a repressor drug. Two cultures served as controls without drug or activation. For subjects 1 and 2, one of the two sets of control, activated, and pre-treated cultures was analyzed after 6 hrs, the other after 24 hrs. (For subjects 3, 4, and 5, only one time point is available.) All samples were hybridized to Affymetrix HuU95av2 chips.
Artefacts identified on probe-level (CEL) files
Figure 1 displays the six chips obtained from one subject's PBMC sample. This subject was chosen because one of the chips (upper row, centre) exhibits a variety of blemishes, which are discussed below (see Figure 2b): a 'bright spot' in the upper-right corner, a 'dark spot' in the upper centre, 'dark clouds' in the upper and lower right centre, and two 'shadowy circles' reaching beyond the left border. Part of the upper circle is included in the chip portion depicted in Figure 1.
Similar results were obtained for all subjects (data not shown). None of the artefacts would have been detected by visual inspection of the pseudo image (Figure 2a). Even after having seen the filtered image, most blemishes are difficult to identify at best. Interestingly, some chips appear to have a preponderance of specific artefacts, suggesting that at least some of the blemishes are caused by specific environmental factors during hybridization, and providing the first indication for the validity of the proposed method. The chip used as the background in Figure 3 has 'dark clouds' in the upper left corner and, albeit to a lesser degree, in both lower corners. Of the two chips with several smaller artefacts, one had three spots that resemble the 'dark spot' in Figure 2. Only the bright scratch at the bottom of one of the chips could have been detected by mere visual inspection of the chip, although even this chip passed the Rockefeller University's Gene Array Resource Center's quality control.
Average vs. median in the filtering procedure
The proposed filtering process relies on identifying deviations of a probe on one chip from a measure of central tendency for this probe across chips. Thus, if a few chips have high-intensity 'outliers' for one probe, the chips with normal intensities may appear to be negative 'outliers'. One would expect the six-chip filter to be less likely to generate such 'ghosting' artefacts than the three-chip filter. We compared the use of medians vs. arithmetic means as the reference. As we had predicted based on the understanding that errors are more likely to be outliers than white noise, using medians resulted not only in less 'ghosting', but also in fewer isolated cells being considered artefacts and, thereby, in better contrast (Figure 4).
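The 'ghosting' effect can be illustrated with a minimal sketch. The intensities below are hypothetical, and the function `deviations` is an illustrative helper, not the filter implementation itself; it only compares the two choices of reference for a single probe across six chips.

```python
from statistics import mean, median

# Hypothetical log2 intensities of one probe on six chips;
# chips 0 and 1 carry a bright blemish, the other four are clean.
probe = [13.0, 12.6, 9.0, 9.1, 8.9, 9.0]

def deviations(values, reference):
    """Deviation of each chip from a central-tendency reference."""
    ref = reference(values)
    return [round(v - ref, 2) for v in values]

dev_mean = deviations(probe, mean)      # reference pulled upwards by the outliers
dev_median = deviations(probe, median)  # reference stays with the clean chips

print("mean reference:  ", dev_mean)
print("median reference:", dev_median)
```

With the mean as reference, the four clean chips all show deviations below -1 log2 and would appear as 'ghost' dark outliers; with the median, their deviations stay close to zero while the two blemished chips stand out.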
Validation of probe-level artefacts by going back to the pixel-level image
Our method allows us to identify spatially correlated regions that are unlikely to originate from random fluctuations. To demonstrate that the statistical anomalies detected in the pseudo images at the probe level (Figure 2 and Figure 3) are, in fact, physical blemishes, we inspected the corresponding raw image at the pixel level. The regular artefacts seen (shadow, circle, cloud, etc.) are clearly blemishes, even if the precise nature of the physical blemish may not be known. Still, the difference in features between blemishes suggests different causes.
A number of factors are known to cause bright or dark spots in fluorescence micrographs. Dust on the front cover slip will cause a dark, out-of-focus shadow. Common white paper is bleached with strongly fluorescent dyes, so fibres from tissue paper ordinarily used for cleaning cause intense glare. Many organic solvents, detergents, and other chemicals will fluoresce when concentrated, so leftover droplets or condensates will appear as bright regions, regardless of whether they are in front of or behind the focal plane. A crack in the glass would ordinarily be invisible to fluorescence microscopy – except for its ability to accumulate such substances. Glass will normally be coated with substances to prevent the direct binding of fluorophores to it; however, any damage to the fragile coating will cause fluorescent streaks. Illumination with a coherent source such as a laser, as opposed to a broadband source such as a xenon lamp, has specific artefacts such as speckle. In addition, the arrays themselves are manufactured through photolithographic techniques and may contain occasional damage.
Dirt
The visible bright artefact at the bottom-left of Figure 3 is the only blemish in our dataset that did not require 'harshlighting' to be visible. The magnification in Figure 5a shows a structure in an area of 25 × 25 probes. Figure 5b shows the corresponding area in the raw image, clearly exhibiting this artefact to be a piece of debris lying in front of the active array surface in the optical path. While the exact physical nature of this debris is unclear, there can be no doubt that probes highlighted at the bottom of Figure 3b are, in fact, a blemish.
Dark and bright spots
A very 'dark spot' was seen in the lower left corner of Figure 3b. The probe level pseudo image (Figure 6a) shows a dark region, but only the raw image reveals the characteristic of this blemish: an elliptical spot with sharp boundaries which pass through the inside of probes. Still, the grid is visible underneath, as in one of the examples given by Simon, Korn, et al. [13] for cDNA arrays. The dark probes in Figure 6a are therefore likely to be caused by a physical blemish that has 'stained' the image with a dark oval, a mechanical/optical artefact that invalidates the measured intensities of the probes in the region, so all affected probes in the region should be excluded from further analysis. The 'dark spot' in Figure 2 (upper centre) also had a well defined border, although with less contrast (not shown). Three similar artefacts were seen in yet another chip, as shown in the composite picture (Figure 3).
The bright spot in the upper-right corner of Figure 2 is clearly of a different nature. The zoomed area of the DAT file of the second chip (activated) of subject 2 shown in Figure 7b reveals a diffuse area of brightness that covers around 20 probes. Because this bright cloud is out of focus, it is difficult to assess whether its physical location was in front of or behind the focal plane; it could be a leftover detergent condensate in the plastic back panel of the chip. The artefact is less visible in the pseudo image than in the raw image, because the low granularity of the pseudo image enforces an artificial grid structure. Moreover, the Affymetrix image analysis algorithm, which takes the 75th percentile of the pixels as the estimate of the probe intensity, may make it more difficult to detect these artefacts through visual inspection, because the brightness in areas with low pixel-to-pixel variation is lowered for all percentiles above the median. Although they were easily seen in the filtered pseudo image, neither the 'bright spot' nor the 'dark spot' could have been identified by visual inspection of the original pseudo image. Even on the raw image, only an extremely thorough search for areas of low pixel-to-pixel contrast or boundaries with high contrast across probes could have detected these artefacts based on a single chip alone. Thus, blemishes involving only 9 to 25 probes would often be overlooked in a visual inspection of both the raw and the pseudo image. Given the high variance across pixels, any image processing algorithm aiming at detecting such blemishes at high sensitivity would also create many false positive results.
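The effect of the percentile-based probe estimate can be sketched as follows. The pixel values are hypothetical, and `probe_estimate` is an illustrative stand-in that uses the third quartile from Python's `statistics.quantiles`; the interpolation rule of the actual image analysis software may differ.

```python
from statistics import median, quantiles

def probe_estimate(pixels):
    """75th percentile (third quartile) of a probe's pixel intensities."""
    return quantiles(pixels, n=4)[2]

# Hypothetical pixel intensities for two probes with the same median (100).
grainy = [60, 80, 90, 100, 100, 110, 120, 140]   # usual pixel-to-pixel variation
smooth = [96, 98, 99, 100, 100, 101, 102, 104]   # blurred area, low variation

est_grainy = probe_estimate(grainy)
est_smooth = probe_estimate(smooth)

print("median:", median(grainy), median(smooth))
print("75th percentile:", est_grainy, est_smooth)
```

Although both probes have the same median, the probe with low pixel-to-pixel variation receives the lower estimate, which is why a diffuse bright area can look less conspicuous in the pseudo image than in the raw image.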
Dark clouds
For the 'dark clouds', the raw image at first did not show any recognizable feature. Upon closer inspection, however, we noted that the 'dark cloud' in subject 1 had higher pixel-to-pixel variance (Figure 8). The noise does not seem to have a physical origin, as the fluctuations appear to be single-pixel in extent, giving the raw image a 'grainy' appearance.
The areas outside the dark clouds do not appear to be any grainier, so it does not seem to be a change of exposure setting or other simple global change. The image analysis software reports a single, global pixel-to-pixel variation Qraw; it would be useful to have a local quality measure as well, in a fashion similar to the reported background estimate for probe intensities. All dark clouds we found impinge on the array borders. We have no conjecture as to the physical origin of this problem.
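A local measure of this kind could be sketched along the following lines. This is only an illustration under stated assumptions: the tile size, the use of a plain standard deviation, and the function name `local_variation` are hypothetical choices and do not reproduce how the image analysis software computes its global value.

```python
from statistics import pstdev

def local_variation(image, block):
    """Standard deviation of pixel intensities within each block x block tile."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(0, rows - block + 1, block):
        row_out = []
        for c in range(0, cols - block + 1, block):
            tile = [image[i][j]
                    for i in range(r, r + block)
                    for j in range(c, c + block)]
            row_out.append(pstdev(tile))
        result.append(row_out)
    return result

# Hypothetical 4 x 8 raw image: left half smooth, right half 'grainy'.
image = [
    [100, 101, 100, 101,  80, 120,  85, 118],
    [101, 100, 101, 100, 121,  79, 117,  84],
    [100, 101, 100, 101,  82, 119,  83, 121],
    [101, 100, 101, 100, 118,  81, 120,  80],
]
variation = local_variation(image, block=4)
print(variation)
```

A map of such tile-wise values would show a 'grainy' region as a patch of elevated variation, whereas a single global figure averages it away.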
Shadowy circles
The two artefacts crossing the left border of Figure 2 suggest yet another reason for blemishes on microarrays. Only one of our chips displayed this artefact, but it did so twice on the same border. Neither the raw image nor physical examination of the chip in a dissection microscope provided any hints to the possible cause (data not shown).
There are myriad possible explanations for what caused this striking artefact. A perfectly round structure with outliers concentrated near its perimeter, evocative of the 'coffee stain rings' phenomenon [14], suggests that a bubble (or a drop) may have formed, whether during the microfluidic stage, through condensation after the washing stage, or as a manufacturing defect.
To further elucidate the potential cause of this artefact, we plotted the observed vs. the expected intensity (median across the other five chips) for each probe in the area depicted above (Figure 9). We then marked the points below the 10th percentile of all deviations (3) in this area, which formed the 'shadowy circle'. These points were seen over a wide range of expected intensities (7 to 14 in log2 units), although their density is higher for lower intensities. Notably, their intensity was consistently lower than the expected intensity, as though something had only partially interfered with hybridization – or partially stripped the fluorophores prior to readout, or affected probe sensitivity.
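The marking step can be sketched as follows, assuming log2 intensities stored as plain lists. The data are hypothetical, and the helper names `deviations_from_expected` and `flag_low_tail` as well as the percentile interpolation rule are illustrative assumptions rather than the procedure actually used.

```python
from statistics import median, quantiles

def deviations_from_expected(target, others):
    """Observed minus expected log2 intensity for each probe.

    The expected intensity is the median across the reference chips.
    """
    expected = [median(values) for values in zip(*others)]
    return [obs - exp for obs, exp in zip(target, expected)]

def flag_low_tail(devs, percent=10):
    """Indices of probes whose deviation lies below the given lower percentile."""
    cutoff = quantiles(devs, n=100, method="inclusive")[percent - 1]
    return [i for i, d in enumerate(devs) if d < cutoff]

# Hypothetical area of 20 probes with expected intensities from 7 to about 14.
base = [7 + 0.35 * i for i in range(20)]
others = [[b + offset for b in base] for offset in (-0.1, -0.05, 0.0, 0.05, 0.1)]
target = list(base)
target[3] -= 1.5    # two probes dimmed by the simulated blemish
target[12] -= 1.5

devs = deviations_from_expected(target, others)
flagged = flag_low_tail(devs)
print(flagged)
```

In this toy example the flagged probes sit at different expected intensities but are both darker than expected, mirroring the one-sided pattern described above.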
Relevance
To determine the extent to which such artefacts may affect standard analyses, we compared the activated vs. the repressed samples (two each) for subject 2, and studied whether masking the blemishes affects the list of differentially expressed genes.
We searched for blemishes on all four chips; after manually circling each affected area, we masked (declared missing) all points in the upper or lower 10th percentile within that area, respectively. We used either the lower or the upper 10% because one of our findings is that all artefacts seem to share the characteristic shown, for instance, in Figure 9: outliers within an artefact are either (almost) exclusively brighter ('bright spot') or darker (all other blemishes) than expected. We conducted separate analyses for the original and the masked data. We estimated the signal value for each probe using the Bioconductor implementation (affy package 1.3.28, R 1.8) of the MAS5 algorithm with default parameters, after modifying the summarization and normalization steps to allow for missing data. The overall effect is shown in Figure 10a, with a maximum difference of 4.6 log2.
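A minimal sketch of this masking step is given below. It assumes that the manually circled area can be approximated by a circle in probe coordinates and that deviations (observed minus expected, log2) are held in a list of rows; the function `mask_region`, its parameters, and the data are hypothetical.

```python
from statistics import quantiles

def mask_region(devs, center, radius, tail="lower", percent=10):
    """Return a copy of a 2-D deviation grid with one tail masked inside a circle.

    tail is "lower" for dark blemishes and "upper" for bright ones.
    Masked cells are set to None (declared missing).
    """
    r0, c0 = center
    inside = [(r, c)
              for r, row in enumerate(devs)
              for c in range(len(row))
              if (r - r0) ** 2 + (c - c0) ** 2 <= radius ** 2]
    cuts = quantiles([devs[r][c] for r, c in inside], n=100, method="inclusive")
    low_cut, high_cut = cuts[percent - 1], cuts[100 - percent - 1]
    masked = [list(row) for row in devs]
    for r, c in inside:
        d = devs[r][c]
        if (tail == "lower" and d < low_cut) or (tail == "upper" and d > high_cut):
            masked[r][c] = None
    return masked

# Hypothetical 7 x 7 grid: two dark probes inside the circled area,
# one low value outside it that must remain untouched.
devs = [[0.0] * 7 for _ in range(7)]
devs[3][3], devs[3][4], devs[0][0] = -2.0, -1.8, -3.0

masked = mask_region(devs, center=(3, 3), radius=2, tail="lower")
n_masked = sum(cell is None for row in masked for cell in row)
print(n_masked)
```

Restricting the mask to one tail within the circled area keeps unaffected probes in the region available for the subsequent expression estimates.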
Genes whose expression estimates changed by more than 0.1 log2 through filtering were considered 'altered' by filtering. The 'bright spot', where about 39 probes were affected, altered the expression of 16 genes by up to 1.37 log2. The 'shadowy circle' altered the expression of about 380 genes, more than 50 of them by more than 0.5 log2. The 'dark spot' affected 47 probes, altering the expression of 103 genes by up to 1.6 log2. The 'cloud' altered the expression of 700 genes, 83 of them by more than 0.5 log2. The dirt, covering an area of around 25 × 25 probes, affected around 376 probes, altering 148 genes, 16 of them by more than 0.5 log2, up to a maximum of 1.26 log2.
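The 'altered' criterion amounts to a simple threshold on the change in log2 signal, as in the following sketch. The gene identifiers and values are hypothetical, and `altered_genes` is an illustrative helper.

```python
def altered_genes(original, masked, threshold=0.1):
    """Genes whose log2 expression estimate changed by more than `threshold`."""
    return {gene: masked[gene] - original[gene]
            for gene in original
            if gene in masked and abs(masked[gene] - original[gene]) > threshold}

# Hypothetical log2 signal values before and after masking.
original = {"g1": 8.20, "g2": 10.05, "g3": 6.40, "g4": 12.00}
masked = {"g1": 8.25, "g2": 10.80, "g3": 6.10, "g4": 12.00}

changed = altered_genes(original, masked)               # more than 0.1 log2
large = altered_genes(original, masked, threshold=0.5)  # more than 0.5 log2

print(sorted(changed), sorted(large))
```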
Finally, we compared the two conditions (absence vs. presence of a repressor), mirroring masked probes on both the affected and the corresponding chip. As an exploratory criterion, we used the modified (paired) t-test suggested in Smyth [15] from the limma package of the Bioconductor project [16]. As shown in Figure 10b, the effects on identifying genes as differentially expressed can be dramatic, demonstrating the potential value of detecting blemishes and masking affected areas on microarrays.
Validity
We validated the proposed method using data from the Spike-in HUG133 experiments [17]. This data set consists of 3 technical replicates of 14 separate hybridizations of 42 spiked transcripts at concentrations from 0.125 pM to 512 pM arrayed as a Latin Square. Our interest was to assess whether masking the blemishes improves the ability to detect differentially expressed genes. We used the affycomp package of the Bioconductor project, which encompasses a series of tools described in [18] to compare the performance of expression measures for Affymetrix GeneChips. Figure 11 shows that masking blemishes has little effect for large fold changes, as one would expect, while the ROC curve (sensitivity vs. 1 − specificity) shows a substantial improvement for small (2-fold) changes. Other statistics are also improved in this case: the average number of false positives decreases (from 2818 to 2763) while the average number of true positives increases (from 14.33 to 14.57). Comparing by range of intensities, the area under the curve (AUC) is bigger for the masked data at the lower intensities (0.003 for the original vs. 0.010 for the masked data), while performance remains similar in the medium and high ranges (data not shown), resulting in a bigger average weighted AUC for the filtered data (0.002 vs. 0.007, respectively); a detailed description of these statistics can be found in [18] and in the affycomp vignette. Thus, our masking procedure improves the sensitivity/specificity for detecting small differential expression, especially in the range of low intensities.
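For readers unfamiliar with the ROC comparison, the following sketch shows how sensitivity and 1 − specificity are obtained from a ranked list of genes with known spike-in status. The scores are hypothetical, ties are not treated specially, and the full-range trapezoidal AUC computed here is a generic illustration, not the restricted or weighted AUC statistic reported by affycomp.

```python
def roc_points(scores, is_spiked):
    """ROC points (1 - specificity, sensitivity) for decreasing score cutoffs."""
    ranked = sorted(zip(scores, is_spiked), key=lambda pair: -pair[0])
    positives = sum(is_spiked)
    negatives = len(is_spiked) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, spiked in ranked:
        if spiked:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical absolute log2 fold changes for six genes, three of them spiked.
truth = [True, False, True, False, True, False]
original_scores = [2.1, 0.4, 1.9, 0.3, 0.35, 0.2]  # one spiked gene dimmed by a blemish
masked_scores = [2.1, 0.4, 1.9, 0.3, 0.90, 0.2]    # blemished probes excluded

auc_original = auc(roc_points(original_scores, truth))
auc_masked = auc(roc_points(masked_scores, truth))
print(auc_original, auc_masked)
```

In this toy case the blemish pushes one spiked gene below a non-spiked gene in the ranking, which lowers the AUC; removing the affected probes restores the ordering.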