Characterization and simulation of cDNA microarray spots using a novel mathematical model

Background The quality of cDNA microarray data is crucial for expanding its application to other research areas, such as the study of gene regulatory networks. Despite the fact that a number of algorithms have been suggested to increase the accuracy of microarray gene expression data, it is necessary to obtain reliable microarray images by improving wet-lab experiments. As the first step of a cDNA microarray experiment, spotting cDNA probes is critical to determining the quality of spot images. Results We developed a governing equation of cDNA deposition during evaporation of a drop in the microarray spotting process. The governing equation included four parameters: the surface site density on the support, the extrapolated equilibrium constant for the binding of cDNA molecules with surface sites on glass slides, the macromolecular interaction factor, and the volume constant of a drop of cDNA solution. We simulated cDNA deposition from the single model equation by varying the value of the parameters. The morphology of the resulting cDNA deposit can be classified into three types: a doughnut shape, a peak shape, and a volcano shape. The spot morphology can be changed into a flat shape by varying the experimental conditions while considering the parameters of the governing equation of cDNA deposition. The four parameters were estimated by fitting the governing equation to the real microarray images. With the results of the simulation and the parameter estimation, the phenomenon of the formation of cDNA deposits in each type was investigated. Conclusion This study explains how various spot shapes can exist and suggests which parameters are to be adjusted for obtaining a good spot. This system is able to explore the cDNA microarray spotting process in a predictable, manageable and descriptive manner. We hope it can provide a way to predict the incidents that can occur during a real cDNA microarray experiment, and produce useful data for several research applications involving cDNA microarrays.


Background
With the advance in technology for simultaneous acquisition of information on a number of genes' expression, the research area of bioinformatics is expanding into the reconstruction of a system composed of such genes. Within a decade, the technology of cDNA microarray experiments has been extensively developed, and its usage has exponentially increased. To improve the accuracy of microarray data, a number of techniques in post-processing microarray data were suggested from image analysis to statistical data processing. However, in spite of those efforts, the increase in accuracy of post-processing has a limit. After all, the improvement in wet-lab experiments, such as ensuring the conditions for sound spotting of cDNA probes, is necessary to obtain reliable microarray data. Most cDNA microarray images contain spots in various shapes, including doughnut-shaped spots, which consist of pixels of high intensity at the perimeter and those of low intensity in the central area. Such patterns of spot images are primarily due to the non-uniform distribution of cDNA molecules while the cDNA solution dries out during the microarray printing process.
Spot morphology influences the measurement of gene expression levels. In particular, doughnut-shaped spots among them can often cause errors in measuring the signals. The inner hole can be either a blank or an area composed of pixels of lower signal than those in the perimeter ( Figure 1). When calculating the measure of a spot signal, there is a large difference between the inclusion and exclusion of the hole in the spot area. If the hole is a complete blank, the area of the hole should be excluded from the spot area when the spot signal is measured. Otherwise, it should not be excluded. To overcome this, several techniques have been developed for detecting the effective area of a spot [1,2]. They basically require some thresholds or parameters that are critical in determining the effective area. In a microarray image, each of the thousands of spots has a unique morphology and range of pixel intensity. Therefore, it is not proper to use a single method to calculate gene expression levels from the spots in various shapes and characteristics.
Since microarray became widely used for obtaining highthroughput gene expression data, several methods for simulating the microarray experiment have been suggested [3][4][5][6][7]. Simulation is useful for designing and testing both the experiment and the analysis in real fields of study. This makes it possible to predict how the experiment will proceed or how well the analysis will work, before it is realized. In addition, it also provides synthetic data for analysis, when the realization is impossible due to a technical problem. Previous methods focused on mimicking images generated by real cDNA microarray experiments, and their parameters for determining the features of spots were based on certain probabilistic models. Wierling et al. [4] classified spots into three categories: convex, crater-like, and cylindrical spots. Balagurunathan et al. [3] simulated spots from simple circles and modified them by resizing radii, punching holes in the centre of spots, reshaping spots by removing chords, and enhancing intensities at the edge of spots. However, if it were not based on the investigation of physical and chemical nature, it would fail to correctly simulate and might confuse the experimenters with the results.
In this respect, it is important to study the formation of the variously shaped spots. The phenomenon of deposition of solute drop on the free surface has been studied [8][9][10][11][12][13][14]. Deegan et al. explained the formation of a ringshaped stain when a coffee drop evaporates as a result of "pinning" of the contact line. Heim et al. [15] noticed the importance of study of the deposition of DNA unspecifically bound on the microarray slide. However, microarray experiments are performed on slides specially manufactured for the purpose of increasing the sensitivity and the specificity of the results. A capping step to prevent nonspecific adsorption on the support during hybridization and to decrease the background noise is often performed. To minimize the loss of cDNA by hybridization or washing out, the surface of a glass slide is usually coated with substrates [16]. Therefore, it is necessary to study the characteristics of cDNA deposition by close investigation focused on the microarray experiment.
Various spot shapes in real microarray images Figure 1 Various spot shapes in real microarray images. (A) Real microarray images contain spots in various shapes [38] [40]. (B) The 3D image of spots in the real microarray images were reconstructed by plotting image intensity on the z-axis (C) The radial profiles of the 3D images were obtained where the section passed through the centre.
In this study, we generate a mathematical model of cDNA deposition during the microarray spotting process using the contact printing technology by bringing parameters from cDNA microarray experiments. This study analyzes cDNA microarray spot formation and elucidates parameters which affect the spot morphology. The study of the origin of various patterns in spot morphology suggests how to manufacture good microarray spots.

cDNA deposit model generation
We consider a drop of cDNA solution on a glass slide coated with a chemical layer bearing ended function able to bind strongly with cDNA. While the cDNA solution is spotted on the slide, the strong adhesive forces of the chemical layer pull a drop of cDNA solution reserved in the channel of the spotting pin, and the cDNA molecules bind to the surface sites of the chemical layer so as to be immobilized. While the drop is dehydrated, the concentration of cDNA changes in two ways simultaneously: 1) it increases due to the evaporation of water at the surface of the drop; 2) it also decreases due to the deposition of cDNA molecules on the surface of the glass slide dried out from the rim of the drop. Because water evaporates at an equal rate across the surface of the drop, the rate of change of the volume V is proportional to the surface area S, such that where K E is the evaporation constant reflecting the conditions of the experiment, such as temperature, humidity, and atmospheric pressure, etc.
When a drop of cDNA solution is spotted on a glass slide, it has a hemispheric-like shape on the condition of the surface tension of the drop and the hydrophobicity of the original surface of the slide. If we consider the drop as a spherical cap, the volume and the surface area are proportional to the cube and the square of the contact radius r, such that V(r) = K V r 3 and S(r) = K S r 2 . K V and K S are the volume and the surface constants, respectively, such that where is the contact angle of drop on the slide. Because the drop undergoes morphological transition during evaporation, the contact angle is a function of time t, = (t), and the rate of change in volume is Several studies have investigated the morphological transition of a drop during its evaporation [17][18][19][20]. At the beginning of evaporation, the contact line is pinned, i.e., while the contact radius remains constant, the drop diminishes in volume as the contact angle and the drop height decrease (State I). When the contact angle and the drop height decrease to a certain level, the contact line is depinned, i.e., the contact radius starts to retract with the decrease of drop height and the contact angle becomes constant (State II).
At state I, cDNA molecules bind with the surface sites in the whole contact area of the drop in the chemical equilibrium state, as follows: where x 0 is the surface site density on the support and a is the interaction factor for the non-ideal adsorption behaviour of macromolecules [21]. If a = 0, the equation is the same as the Langmuir adsorption equation, and otherwise, it describes the non-ideal adsorption behaviours, such as repulsion (a > 0) and attraction (a < 0) between molecules [22,23]. Therefore, the equilibrium constant for the binding reaction is Ke -ax , where K is the extrapolated equilibrium constant. The concentration of cDNA, , rises during evaporation without reduction of the radius of the drop, and, therefore, the quantity of cDNA molecules, x, which bind with the surface sites in a unit area increases nonlinearly.
At state II, as the contact line moves back, the cDNA molecules start to deposit at the contact line. The recession of the contact line prevents cDNA molecules from dissolving back into the drop. Because the contact angle becomes constant, K V and K S do not vary with time, and, therefore, the second term on the right-hand side of the equation (3) can be omitted. Then the rate of change in the contact radius can be also constant, as in Let (t) represent the concentration of cDNA at time t when the contact radius becomes r(t). If we assume that the density of the surface sites is constant in the contact area of a given drop, then where (0)is the cDNA concentration at the moment beginning deposition, and V 0 is the initial volume of the drop. Then, equation (6) can be simplified under the assumption that the contact angle is constant after the beginning of cDNA deposition, such as where r 0 = r(0). From the temporal derivative of the concentration of the drop, we can derive its radial derivative, as in The radial derivative of the deposit can be obtained from equation (4) and (7), as follows: The solution of equation (8) is the density of cDNA deposit at the distance r from the centre, which can be solved by an ordinary differential equation solver with the four parameters, x 0 , K, a, and K V .

Spot simulation
From a real microarray image [24], we identified the three types of spots ( Figure 2). Using the radial cDNA deposit profile obtained from equation (8) with varying parameters, x 0 , K, a, and K V , we obtained several types of radial cDNA deposit profiles ( Figure 3).

Doughnut-shaped spot
The doughnut-shaped spot was defined as the spot having a thin rim of high density of deposited cDNA and a large hole of very low density, approximated to zero (Figure 4a, left and middle). The diameters of the inner hole of the doughnut-shaped spots were measured by varying the parameters. Histograms of the density of deposited cDNA were bimodal, which had a peak at zero (Figure 4a, right). If the inner hole is included in measuring the signal, then there will be a discrepancy between the measures and the true value. To verify which parameter can influence the morphology of the doughnut-shaped spot, we changed dx dr Three types of spot morphology in a real microarray image each parameter by 20 steps while the others were fixed. The inner hole was decreased by increasing a, and K V and decreasing x 0 and K ( Figure 5).

Peak-shaped spot
The peak-shaped spot was defined as a spot having the highest density at the central region and a declining density at the region farther from the centre (Figure 4b, left and middle). Because the peak-shaped spot can have an obscure boundary with the background, the edge of the spot can often be determined by the threshold. The histogram of the density of deposited cDNA was skewed to the right (Figure 4b, right). Therefore, the mean density of deposited cDNA is lower than the median value. Furthermore, the measure can vary with how large an area of spot is included.

Volcano-shaped spot
We defined the peak-shaped spot having a small hole in its peak as the volcano-shaped spot because its 3D image resembles a volcano (Figure 4c, left and middle). The volcanic-shaped spot has characteristics of both the doughnut-shaped spot and the peak-shaped spot (Figure 4c, right).

Simulated cDNA deposit
Using the radial profile of cDNA deposits and the spot template images, we simulated the deposition of cDNA in a virtual spot. In each spot template image, we set coordinates of pixels on the edge contour and centre of the spot area ( Figure 6a). Using the distance from the centre to each pixel on the edge contour, we calculated cDNA concentration at the moment when the contact radius becomes r, and the quantity of cDNA molecules that bind with the surface sites in a unit area (Figure 6b).

Parameter estimation from real microarray images
The spot morphology varies with the size of the end of the spotting pin tip, the volume of sample delivered, the surface tension of the drop, the hydrophobicity of the surface of the slide, and so on. It has been proven that the increase in viscosity and contact angle of a micro-drop shrinks the initial size of the drop [25]. Using the data and equation (2), we calculated the volume constants and the contact angles of the common micro-drops at the moment of being spotted on the glass slides, with the data of the spot diameters and the volumes of delivered sample provided by TeleChem International, Inc. [26]. We found that the Three types of morphology of simulated cDNA deposit We estimated the four parameters from the real microarray spot images. Our model is one of the constrained nonlinear optimization (CNO) problems, so that the optimum values of the parameters of the ordinary differential equation were obtained by sequential quadratic programming (SQP) [27], with the least squares method, as follows: where = (x 0 , K, a, K V ), y i is the normalized pixel intensity of a real microarray spot at distance r i from the centre of the spot. SQP was performed on the spots of which outer rims are circular using their radial profiles (as shown in Figure 1c). Because the number of pixels in a spot is not enough to fit the model, we inserted 16 dummy data points between every pixel point on the radial profile by using cubic Hermite spline interpolation [28]. From each image, five spots representing each type of spot shape were selected. The results of the study of the microarray scanning process indicated that the range of the concentration of mRNA of thousands of genes in a sample is much wider than the dynamic range of microarray scanners [29]. If the microarray scanner reads the signal out of the linear dynamic range, the radial profile of the spot deposit would be distorted. To focus on the physical phenomenon of the deposition of cDNA molecules, the fitting was performed on a limited number of spot images whose pixel intensities were proper to be within the linear dynamic range. The estimated values of x 0 and K of the doughnut-shaped spots were about ten times as high as those of the peak-shaped spots, while a and K V of the former were lower than those of the latter (Table 1 and 2). The macromolecular interaction factor, a, is usually a positive value because cDNA molecules are usually negatively charged, and so there is a repulsive force between cDNA molecules. However, their electrical characteristics can be changed, depending on the pH of the buffer solution. The estimated parameters of the volcano-shaped spots were highly variable. Because the volcano-shaped spots contained the characteristics of both the doughnut-shaped spot and the peak-shaped spot, they have a wide variety in the morphology. Table 1 and Table 2 show that the value of x 0 is also highly dependent on the slide.

Discussion
Unlike the ring-shaped stain formed during the evaporation of a coffee drop, cDNA microarray images produce spots of various shapes, even in the same image. A solution to the coffee drop problem can partly explain the doughnut-shaped spot, which is one extreme of spot morphology in cDNA microarray images. Doughnut-shaped spots have frequently appeared in microarray images. In spite of the effort to prevent such spots from being generated, there is still an uneven density of signals in a single spot. In this study, we devised a generalized mathematical model that can manifest a wide spectrum of spot morphology.
The doughnut-shaped spot would be produced by experiment under the condition that a large quantity of cDNA is deposited early on, and so the concentration of the drop solution is quickly lowered. Such a situation can occur when the adsorption reaction is facilitated by a high sur-face site density, a large equilibrium constant of the binding reaction, and a low repulsion or attraction between cDNA molecules, or when the initial drop has a flat spherical cap shape. Unlike the doughnut-shaped spot, the peak-shaped spot would be produced under the condition that cDNA is not likely deposited early on, and the concentration of the drop solution continually increases due to the accumulated cDNA, and so the quantity of the deposited cDNA can be the maximum at the centre. Such a situation can occur when the adsorption reaction is impeded by a low surface site density, a small equilibrium constant of the binding reaction, and a high repulsion between cDNA molecules, or when the initial drop has a sphere-like shape.
It is known that the composition of buffer solution changes the spot morphology [30][31][32][33]. The optimal composition of buffer solution has been investigated by including detergent or betaine to 3 × SSC or 50% dimethyl-sulfoxide (DMSO). It is known that DMSO decreases the surface tension of drop solution during evaporation, while both 3 × SSC and 3 × SSC with betaine increase the surface tension [32]. The decrease of surface tension is *** Mean squared deviation, . † The spot shape was close to the doughnut-shape. † † The spot shape was close to the peak-shape. Moreover, DMSO denatures DNA to be bound well with surface sites [34], which means the increases of the equilibrium constant for the binding reaction, Ke -ax . Putting them together, DMSO would decrease K V and a, and increase K. Figure 3 and 5 show that the doughnut shape is aggravated by increasing K and x 0 and decreasing K V and a. As a result, DMSO would aggravate the doughnut shape. In addition, there is a study that the length and the sequence arrangement of DNA molecules would influence on the morphological organization of the deposit [35]. The macromolecular interaction factor, a, would reflect the length and sequence arrangement of DNA molecules.
The extrapolated equilibrium constant for the binding reaction K also reflects the condition of the slide surface. While homemade poly-L-lysine-coated glass slides have been widely used, several commercial microarray slides are now preferred, such as the FMB cDNA slide (Full Moon Biosystems Inc.), ArrayIt SuperAmine and SuperAldehyde slides (TeleChem International, Inc.). It is known that CMT-GAPS™ slides (Corning) makes the spot mor-phology much more uniform by preventing doughnut shape formation [34]. It can be assumed that the slides are manufactured to have the optimal binding efficiency. The surface site density, x 0 , affects the efficiency of cDNA binding. The increase in x 0 facilitates the forward reaction of equation (4)   *** Mean squared deviation, . † The spot shape was close to the doughnut-shape. † † The spot shape was close to the peak-shape.  (11), the ratio of x 5 to x 3 can be linear to the ratio of mRNA concentration only when the quantity of hybridized mRNA can be approximated to be linear to the concentration of mRNA in the solution. Where the density of cDNA is much lower, such as in the centre of the doughnut-shaped spot, the quantity of the hybridized mRNA is saturated early in this region. Because the concentration of mRNA is uniform in the solution, the quantity of hybridized mRNA cannot be linear to the concentration of mRNA. Then the ratio of the signal fails to reflect the original ratio of mRNA expression. When we assume that x, the local density of cDNA, is much larger than (x 3 + x 5 ), equation (11) can be simplified, as follows: We have estimated parameters from real microarray images, but not all spot images were successful. It was practically impossible to fit our model equation to the doughnut-shaped spots, whose centre part has significantly high intensity. Table 1 and Table 2 show the mean squared deviation (MSD) between real and estimated data, as the goodness-of-fit measure. MSDs obtained from the doughnut-shaped spots were relatively larger than those of the other spots. We expect this problem originates from two major reasons: the effect of diffusion when the microarray slide passes through the hybridization process, and nonlinear transformation of spot signals to image intensity when the region out of dynamic range of the microarray scanner was used in converting spot signals. The effect of hybridization on the change in cDNA density is still controversial. Tran et al. [36] claimed that DNA spotted on a glass slide diffuses during hybridization. However, this conflicts with the opinion of Pappaert et al. [37], who claimed that the doughnut shape occurs not only from the uneven distribution of DNA deposition but also from the hybridization process. We observed a few microarray images that have concentric circles in the spots. Even though we did not include such phenomenon in this study, it is considered a result of the repetition of pinning and depinning of the contact line [10], and should be investigated further.

Conclusion
We developed a governing equation that can explain the dynamics of cDNA deposition during evaporation of a drop in the microarray spotting process. Experimental conditions were parameterized and included in the governing equation. The parameters determining the spot morphology were brought from the physical and chemical factors. This explains how various spot shapes can exist and suggests which parameters are to be adjusted for obtaining a good spot. This system is able to explore the cDNA microarray spotting process in a predictable, man-ageable and descriptive manner. We hope it provides a way to predict the incidents that can occur during a real cDNA microarray experiment, and produce useful data for several research applications involving cDNA microarrays.

Mathematical model
Equation (2) was obtained as follows: consider a spherical cap which has the radius of sphere R, and cap height h. Then the volume V and the surface area S are When we consider the contact radius r and the contact angle , then R = r csc h = R(1 -cos ) = r csc (1 -cos ).
Then the volume and the surface area are the functions of the contact radius and the contact angle, such as Equation (5) was obtained by combining equation (1) and (3), such that Because the contact angle is constant at state II, K V does not vary with time, and therefore, the first term on the lefthand side of the equation above can be omitted, as follows:

Tools for analysis and visualization
The ordinary differential equation, equation (8) was solved by applying the backward differentiation formula (BDF) method. cDNA deposits were simulated with a radius of 100 m, which were converted to images at a resolution of 10 m/pixel using LabVIEW and NI-Vision 8.2 (National Instrument, Inc.). The templates of spots were obtained by using the edge detection technique [1]. BDF,