Variational attenuation correction in two-view confocal microscopy

Background Absorption and refraction induced signal attenuation can seriously hinder the extraction of quantitative information from confocal microscopic data. This signal attenuation can be estimated and corrected by algorithms that use physical image formation models. Especially in thick heterogeneous samples, current single view based models are unable to solve the underdetermined problem of estimating the attenuation-free intensities. Results We present a variational approach to estimate both, the real intensities and the spatially variant attenuation from two views of the same sample from opposite sides. Assuming noise-free measurements throughout the whole volume and pure absorption, this would in theory allow a perfect reconstruction without further assumptions. To cope with real world data, our approach respects photon noise, estimates apparent bleaching between the two recordings, and constrains the attenuation field to be smooth and sparse to avoid spurious attenuation estimates in regions lacking valid measurements. Conclusions We quantify the reconstruction quality on simulated data and compare it to the state-of-the art two-view approach and commonly used one-factor-per-slice approaches like the exponential decay model. Additionally we show its real-world applicability on model organisms from zoology (zebrafish) and botany (Arabidopsis). The results from these experiments show that the proposed approach improves the quantification of confocal microscopic data of thick specimen.


Background
Confocal microscopy has become a standard technique to record and localize fluorescent marker molecules within the 3-D context of organs and whole organisms on subcellular resolution. The confocal principle minimizes the blur introduced by the point spread function of the optics. However, signal degradations introduced by scattering and absorption within the inhomogeneous tissue still hamper many automatic image analysis steps like detection, registration, segmentation, or co-localization.
Light attenuation is a result of photon loss along the excitation and emission light paths. Photons get lost due to absorption, where the photons are converted to thermal energy, or due to scattering, where the photons leave *Correspondence: tschmidt@cs.uni-freiburg.de 1 Department of Computer Science, Albert-Ludwigs-Universität, Georges-Köhler-Allee Geb. 52, 79110 Freiburg, Germany Full list of author information is available at the end of the article the ray passing through the pinhole. Both effects result in a multiplicative reduction of the number of photons by a local tissue specific factor, and can therefore be modeled by the Beer-Lambert's law. The opposite effect, an intensity increase, is caused by scattered photons that hit the pinhole by chance. In most tissues this second effect is small compared to the photon loss and its exact simulation would require an immense computational effort. Therefore we model only photon loss using attenuation coefficients accounting for both local absorption and scattering throughout the article.
Attenuation correction requires to estimate two quantities at each recording position, the local attenuation coefficient and the true underlying intensity. Solving for both quantities without further assumptions would require two noise-free measurements per recording position. However in most real-world applications only sparse measurements at the fluorescently marked structures are available http://www.biomedcentral.com/1471-2105/14/366 (especially when imaging whole organs or organisms). Additionally the measured signal is distorted by Poisson distributed photon noise and Gaussian distributed read-out noise.
Single view approaches try to estimate both quantities from one recording that provides only one measurement per recording position. This requires strong prior assumptions to constrain the solution space. A common approach is to assume that the attenuation is dominated by aberrations introduced by a mismatch in immersion and embedding media [1,2]. In the resulting models, local attenuation effects are neglected or constant absorption throughout the cuboid-shaped recording volume is assumed resulting in an exponential decay with imaging depth [3]. Other approaches estimate the attenuation from the perslice intensity statistics. The overall intensity distribution is adapted towards a reference maximizing the overall coherence [4,5].
One way of theoretically getting sufficiently many measurements to solve the problem is to record the sample from different angles (e.g. two views from opposite sides, see Figure 1). In [6] this has been done to increase the signal to noise ratio (SNR) of the reconstructed volume. The authors discuss, that previous approaches are only applicable given homogeneously distributed markers throughout the sample which is hardly the case. They propose instead to directly relate the absorption to the fluorophore distribution that can be observed. In [7], we go even one step further and assume no relationship between attenuation and marker, since only in rare cases all absorbing material is also fluorescently marked. The confocal image formation [7,8] allows to recover attenuation in not fluorescently marked areas as long as they cast "shadows" through the sample along the excitation and emission cones of the different views. Only in regions where the light hits no fluorophores at all or in the case of full absorption an estimation is impossible.
The multi-view approach can be applied to a wide range of data from biology and medicine. To underline this claim, we reconstruct the recordings of 500 μm thick zebrafish embryos, and Arabidopsis root tips. Two-view recording of tissue sections was already demonstrated in [6].

Contributions
This work is a significant extension to the attenuation correction presented in [7] extending the ideas and the evaluation presented in [9]. First, we use an elastic registration to align the two views which allows embeddings in viscous media without mechanical fixation. Second, we reformulate the image formation model to cope with the photon noise apparent in confocal microscopy and photo bleaching which cannot be avoided when recording the same sample multiple times. Third, we examine the effects of different priors (Tikhonov-Miller, total variation, and sparsity) on the attenuation field, and finally, we ensure the constraints on the variables (attenuation coefficients must be positive) directly in the optimizer. The effects of the different extensions are illustrated in Figure 2.

Image formation model
We use the image formation model and algorithms from [7] to simulate a confocal microscope with ideal point spread function. Optimal reconstruction quality using this model requires that the refractive indices of immersion , (e) with sparsity (μ = 1000), (f) with bleaching correction (est. β = 0.64); right: (g) with total variation regularization (λ = 10 4 ), (h) with sparsity (μ = 1000), (i) with bleaching correction (est. β = 0.63). Parts in the gray block are already implemented in [7]. Scale bars: 200 μm. and embedding medium match the specification of the microscope lens. They should be adapted to the average refractive index of the imaged specimen. The signal F i (x) for direction i (i = 1: top, i = 2: bottom) captured by the photo multiplier is modeled as the integral over a cone shaped bundle of rays originating at the recording http://www.biomedcentral.com/1471-2105/14/366 position x ∈ R 3 . Each ray is attenuated by the attenuation coefficients along its path so that where α : R 3 ⊃ → R ≥0 are the spatially variant attenuation coefficients. s i : S → {0, 1} are the cone sensitivity functions for both directions defined over the unit sphere S. s i (r) is one for all ray directions within the cone and zero otherwise. I : → R ≥0 denotes the attenuation free intensities. The factors β i ∈ R + can be used to additionally scale all intensities of the recordings. We use it to model photo bleaching induced signal attenuation in the second recording. Only the focused beam leads to significant bleaching since the excitation energy drops quadratically with the distance to the focal point.
The assumption of constant bleaching for the whole volume is a zero-order approximation for the true bleaching function which is non-linear and specific for the markermolecules used. We fix β 1 := 1 and optimize β 2 alongside with the real intensities I and the attenuation coefficients α as described in the upcoming section.

Energy formulation
We want to maximize the posterior probability for the attenuation coefficients α, the attenuation-free intensities I, and the factor β 2 given the observed intensities of the two recorded data volumes I 1 and I 2 .
We have no prior knowledge about the expected intensities but want the attenuation coefficients to vary smoothly within local neighborhoods. Additionally we prefer solutions with zero attenuation estimates in areas of insufficient data. This can be modeled in the prior probability as where ⊂ R 3 is the recorded volume, λ, μ ∈ R ≥0 are weighting factors and sp ∈ R + is a small constant which is added for reasons of numerical stability. The loss function ψ : R → R is either the identity function, leading to quadratic regularization (Tikhonov Miller (TM)) or approximates the total variation regularization (TV) when set to ψ TV x 2 := x 2 + 2 TV with TV ∈ R + being a small constant. This approximation is closely related to the Huber norm. During the experiments we set sp = TV = 10 −10 . We assume, that for all discrete recording positions x the observed attenuation coefficients and intensities are independent (except for their explicitly modeled common dependency on α, I, and β 2 ) resulting in where is the discretized recorded volume. The noise in the image intensities is dominated by the Poisson distributed shot noise due to the quantum nature of light but the measured intensities also contain additive Gaussian distributed read-out noise. The resulting noise model is the convolution of a Poisson process (scaled by a constant factor m ∈ R + and offset b ∈ R) with a zero mean Gaussian distribution with standard deviation σ ∈ R + . We eliminate the offset b by subtracting the intensity at the histogram peak from each recording prior to further processing. For reasons of computational efficiency, we approximate the Poisson process by a Gaussian process with variable variance (the variance is proportional to the mean of the distribution). We further assume, that the measured intensities are a good estimator of this mean value. At each position x ∈ this leads to a convolution of two Gaussian distributions which results in the combined Gaussian distribution Note, that for m = 0 and σ = 1 the new model coincides with the pure Gaussian model presented in [7]. The actual value for the Poisson scaling m (the number of collected photons per intensity level) and the standard deviation σ of the Gaussian noise can be estimated during a microscope calibration phase. If they are unknown, one of them can be fixed to an arbitrary value (we always fixed σ = 1), and the other one can be adjusted to qualitatively obtain the optimum result. If additional sample information is available, e.g. the recordings consist of large homogeneous regions of different intensities, one can also try to estimate the parameters from the images themselves as done in [10]. However, for biological samples this is rarely the case. http://www.biomedcentral.com/1471-2105/14/366 The final energy formulation is obtained by replacing the maximization of the probability by a minimization of its negative logarithm To simplify the notation, we introduce the shorthands where T r is the attenuation along the ray with direction r, C i is the cone transmission for recording direction i, F i are the simulated intensities, and D i are the differences between the recordings and simulations. Variables in square brackets indicate dependencies on the corresponding optimization variables.
For the optimization we employ the Broyden-Fletcher-Goldfarb-Shanno algorithm with box constraints on the variables (short L-BFGS-B) [11]. The solver minimizes the energy while respecting the positivity of the attenuations throughout the iterative optimization. L-BFGS-B implements a quasi Newton method, therefore, we need to provide the derivatives of the energy with respect to the unknown intensities I, the attenuation coefficients α, and the bleaching factor β 2 . These are given by where the derivative of the loss function ψ TM x 2 = 1 for the TM regularization and ψ TV x 2 = 1 2 x 2 + 2 TV for the TV approximation. The detailed derivations are given in the Additional file 1.
The bleaching factor β 2 prevents a direct analytic optimization of the intensities at each quasi Newton iteration as presented in [7]. Instead we optimize the bleaching factor β 2 and intensities I within each quasi Newton iteration in an inner fixed-point iteration loop that alternates between analytic computation of

Implementation
The variational attenuation correction was implemented in C++ and run under Linux (Ubuntu 12.04) on an Intel Xeon E5-2680 (2.7GHz) Dual-Processor system. For the optimization we used the ready FORTRAN implementation of the L-BFGS-B optimizer. One iteration for data sub-sampled to 80 × 80 × 80 voxels needed on average 1.8 seconds, so a full reconstruction can be computed in the range of a few minutes. The complexity scales linearly with the number of voxels to process within each iteration (Additional file 1: Figure S3). The memory complexity also scales linearly with the raw data volume. Both quantities can be limited by sub-sampling the high resolution raw data. This has two advantages: First, less computational resources are needed and second, the weighted averaging during the sub-sampling already considerably reduces the image noise. The cone transmission is computed in parallel for all ray directions leading to a significant speed-up of the confocal microscope simulation. Depending on the

Discrete derivative and integral computation
The gradients needed in the derivatives of the TM smoothness term and in the sparsity term are computed using central differences. For TV regularization we extended the numerical differentiation-scheme from [12] to 3-D to obtain the divergence term in the derivative of the smoothness term. The corresponding equations are given in the Additional file 1. The cone integrals are approximated as in [7] with a ray spacing of six degrees. This approximation of the cone integrals requires high regularization to lead to good reconstructions. We also did experiments with an alternative ray integration scheme that uses thin rays instead of the incrementally widening conic rays of [7]. To still capture all attenuations the ray sampling was increased so that the cone is sampled densely at the largest cone diameter with respect to the volume grid. The result is given in the Additional file 1 and shows, that the energy formulation leads to the desired solution but the numerical approximations have crucial influence on the resulting reconstruction.

Data generation Synthetic data
To quantitatively evaluate our method, we generated two different synthetic datasets. The first consists of a solid sphere with constant absorption coefficients of 0.006 per voxel. The interior 60% of the sphere were set to an intensity value of 4094. We added a smooth random texture with a variance of approximately 30% of the maximum of the corresponding quantity. This corresponds to the "well-posed" case when intensity and absorption information is available in the whole domain. The second dataset is the well-known Shepp-Logan phantom [13] consisting of a set of overlapping ellipsoids with homogeneous intensities. We assigned absorption coefficients to the different regions avoiding direct correlation with the intensities. Some regions were assigned equal attenuations independent of their intensity difference. During the simulation we applied an anisotropic Gaussian smoothing to reflect the microscope's point spread function and ensure Nyquist sampling. For both datasets two recordings from opposite sides were simulated using (1) modeling absorption and photo bleaching (β 2 = 0.8, meaning 20% signal loss between the recordings). Then Poisson noise with scaling m = 0.05 and Gaussian noise with standard deviation σ = 10 were applied. The datasets are shown in Figure 3.

Zebrafish
To show that the approach also copes well with real world data, we tested it on samples of the ViBE-Z database consisting of confocal recordings of whole zebrafish (Danio rerio) embryos, which were fixed 72h after fertilization. Sample preparation, recording setup and image preprocessing are described in detail in [7]. The processing was performed on sub-sampled data with isotropic voxel extents of 8 μm.

Arabidopsis thaliana
Finally we tested the approach on recordings of the root tip of the model plant Arabidopsis thaliana. The samples were fixed 96h after germination and the cell membranes marked with an Alexa antibody stain. Then they were embedded in SlowFade Gold Antifade (Invitrogen) and recorded from top and bottom using a confocal microscope equipped with a 40× oil immersion objective. We applied the elastic registration algorithm from [7] to register the two views to each other. Finally we performed a background subtraction prior to applying the attenuation correction. The embedding medium had a refractive index of 1.42 compared to a refractive index of the immersion oil of 1.52 for which the lens was adjusted. The attenuation correction was performed on sub-sampled data with isotropic voxel extents of 2 μm.

Parameter setup
We want all terms in the energy to have approximately the same influence on the optimization process. This leads to rough rules of thumb for the selection of λ and μ. Since all terms integrate over the whole image domain, the choice is independent of the number of voxels. The energy contribution of the data term is in the order of the squared expected intensity differences between recording and simulation divided by the Poisson weights. The smoothness term's contribution is in the order of the magnitude of the expected attenuation gradient (TV) or its square (TM). Finally, the sparsity term's contribution is in the order of the expected attenuations. E.g. for intensity data with an expected residual intensity difference of 5 (corresponding to the average noise intensity) and pure Gaussian noise with expected attenuation coefficients of 0.005 and gradient magnitudes of 0.0005 initial choices of λ = 5 2 0.0005 2 = 4 · 10 8 and μ = 5 2 0.005 = 5000 (TM), resp. λ = 5 2 0.0005 = 5 · 10 4 and μ = 5000 (TV) are appropriate. The approximate estimates for the expected attenuations and their gradients were empirically confirmed on real world samples. For higher Poisson weighting m the factors have to be decreased accordingly. The optimal values depend on the image content and should be optimized for specific types of data.
For the synthetic phantoms we chose for each fixed m the optimal λ and μ which minimize the root mean http://www.biomedcentral.com/1471-2105/14/366 squared error (rmse) of the true intensities and the estimated intensities. The parameters were empirically determined with an exponential grid search over a parameter range of λ ∈ 0, 10 0 , . . . , 10 9 and μ ∈ 0, 10 2 , 10 3 , 10 4 for the textured sphere phantom, and λ ∈ 0, 10 6 , . . . , 10 12 and μ ∈ 0, 10 3 , . . . For the real world data we stopped the iterative process when the visually optimal reconstruction of the intensities was reached, which was after between 3 to 20 iterations. For the textured sphere phantom data we set a maximum of 50 iterations for TM regularization and ran the algorithm to convergence for TV regularization. All results reported for the Shepp-Logan phantom were reached at convergence of the algorithm.

Results and discussion
In Figure 2 the influence of the different extensions to the original model in [7] are summarized. If no prior knowledge about the attenuations is introduced (Figure 2 (c)) the approach is already able to reasonably reconstruct the original intensities. However, the attenuation field is coarse and cannot be applied to the reconstruction of secondary channels. With regularization (Figure 2 (d) and (e)) the attenuation field is much smoother, but especially with Tikhonov Miller regularization strong spurious attenuations outside the sample are estimated. Application of the sparsity term reduces these attenuation estimates (Figure 2 (f ) and (g)). The residual apparent "bleeding"  of the attenuation coefficients below the sample are the effect of different mean intensities in the top and bottom recordings, as e.g. introduced by photo bleaching. When this factor is additionally estimated during the optimization, the lower boundary becomes much clearer. Detailed evaluation of the proposed model on the data sets described in the methods section and a comparison to [7] are given in the remainder of this section.

Synthetic data
As a first baseline measure we computed the best possible outcome of the traditional one-factor-per-slice methods using the ground truth intensities of the synthetically generated phantoms. I.e. no method that assumes the correction factors to be a function of the z-position in the recorded volume can perform better than this. The optimal correction factor for each slice was computed by minimizing the rmse of the estimated intensities compared to the true intensities. The reconstruction error for slice z ⊂ is given by where the c i ∈ R are the correction factors for the topand bottom-view andÎ are the true intensities. The corresponding optimal correction factors c 1 and c 2 can be computed analytically to The reconstructions are shown in the column "Slicewise" of Figure 4. In both cases the one-factor-per-slice model was not able to reconstruct the interior intensities even though the true intensities were given. As second baseline we applied the attenuation correction from [7] to all datasets and empirically determined the best regularization parameter λ for each of them. We also empirically optimized the parameters λ and μ for our proposed scheme using the exponential grid introduced in the parameter setup section. The best results regarding the rmse of the intensities are given in Tables 1 and 2 and are depicted in Figure 4.
For the synthetic data, the new model clearly outperforms the baseline from [7] even for sub-optimal choices of the Poisson weight m. The increase in performance is clearer for the textured phantom in which our approximation to the real noise is less affected by suboptimal mean value estimates in low-intensity regions, but even for the Shepp-Logan phantom the reconstruction quality is increased almost by a factor of three. The noise model and the bleaching factor β 2 both affect the reconstruction significantly. The sparsity term also plays an important role for the reconstruction in two ways: Firstly, it avoids high attenuation estimates in regions with insufficient information; Secondly, it suppresses errors introduced by the discrete numerical approximations.
We evaluated the reconstruction quality with respect to the three parameters λ, μ, and m (TV: Figure 5 (a) and (b), TM: Figure 6 (a) and (b)). As quality measure we used the rmse of the estimated intensities. We found that the results are stable over a wide range of parameters. The parameter having the highest impact on the result is the smoothness weight λ, followed by m and finally μ. We also evaluated the evolution of the rmse during the optimization process for different choices of the parameters (TM: Figure 5   first decreases rapidly reaching a very good reconstruction after 30 to 60 iterations (From practical observations we found that for real world data the optimum is reached earlier). Beyond that point the boundary effects and inaccuracies in the applied numerical approximations lead to an increase in the rmse for the TM regularization. For high TV regularization the attenuations are well localized within the sample volume and therefore no significant attenuations are estimated at the boundaries. This results in monotonically decreasing rmses with small local fluctuations.

Zebrafish
Additionally to the result shown in Figure 1, we applied our method to other zebrafish samples with varying staining quality. The reconstructions with Tikhonov Miller regularization are shown in Figure 7 and for total variation regularization in Figure 8. The estimated attenuation coefficients clearly resemble the shapes of the embryos. The bright spots in the eyes stem from the strong refraction of the eyes' lenses showing the modeling limitations of the presented approach. However, due to the imposed priors the artifact is localized in a small region and affects the surrounding reconstruction only marginally. Figure 9 shows a comparison of the proposed approach to the one-factor-per-slice model and the baseline approach [7] for one fish. For [7] we set λ = 10 7 , for the proposed approach we used λ = 10 7 , μ = 10 3 and m = 0. The one-factor-per-slice model is not able to recover the intensity spectrum since it cannot change the ratio between the boundary and interior intensities. http://www.biomedcentral.com/1471-2105/14/366 In the brain region of the fish the baseline and the proposed approach estimate the same intensities, whereas the proposed approach emphasizes the tissue layers in the zebrafish eyes stronger. The proposed approach shows slightly smaller intensity overshoots at the eyes' surfaces and around the nose but overall both reconstructions are convincing. The apparent "bleeding" of the attenuation coefficients ventral to the fish is reduced. The staircasing artifacts along the back of the fish in the baseline approach which were introduced with the orthogonal subspace projections are effectively removed.

Arabidopsis thaliana
The mismatch in refractive indices of immersion and embedding media in the Arabidopsis sample preparation leads to an aberration induced signal loss, that is not modeled in the presented approach. However, Figure 10 shows that our method still accurately reconstructs the intensities of the root tip datasets. Figure 11 shows a comparison of the proposed approach to the one-factor-per-slice model and the baseline approach [7] for one root tip. Again the one-factor-per slice model cannot reconstruct the interior intensities. The reconstruction of [7] and the proposed approach both significantly enhance the root-internal contrast. However, an intensity gradient towards the root center remains visible. We assume, that it is induced by imperfect marker distributions due to incomplete tissue penetration and the properties of the inner cell membranes. The estimated attenuation field closely resembles the root's shape http://www.biomedcentral.com/1471-2105/14/366 and is homogeneous compared to the baseline approach. The baseline approach shows strong variation within the root and additionally estimates strong attenuation outside the root volume to cope with the bleaching effects. Such erroneous attenuation estimates may lead to reasonable reconstructions of the intensities of the channel they were estimated on, but they will fail in reconstructing secondary channels containing the markers to quantify like protein patterns.

Limitations of the approach
The exponential decay model along a ray is only strictly valid for pure absorption. In most cases local random refractions can be also described by this model. However, in areas with clearly structured refraction, as e.g. in the eyes of the zebrafish, where the light is actively bundled, the model is violated and localized errors in the attenuation estimates are introduced. We minimize the influence of these errors with high regularization, however, a better modeling of refraction would be a desirable -though practically very challenging -extension.
Another source of error is the limited recording volume. Samples exceeding this volume introduce the problem of sensibly guessing the outside attenuations the rays pass before entering the recording volume. Boundary effects can lead to solutions with low energies which are qualitatively far away from the optimum, especially when performing many iterations. In our image formation model we assume zero outside attenuations (natural boundary conditions), while for the regularization we assume Neumann boundary conditions. If possible, the recording volume should be increased to contain more background in cases of boundary problems. If this is not possible the TV regularization with its sharp boundaries is to prefer over the TM regularization. Additionally a high weight on the sparsity term alleviates effects that lead to extreme attenuation estimates. This can be the case when outside attenuations are explained by a thin highly absorbing region at the image boundary. An alternative, that leads to visually good, but energetically suboptimal results, is to restrict the number of iterations (less than ten iterations usually lead to qualitatively good results). This has the additional advantage of very low computation times.

Conclusions
We could significantly improve the results of the variational attenuation correction presented in [7] by additionally modeling photo bleaching and replacing the ad-hoc Gaussian noise assumption by the (approximate) Poisson-Gaussian statistics. The choice of the loss function in the smoothness term allows to choose between smoothly varying (TM) or piecewise constant (TV) attenuation fields. The choice of the appropriate regularization is application dependent. In our case both regularization strategies lead to equally plausible results in the rather inhomogeneous biological samples analyzed. TV regularization is more stable in practice because the attenuation is much better localized, and therefore less boundary artifacts -that may lead to convergence to undesired solutions -are introduced. For both regularizations the sparsity term also actively avoids boundary errors, by keeping the attenuation field compact. However, high sparsity weights lead to an underestimation of the attenuation volume and should be avoided. Instead, an earlier manual termination of the iterative process leads to very good results without introducing this side-effect. http://www.biomedcentral.com/1471-2105/14/366 We showed the efficacy of the presented method on highly complex real world examples, where it was able to significantly increase the homogeneity of the measured signal and attenuation fields. This is crucial if the attenuation field is used to correct secondary channels containing sparse structures within the anatomy. Based on these findings, we conclude that the presented attenuation correction approach is an important step towards the quantification of confocal microscopic data. http://www.biomedcentral.com/1471-2105/14/366

Additional file
Additional file 1: Supplementary material.