This section reviews the developed computational core for lesion segmentation. Figure 1 shows its processing steps, each of which is detailed in the following subsections.

### Filtering

We use the Perona-Malik filtering method [20], which smooths noise while preserving significant information, in our case edges. The formal representation of this filtering method is as follows;

$$ g(\nabla(I))=\frac{1}{1+\frac{{\left|\nabla(I)\right|}^{2}}{\gamma^{2}}} $$

(1)

where *g*(∇(*I*)) represents the diffusion coefficient and ∇(*I*) the gradient map of the image *I*. As can be inferred from Eq. 1, ∇(*I*) and *g* are inversely proportional, which maintains the notion of the Perona-Malik method. *γ* is a constant that controls the sensitivity to gradients on the image domain: the diffusion process declines in regions where ∣∇*I*∣≫*γ*. Without smoothing, the initial contour is trapped by noise (weak edges) and cannot delineate the lesion border. After denoising is completed, the SPH kernel is used to overcome active contour leaking problems, especially at the fuzzy borders of skin lesions.
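The behavior of the diffusion coefficient in Eq. 1 can be illustrated with a short sketch; this is a minimal illustration on gradient magnitudes, with γ=20 chosen purely for the example:

```python
import numpy as np

def diffusion_coefficient(grad_mag, gamma=20.0):
    # Perona-Malik diffusion coefficient (Eq. 1): strong gradients
    # (likely edges) get near-zero coefficients, so diffusion halts there,
    # while flat regions (small gradients) are smoothed almost freely.
    grad_mag = np.asarray(grad_mag, dtype=float)
    return 1.0 / (1.0 + (grad_mag / gamma) ** 2)

# Flat region: coefficient near 1; |grad| >> gamma: coefficient near 0.
flat = diffusion_coefficient(1.0)
edge = diffusion_coefficient(200.0)
```

The inverse-proportionality noted above is visible directly: the returned coefficient shrinks quadratically as the gradient magnitude grows past γ.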

### A new local edge extraction method: SPH

SPH is an interpolation method used for numerically solving diffusion equations. It appears in various applications such as simulations of highly deformable bodies, lava flows, and computational fluid dynamics. Its principle is to divide the fluid (medium) into a set of discrete elements called particles. Equation 2 describes this interpolation, where *h* is the smoothing length of the kernel function W, *r* is the position descriptor of the medium entity of interest, and *r*′ ranges over the adjacent entities within *h*. Many kernel functions are defined in the literature [21, 22]. One of them is Monaghan's cubic spline [21], as applied in Eq. 2, which provides the temperature (in its specific case) *A* at position *r* relying on the temperatures of all particles within a radial distance *h*.

$$ A_{t}(r)= \int \mathrm{A}(r^{\prime})\, \mathrm{W}\left( |r-r^{\prime}|,h\right) \mathrm{d}r^{\prime} $$

(2)

In Eq. 2, the contribution of a particle to a physical property is weighted by its proximity to the particle of interest and its density. Commonly used kernel functions deploy cubic spline and Gaussian functions. The cubic spline is exactly zero for particles located at a distance equal to two times the smoothing length, 2*h*. This decreases the computational cost by discarding the particles' minor contributions to the interpolation.

To expand the representation of the SPH kernel in physics, let us take another particle *j* and associate it with a fixed lump-shaped volume Δ*V*_{j}, which leads to determining the computational domain with a finite number of particles. Concerning the mass and density of the particle, the lump volume can be rewritten as the ratio of mass to density, *m*_{j}/*p*_{j} [23]. The mathematical representation is given in the following equation,

$$ \mathrm{A}(r)= \sum\limits_{j} m_{j} \frac{A_{j}}{p_{j}}W\left(\left| r-r_{j} \right|,h\right) $$

(3)

where *A* is any quantity at *r*; *m*_{j} is the mass of particle *j*; *A*_{j} is the value of the quantity *A* for particle *j*; *p*_{j} is the density of particle *j*; *r* is the spatial location; and W is the kernel function. The density of particle *i*, *p*_{i}, can be expressed as in Eq. 4.

$$ {\begin{aligned} p_{i}= p(r_{i})&=\sum\limits_{j} m_{j} \frac{p_{j}}{p_{j}}W\left(\left| r_{i}-r_{j} \right|,h\right)\\&= \sum\limits_{j} m_{j} W\left(\left| r_{i}-r_{j} \right|,h\right) \end{aligned}} $$

(4)

where the summation over *j* covers all particles. Since *m*_{j} is a scalar, the gradient of a quantity can be found easily by applying the derivative ∇ to the kernel, as seen in Eq. 5.

$$ \nabla \mathrm{A} (r)= \sum\limits_{j} m_{j} \frac{A_{j}}{p_{j}} \nabla W\left(\left| r-r_{j} \right|,h\right) $$

(5)
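The summation form of Eq. 3 can be made concrete with a short 1-D sketch. The Gaussian kernel, particle layout, and carried quantity below are all illustrative choices, not the paper's default kernel:

```python
import numpy as np

def gaussian_kernel(dist, h):
    # Illustrative 1-D Gaussian smoothing kernel, normalized to unit integral.
    return np.exp(-(dist / h) ** 2) / (h * np.sqrt(np.pi))

def sph_interpolate(r, positions, values, masses, densities, h):
    # SPH estimate of a quantity A at position r (Eq. 3):
    #   A(r) = sum_j m_j * (A_j / p_j) * W(|r - r_j|, h)
    w = gaussian_kernel(np.abs(r - positions), h)
    return float(np.sum(masses * values / densities * w))

# Particles on a regular 1-D grid carrying a constant quantity A_j = 3.0;
# the SPH sum reproduces roughly that value away from the boundaries.
xs = np.linspace(-5.0, 5.0, 101)   # particle positions, spacing 0.1
m = np.ones_like(xs)               # unit masses
rho = m / 0.1                      # density = mass / particle spacing
estimate = sph_interpolate(0.0, xs, np.full_like(xs, 3.0), m, rho, h=0.5)
```

Because m_j/p_j plays the role of the lump volume ΔV_j, the sum is effectively a Riemann approximation of the integral in Eq. 2.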

### Kernel approximation

A feasible kernel must have the following two properties,

$$ \int\limits_{\Omega}\mathrm{W}(r,h)\, \mathrm{d}r = 1 $$

(6)

and

$$ {\lim}_{h \to 0} \mathrm{W} (r,h)=\delta(r) $$

(7)

where *δ* is the Dirac Delta function.

$$ \delta(r)= \left\{\begin{array}{ll} \infty, & \,\, \text{if}\ r=0 \\ 0, & \,\, \text{otherwise} \end{array}\right. $$

(8)

The kernel must also be an even function and nonnegative everywhere [23]. These conditions are expressed formally as in the following;

$$ \mathrm{W(r,h)} \geq 0 \hspace{1 cm}\text{and}\hspace{1 cm} \mathrm{W(r,h)}=\mathrm{W(-r,h)} $$

(9)

Several kernels have been proposed for SPH, including Gaussian, B-spline, and Q-spline [21, 22, 24]. Even though Q-spline is considered the best in [24] in terms of accuracy, it is computationally expensive due to its square root computations. We propose to use the 6^{th}-degree polynomial kernel suggested by [24] as the default kernel, which is expressed below,

$$ W_{default}(r,h)=\frac{315}{64\pi h^{9}}{\left(h^{2}- \left| r \right|^{2}\right)^{3}} $$

(10)

with the gradient,

$$ \nabla W_{default}(r,h)=-\frac{945}{32\pi h^{9}}\, r \left(h^{2}- \left| r \right|^{2}\right)^{2} $$

(11)
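The default kernel of Eq. 10 and its derivative (Eq. 11) can be sketched as follows; the compact support (zero for |r| ≥ h) is the usual convention for this kernel and is assumed here:

```python
import numpy as np

def w_poly6(r, h):
    # 6th-degree polynomial kernel of Eq. 10, with compact support:
    # the kernel vanishes for |r| >= h.
    r2 = np.square(r)
    w = 315.0 / (64.0 * np.pi * h ** 9) * (h ** 2 - r2) ** 3
    return np.where(r2 < h ** 2, w, 0.0)

def grad_w_poly6(r, h):
    # Radial derivative of the kernel (Eq. 11): zero at the center,
    # negative (pointing inward) for 0 < r < h.
    g = -945.0 / (32.0 * np.pi * h ** 9) * r * (h ** 2 - np.square(r)) ** 2
    return np.where(np.square(r) < h ** 2, g, 0.0)
```

The compact support is what keeps the interpolation cheap: particles farther than h contribute exactly zero and can be skipped.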

Once SPH is applied to dermoscopy images, it generates all local edge features. Figure 2 illustrates edge features derived by SPH on a dermoscopy image. In our experiments, we empirically selected *h*=1 and the 6^{th}-degree kernel for interpolation. The obtained SPH map is used as the edge indicator function in lesion border segmentation. The formal representation of the edge indicator function is given in Eq. 12,

$$ g=\frac{1}{1+\mid \nabla (G_{\alpha}*I) \mid^{p}}, p=1,2 $$

(12)

where *I* is the image, *G*_{α} is a denoising function, and ∣∇(*G*_{α}∗*I*)∣^{p} is the edge map produced by image gradients. In this paper, we instead used SPH formulations that calculate surface normals rather than image gradients. Edge indicator functions are commonly represented by *g*, as shown in Eq. 12. The next subsection reviews the mathematical pipeline that robustly minimizes the obtained *g* function.
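The generic gradient-based form of Eq. 12 (the baseline that SPH surface normals replace) can be sketched directly. The Gaussian blur implementation and the synthetic step image are illustrative stand-ins for G_α and a dermoscopy image:

```python
import numpy as np

def _gauss1d(sigma):
    # Discrete 1-D Gaussian used here as a stand-in for the denoising G_alpha.
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def edge_indicator(image, sigma=1.0, p=2):
    # Edge indicator g of Eq. 12: near 0 on strong edges, near 1 in
    # flat regions, computed from the smoothed image's gradient magnitude.
    k = _gauss1d(sigma)
    smooth = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0,
                                 image.astype(float))
    smooth = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1,
                                 smooth)
    gy, gx = np.gradient(smooth)
    return 1.0 / (1.0 + np.hypot(gx, gy) ** p)

# Vertical step edge: g collapses along the step, stays near 1 elsewhere.
img = np.zeros((32, 32))
img[:, 16:] = 255.0
g = edge_indicator(img)
```

In the paper's pipeline the gradient magnitude term is swapped for the SPH-derived edge map, but the 1/(1+·) shape of g is the same.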

### Probability map for stronger edges

#### Probability map

To address the drawback seen in traditional ESFs in edge-based AC, this study introduces a computational pipeline based on constructing a robust ESF that utilizes probability scores (between 0 and 1) rather than the predicted class labels provided by a classifier as in [18]. Probability scores indicate whether a pixel is a foreground or background pixel. These scores are computed in *O*(*n*), where *n* is the number of pixels, whereas Pratondo et al. [18] use fuzzy KNN or SVM classifiers to predict whether a pixel is foreground or background in *O*(*n*^{2}). Classifier scores (between 0 and 1) at boundary pixels tend to be close to zero. So far, a considerable amount of work has been done to have the ESF collaborate with the likelihood of pixels (whether a pixel belongs to background, foreground, or edge) to avoid contour leakages through the border. Pratondo et al. [18] extended the methods of [25, 26], which rely only on class probability via Bayes' rule, by utilizing the probability scores from fuzzy KNN and SVM classifiers.

We adopted the image segmentation approach studied in [27], which combines pixels' Gaussian probability distribution (in terms of being a foreground or background pixel) with their geodesic distances to patches selected on the foreground and background. Even though this method fails on dermoscopy images displaying lesions with weak or fuzzy edges, it provides reliable results for mapping the probability that pixels are background or foreground. We exploit this feature of [27] by minimizing the probability matrix where lesion edges are located. Then, we multiply the minimized matrix with the edge indicator function generated by SPH to obtain a more robust edge indicator function for segmentation. In our case, the object (foreground) is the skin lesion and the background is healthy tissue. Figure 3 compares segmentation results of methods that use conventional gradients [28], the approach proposed in [18], and our approach to form the edge indicator function, respectively.

The first step of probability map generation is to select regions (boxes) from the foreground and background. Boxes (patches) of (average) size 70×90 collect pixel samples from the foreground (lesion) and background (healthy tissue) to create the color models. A pixel on an image takes a value from 0 to 255 in any channel. In the probability computation, each of these values is assigned a probability between 0 and 1, and the sum of these probabilities for each pixel should be 1. The formal representation is shown in Eq. 13,

$$ 0 \leq p(I(x,y)=k) \leq 1, \quad k \in \{0,\ldots,255\} $$

(13)

where *p* represents the probability that the pixel has an intensity of *k*. Hence, all of these can be expressed by a sum as in Eq. 14.

$$ \sum\limits_{k} p(I(x,y)=k)=1 $$

(14)

To perform background subtraction, let us label the boxes *l*_{1} and *l*_{2}, where *l*_{1} is from the background *Ω*_{1} and *l*_{2} is from the foreground *Ω*_{2} of the image *I*. We can approximate the probability density function (PDF) using the Gaussian fitting function shown in Eq. 15,

$$ p(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}} $$

(15)

where *μ* and *σ* represent the mean and standard deviation, respectively, estimated on the histograms of the data stored in *l*_{1} and *l*_{2}. Figure 4 shows the histograms of the background and foreground patches obtained from the image displayed in Fig. 5. The likelihood (in terms of being foreground or background) of a pixel *x* on channel *C*_{i} is given in Eq. 16,
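Fitting the Gaussian of Eq. 15 amounts to estimating μ and σ from a sampled patch and evaluating the resulting PDF. In this sketch the patch is synthetic, with made-up statistics (mean 120, std 15) standing in for a real lesion box:

```python
import numpy as np

def fit_gaussian_pdf(patch):
    # Estimate mu and sigma from the patch samples and return the fitted
    # Gaussian PDF of Eq. 15 along with the estimates.
    mu, sigma = float(np.mean(patch)), float(np.std(patch))
    def pdf(x):
        return (np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
                / (np.sqrt(2.0 * np.pi) * sigma))
    return pdf, mu, sigma

# Synthetic 70x90 "lesion" patch; real pipelines would crop this from
# the foreground box on the dermoscopy image.
rng = np.random.default_rng(0)
patch = rng.normal(120.0, 15.0, size=(70, 90))
pdf, mu, sigma = fit_gaussian_pdf(patch)
```

One such PDF is fitted per label (background and foreground) and per channel before the likelihood ratio of Eq. 16 is formed.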

$$ P^{i}_{1|2}(x)= \frac{p^{i}_{1}\left(C_{i}(x)\right)}{p^{i}_{1}\left(C_{i}(x)\right)+p^{i}_{2}\left(C_{i}(x)\right)} $$

(16)

where *p*^{i}_{j} represents the PDF of *Ω*_{j} on channel *C*_{i}, *i* is the channel number, and in our case *j*=1 and 2 since we have only two labels (foreground and background). Additionally, a weight can be assigned to each channel; then the probability of a pixel *x* being assigned to *l*_{1} can be computed as in Eq. 17.

$$ P_{r}\left(x \in l_{1}\right)=\sum\limits_{i=1}^{N_{c}} w^{i}P^{i}_{1|2}(x) $$

(17)

where *w*^{i} represents the weights, which impose each channel's (*i*∈{1,…,*N*_{c}}) capacity in terms of abstracting the foreground from the background, and *N*_{c} represents the number of channels.
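Equations 16 and 17 reduce to a per-channel ratio followed by a weighted sum; a minimal sketch, assuming the per-channel PDF values have already been evaluated at the pixel's intensity:

```python
def channel_likelihood(p_fg, p_bg):
    # Eq. 16: likelihood of label l1 on one channel, p1 / (p1 + p2).
    return p_fg / (p_fg + p_bg)

def weighted_likelihood(p_fg_channels, p_bg_channels, weights):
    # Eq. 17: channel-weighted combination; weights should sum to 1.
    return sum(w * channel_likelihood(f, b)
               for w, f, b in zip(weights, p_fg_channels, p_bg_channels))

# Two illustrative channels: the first discriminates strongly, the second
# weakly; equal weights combine them into one foreground probability.
p_fg = weighted_likelihood((0.9, 0.6), (0.1, 0.4), (0.5, 0.5))
```

The per-channel weights let a discriminative channel (e.g. one where lesion and skin tones separate well) dominate the combined score.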

However, image segmentation that merely relies on the PDF is prone to fail. As seen in Fig. 5, pixel *X*, which is located inside the object of interest, has intensity features similar to the background. To address this problem, [27] combined the PDF distribution with the geodesic distances of each pixel to these boxes. The following subsection reviews the geodesic distance concept offered by [27].

### Geodesic distances

Geodesic distances are weighted distances. To illustrate, assume that traveling from city A to city B takes two hours, and the distance between A and B is 100 km, whereas traveling from city A to city C takes four hours while the distance between A and C is only 50 km. Since C is a city in another country, traveling from A to C requires more effort. Hence, the weight of passing a country border increased the travel time from city A to city C.

Likewise, the weighted geodesic distance of each pixel to the background and foreground boxes can be computed by Eq. 18, where *W* is the weight, *s*_{1} and *s*_{2} represent the boxes, and *d* represents all possible distances from pixel *X* to the background and foreground boxes (see Fig. 4). If the weight *W* is high, then the distance *d* will be high.

$$\begin{array}{*{20}l} d(s_{1},s_{2}) = \underset{C_{s_{1},s_{2}}}{min} \int_{C_{s_{1},s_{2}}}{W}\,\mathrm{d}s \end{array} $$

(18)

Here, pixel assignment to background or foreground is performed by comparing the minimum *distance to the background* and the minimum *distance to the foreground*. Let us select a pixel *X*: if this pixel's minimum *distance to the background* is less than its minimum *distance to the foreground*, then the pixel is assigned to the background, and vice versa.

Geodesic image segmentation selects the gradient of the PDF, ∇*P*_{F,B}(*x*), as the weight *W*. That means the spatial connectivity between observed pixels and box pixels is constrained by the change in probability (see Fig. 4), as given in Eqs. 19 and 20.

$$\begin{array}{*{20}l} W = \left|\nabla P_{F,B}(x)\cdot\overset{\rightarrow}{C^{\prime}}_{s_{1},s_{2}}(x) \right| \end{array} $$

(19)

$$\begin{array}{*{20}l} D_{l}(x) = \underset{s\in\Omega_{l}}{min}\hspace{0.1 cm} d(s,x),\quad l \in\{F,B\} \end{array} $$

(20)

For instance, if ∇*P*_{F,B}(*x*) applies more weight in Eq. 19, meaning more probability change along the path, the distance increases and the possibility of being foreground decreases. Consequently, pixel labeling is conducted by comparing the minima of the distances *D*_{F}(*x*) (foreground) and *D*_{B}(*x*) (background), represented in Eq. 20. All of these computations toward generating the probability map are performed in linear time. Interested readers are referred to [27] for more details.
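The distance map of Eq. 20 can be approximated on a discrete grid with Dijkstra's algorithm, where stepping onto a pixel costs its weight from Eq. 19. Note this generic sketch is not the linear-time algorithm of [27]; it only illustrates the weighted-distance idea:

```python
import heapq
import numpy as np

def geodesic_distance(weight, seeds):
    # Dijkstra on a 4-connected pixel grid: stepping onto a pixel costs its
    # weight (|grad P| in Eq. 19), so paths crossing large probability
    # changes accumulate distance (Eq. 20). `seeds` are the box pixels.
    h, w = weight.shape
    dist = np.full((h, w), np.inf)
    heap = []
    for i, j in seeds:
        dist[i, j] = 0.0
        heapq.heappush(heap, (0.0, i, j))
    while heap:
        d, i, j = heapq.heappop(heap)
        if d > dist[i, j]:
            continue  # stale heap entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and d + weight[ni, nj] < dist[ni, nj]:
                dist[ni, nj] = d + weight[ni, nj]
                heapq.heappush(heap, (dist[ni, nj], ni, nj))
    return dist

# Uniform weights reduce to plain 4-connected path length from the seed.
d_map = geodesic_distance(np.ones((3, 3)), [(0, 0)])
```

Labeling then compares two such maps, one seeded from the foreground box and one from the background box, and assigns each pixel to the nearer label.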

Note that the resulting probability maps are used to further strengthen the edge indicator function using the formulations given in [18]. Therefore, the edge indicator function becomes more robust for active contour guided image segmentation. The next subsection reviews how we minimize the obtained pixel probability matrix.

#### Minimizing probability matrix

The pixel probability matrix suggests that probability values close to 0 and 1 represent background and foreground, respectively, whereas values close to 0.5 represent edges, according to [18]. Pratondo et al. [18] also try to build a more robust edge indicator function to avoid leaks. They applied three different formulas to the probability matrix to minimize the values that represent edges. Finally, they multiplied the minimized probability matrix by their edge indicator function generated from image gradients. To minimize the probability matrix at the edge pixels, we adopt the formula expressed in Eq. 21 from [18] and replace 0.5 with 0.7 based on our experiments on dermoscopy images.

$$ prob(s)=2(s-0.7)^{2} $$

(21)

where *s* represents the probability matrix for the foreground. The new edge indicator function, *g*_{new}, can be obtained as in Eq. 22,

$$ g_{new}=g*prob $$

(22)

Equations 21 and 22 help us minimize the edge indicator function even at poorly defined object boundaries for active contour guided segmentation. Therefore, contour evolution terminates at the desired boundaries.
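Equations 21 and 22 are pointwise operations on the probability matrix; a direct sketch:

```python
import numpy as np

def prob_minimizer(s, edge_value=0.7):
    # Eq. 21: 2*(s - 0.7)^2, near zero where the foreground probability
    # is close to the empirically chosen edge value of 0.7.
    return 2.0 * (np.asarray(s, dtype=float) - edge_value) ** 2

def robust_edge_indicator(g, s):
    # Eq. 22: elementwise product g_new = g * prob.
    return g * prob_minimizer(s)

# Three illustrative pixels: edge-like (s ~ 0.7), background-like (s ~ 0),
# foreground-like (s ~ 1); the edge-like pixel drives g_new to zero.
g = np.ones(3)
s = np.array([0.7, 0.0, 1.0])
g_new = robust_edge_indicator(g, s)
```

Because prob(s) vanishes exactly where the probability map flags an edge, the product g_new stays small there even when the SPH edge map alone is weak, which is what stops the contour from leaking.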

Figure 6 illustrates the obtained edge map, a matrix of the image size that will be passed to the energy function in the active contour configuration. Note that Fig. 6e and f show more robust edges compared to Fig. 6c and d; Fig. 6e and f are created using the geodesic probability map, while Fig. 6b is the conventional edge map that relies solely on image gradients. The enhancement provided by the geodesic probability map can be seen qualitatively by the naked eye; we also proved it quantitatively using 100 images and display the results in the results section. Moreover, training the KNN and SVM for the results displayed in Fig. 6c and d is computationally heavy, while in our case the geodesic probability map is obtained in linear time. The average time to generate the new edge indicator function for the used data is also discussed in the results section. Note that we continued segmentation after obtaining the outcome shown in Fig. 6e and f: it is not a binary image that can be used as a segmentation result, and it still needs to be processed to abstract the lesion from the background. To address this challenge, we solved the segmentation function (in the spirit of active contours) using level sets. The next subsection reviews level set evolutions.