Segmental HOG: new descriptor for glomerulus detection in kidney microscopy image

Background The detection of the glomeruli is a key step in the histopathological evaluation of microscopic images of the kidneys. However, the task of automatic detection of the glomeruli poses challenges owing to the differences in their sizes and shapes in renal sections as well as the extensive variations in their intensities due to heterogeneity in immunohistochemistry staining. Although the rectangular histogram of oriented gradients (Rectangular HOG) is a widely recognized powerful descriptor for general object detection, it shows many false positives owing to the aforementioned difficulties in the context of glomeruli detection. Results A new descriptor referred to as Segmental HOG was developed to perform a comprehensive detection of hundreds of glomeruli in images of whole kidney sections. The new descriptor possesses flexible blocks that can be adaptively fitted to input images in order to acquire robustness for the detection of the glomeruli. Moreover, the novel segmentation technique employed herewith generates high-quality segmentation outputs, and the algorithm is assured to converge to an optimal solution. Consequently, experiments using real-world image data revealed that Segmental HOG achieved significant improvements in detection performance compared to Rectangular HOG. Conclusion The proposed descriptor for glomeruli detection presents promising results, and it is expected to be useful in pathological evaluation.


Background
The glomeruli in the kidneys act as a filtration barrier that retains higher molecular weight proteins in blood circulation. In various renal diseases, the glomerular filtration barrier can be damaged, resulting in protein leakage into urine, known as proteinuria. Therefore, the pathological changes in renal glomeruli of animal disease models can *Correspondence: katotsu@cs.gunma-u.ac.jp 1 Faculty of Science and Engineering, Gunma University, Kiryu-shi, Gunma 376-8515, Japan 2 Center for Informational Biology, Ochanomizu University, Bunkyo-ku, Tokyo 112-8610, Japan Full list of author information is available at the end of the article provide important information in screening compounds that target such diseases.
Our goal was to perform high-throughput detection of the glomeruli in highly enlarged microscopy images of animal disease models, whose sizes run up to the order of 10 8 pixels. Although there are existing studies about the automatic analysis of the glomeruli in microscopic images of the kidneys [1,2], the target images in these studies were obtained from human biopsy samples with relatively small sizes; therefore, they are not suitable for our purpose.
Compared to general object detection tasks, there are two particular obstacles in the case of glomeruli detection. The first obstacle arises from the non-rigid sizes and shapes of the targets in the images. Indeed, the glomeruli have a fixed size in vivo, although they swell to some extent in unfavorable situations such as hypertension [3] and diabetes [4]. In addition, the sizes of the glomeruli in a whole-kidney-section image could vary depending on where the cross-section was taken. The shape of the glomeruli is mostly spherical, making the boundaries circular. To obtain the boundaries, one might try to fit an ellipse to each glomerulus. However, this approach yields large estimation errors because each glomerulus is deformed to some extent.
The second difficulty arising in the glomerulus detection task is the high variation in staining intensity. On histological evaluation, immunohistochemistry is usually used to demonstrate the distribution and location of proteins in tissue sections. In our target images, sections were immunostained for desmin, a known glomerular injury marker. Therefore, some glomeruli are stained and some are not. As many glomeruli are partly stained, resulting in heterogeneously stained glomeruli, detection is more complicated. Furthermore, the stained tissues in the kidneys are not only from the glomeruli but also from other tissues such as the blood vessels.
To check the existence of a glomerulus at each location in a whole-kidney-section image, the sliding window technique is employed [5][6][7][8]. Using this procedure, a frame goes over the input image to check for the target object at every possible location; then, a descriptor of the sub-image is extracted.
Rectangular HOG (R-HOG) [9], a widely used and recognized efficient descriptor for object detection in the field of computer vision, is a potentially suitable candidate descriptor for glomeruli. It has the capacity to capture information about the magnitude of the gradients in the image. Therefore, this descriptor is robust to the change in intensities caused by the heterogeneity of the stained levels. Glomeruli are known to be composed of tightly packed cells, resulting in high gradients on images. Thus, a natural approach would be to use the magnitude of these gradients as features of the glomeruli. However, although we have previously attempted to directly exploit this attribute, we found the detection performance to be poor, resulting in many false positives and low recall. In addition to the magnitude of the gradients, their directions are also important to distinguish the glomeruli from the other tissues. Using R-HOG descriptors obtained from both the magnitude and the direction of the gradients, glomeruli detection performance results in recall values high enough to be useful for pathological evaluation. However, it appeared that R-HOG still has a considerable amount of false positives [10][11][12].
The high number of false positives in previous studies [10][11][12] can be ascribed to the condition that the standard HOG such as the R-HOG has a rigid block division.
Owing to this rigidity, there are instances when a block is inside the glomerular area in a case and outside in another. Thus, extracted features from each block contain large differences, and robustness for the deformations of glomeruli is lost. Although there are several other known local descriptors such as scale-invariant feature transform (SIFT) image features [13], Haar-like features [8], and local binary patterns (LBP) [14], these do not possess a solution to be robust for deformed glomeruli for similar reasons.
In this study, we introduce flexible block division to the HOG descriptor to improve the detection performance and to reduce the number of false positives. A new feature, which we refer to as the Segmental HOG (S-HOG) descriptor, has been proposed for glomerulus detection. The block division of S-HOG is based on the estimated boundary of the glomerulus that is obtained via a segmentation algorithm, which has also been developed in this study. This renders the division of blocks to be more adaptable than the rigid block division of R-HOG, and allows feature vectors to clearly differentiate between the inside and the outside of the glomerulus. Moreover, because blocks are always within the glomerular area, gradient information in the same block between two glomeruli is expected to be more similar. Experiments revealed that the number of false positives was halved, keeping almost all true positives when using S-HOG compared to the R-HOG.

Related works
Segmentation is an important step to extract S-HOG descriptors. Recent studies on segmentation of the glomeruli are few [1,2]. Nevertheless, there has been some research regarding segmentation of specific organs in general biomedical images, including region growing [15], level set method [16], and active contour model [17][18][19]. Most of these are semi-automatic, require the users' intervention, possess no guarantee of optimality [17][18][19], and are highly dependent on the initial solution provided by users as input. On the other hand, the segmentation algorithm developed in this study is ensured, theoretically, to obtain the optimal solution, producing high quality segmentation. In addition to the above-mentioned methods, more recent attempts include using deep learning [20]. Deep learning typically requires great computational and time resources, whereas the proposed algorithm can work even on a standard personal computer or a laptop.
The algorithm developed by Kvarnström et al. [21] is relevant to the proposed segmentation technique. Their algorithm for cell contour recognition is based on a dynamic program, where they first estimated the cell centers and constructed a ray from the center to each m direction (Fig. 1b), where m = 32. Then, they computed the boundary likeliness at n points on each ray, where they set n = 30. Their algorithm finds a smooth contour by taking a point on each ray to connect them. To this end, they posed the following discrete optimization problem: where L i : {1, . . . , n} → R denotes the boundary likeliness function obtained by the sliding window technique, and p i ∈ {1, . . . , n} (i = 1, . . . , m) is a location on the i-th line segment, where the line segment is discretized into n points numbered with a natural number. For instance, when p i = 1, the i-th vertex is at the endpoint of the i-th line segment closest to the center, and the vertex can move from this endpoint to the other endpoint with increasing values of p i . L i (p) is the boundary likeliness at the p-th location in the i-th line segment. The location p i on the i-th line segment is more likely to be the boundary with a larger L i (p i ) value.
To obtain an optimal solution, Kvarnström et al. presented two algorithms. The first algorithm poses n subproblems where in each sub-problem, the initial point, and the endpoint are the same. We shall refer to this algorithm here as the exhaustive dynamic program (EDP). Their second algorithm is a heuristic method that is faster than the first one, but possesses no guarantee for global optimality. In this study, we developed a new segmentation algorithm, referred to as divide & conquer dynamic program (DCDP). Compared to Kvarnström et al's algorithms [21], the DCDP algorithm has two advantages, as follows: not only is DCDP much faster than EDP, an exact optimal solution is always obtained; and the boundary likeliness function is trained with a machine learning technique to precisely estimate boundaries of the glomeruli.
One may consider another approach to the optimization problem (1), with a perspective that the problem is a formulation of finding a maximum a posteriori (MAP) estimation on a Markov random field (MRF) [22]. MRFs are a class of probabilistic models formulated on a graph. In the case of a problem (1), the graph has m nodes and forms a cycle. It is well known that the MAP estimation is efficient while using the Viterbi algorithm if the graph of the MRF is without cycles [23]. For graphs with cycles, many attempts such as LP relaxations [24] and max-product algorithms [25,26] have been tried to compute approximate MAP estimations. Although these algorithms possess no guarantee to obtain an exact MAP estimation, they may perform well in practice. In this study, we empirically show that DCDP is much faster than these algorithms for glomeruli detection.

Contributions
The contributions of this study are summarized as follows: • A new descriptor called S-HOG was developed to perform a comprehensive detection of hundreds of glomeruli in images of whole kidney sections. The new descriptor is equipped with flexible blocks that can be adaptively fitted to input images to acquire robustness for detection of glomeruli. • In our experiments, the S-HOG descriptor halved the number of false positives, a limitation of the existing state-of-the-art descriptor R-HOG, while keeping almost all true positives.
• A new segmentation technique referred to as DCDP offered high-quality segmentation outputs that were used for the construction of the blocks in S-HOG. The worst computational time of the new algorithm is equal to the fastest existing segmentation algorithm, and our experimental results reveal that the new algorithm performs overwhelmingly faster in practice.

Methods
In this study, a new descriptor, S-HOG, has been proposed for the detection of the glomeruli in kidney microscopic images. Segmentation of the glomeruli is needed to extract the S-HOG descriptor. For rapid detection of the glomeruli in highly enlarged microscopic images, prescreening is performed with R-HOG, which does not require prior segmentation. The proposed method consists of the following three stages ( Fig. 2): • the pre-screening stage, • the segmentation stage, and • the classification stage.
In each stage, a support vector machine (SVM) [27,28] is used with a different type of HOG descriptor, resulting in three SVMs in total. To obtain the S-HOG descriptor, we performed segmentation of the glomeruli from the subimages that passed the pre-screening (Fig. 2).
In what follows, we present the details of each stage, and then discuss how training datasets for each SVM are constructed and the materials used in the experiments. Finally, this section concludes with presenting a new segmentation algorithm, DCDP that is used for determining the blocks of S-HOG descriptors.

Pre-screening
In the pre-screening stage, candidate glomeruli are detected from a kidney microscopic image using the sliding window technique. The window size is set to 200×200 in our experiments. R-HOG features, which are 512dimensional vectors based on our selected parameter values, are extracted and judged by SVM, and non-maximal suppression is then performed to obtain the candidate glomeruli. This stage is exactly the same as in the method developed in our previous studies [10,11]. However, the subsequent two stages dramatically reduce the false positives detected by the method. Our experiment outputs, described in the 'Results and discussion' section, confirm that the non-maximal suppression successfully puts the center of the window in the glomerulus, which is crucial in the segmentation step.

Segmentation
Segmentation of the glomeruli is performed on subimages that passed the pre-screening. In the segmentation algorithm, the boundary of a glomerulus is represented by an m-sided polygon whose m vertices are restricted to lie on m line segments, respectively. The m line segments are placed uniformly, as outlined by the dotted lines in Fig. 1b where m = 36. To determine the location of the vertex on each line segment, the sliding window technique is employed again. 1 The window sweeps through the line segment and computes the boundary likeliness at consider naïvely locating the points that achieve the highest boundary likeliness on each line segment. However, this approach often yields an extremely zigzag boundary (Fig. 3a).
To obtain a smoother boundary, Kvarnström et al. [21] imposed a constraint that suppresses distant adjacent vertices, and they established the maximization problem (1). Although our implementation of the boundary likeliness is different, the formulation of the problem to find an m-sided polygon is the same as (1), where in our experiments, ς is set to ς = 1. Fig. 3 shows an example of the solution to the optimization problem. The m-sided polygon shown in Fig. 3b is the optimal solution to the maximization problem in (1). Compared to the solution without the constraint (Fig. 3a), it is apparent that the estimated boundary is formfitting to the true boundary by introducing the constraint. The details of the new algorithm for finding the optimal solution to (1) developed in this study is presented at the end of this section.

Computation of boundary likeliness L i (·)
The sliding window technique is employed in order to determine the vertices of the m-sided polygon described above. The window size in this stage is set to 30 × 15 pixels, and the windows sweep through the line segments (Fig. 1b). Each time the sliding window moves, a feature descriptor is computed from the window and is applied with linear SVM to compute the SVM score, which is what we refer to as the boundary likeliness L i (·) (Fig. 3). The SVM scores of m vertices, L i (1), . . . , L i (n), are then obtained for i = 1, . . . , m, and integrated in the maximization problem (1).
The HOG feature is adopted as the descriptor to compute the boundary likeliness. Each window is divided into three blocks, as shown in Fig. 1b. This division design is from an observation that some glomeruli are surrounded with a thick Bowman's capsule, and that the middle block is expected to capture this glomerular capsule. The statistics of nine discretely oriented gradients are computed in each block, producing a 27-dimensional feature vector.

Classification with the S-HOG descriptor
Candidate glomeruli obtained via pre-screening are classified using the proposed S-HOG descriptor. S-HOG exploits the glomerulus boundary located in the segmentation stage to generate 24 non-overlapping blocks, as shown in Fig. 4b.
Various types of glomeruli are observed on kidney microscopic images; some of them are surrounded by a thick Bowman's capsule. To effectively exploit this characteristic, the circle containing a candidate glomerulus is divided into the following three zones: the inner zone, middle zone, and outer zone. We divide the circle into eight disjoint sectors, and take the intersection of each zone and each sector to get 24 non-overlapping blocks (Fig. 4b), and gradients are then histogrammed for each block (Fig. 4d). In our experiments, we employed nine discretized oriented gradients, and SVM was applied to S-HOG feature vectors to discriminate between glomeruli and other regions.

Construction of training data
A total of three linear SVMs are used, one each for the pre-screening, segmentation, and classification stage, respectively. A training dataset is required for each of the three SVMs. Details on the construction of each training data set are given below. Training data for the classification stage Examples used in the training data for pre-screening are used again for training in the classification stage, but with a different set of features extracted via S-HOG. For each training data sample, the previously described segmentation algorithm estimates the boundary of the glomerulus. Based on the estimated boundary, the statistics of oriented gradients are computed to obtain S-HOG feature vectors. This procedure is done for both positive and negative examples, even though negative examples do not contain a glomerulus.

Materials
The images used in the present study had been generated in a previous study [10], and only an overview is given in this subsection. Twenty male 6-week-old SD and SDT rats were purchased from CLEA, Inc. (Tokyo, Japan) and were housed with a 12-h light-dark cycle and free access to water and chow.
At 16 or 24 weeks of age, SD and SDT rats (The number of rats is five for each group) were euthanized under ether anesthesia. Their kidneys were removed and immediately fixed in 10 % neutralized buffered formalin. The formalin-fixed kidneys were embedded in paraffin. For immunohistochemistry, kidney paraffin sections were deparaffinized and incubated overnight at 4°C with antidesmin mouse monoclonal antibody (Dako, Glostrup, Denmark) followed by horseradish peroxidase-conjugated secondary antibody (anti-mouse immunoglobulin goat polyclonal antibody; Nichirei, Tokyo, Japan). The sections were stained brown with 3,3'-diaminobenzidine. Whole slide images of the sections were obtained with Aperio Scan Scope XT (Leica Microsystems, Wetzlar, Germany). All animal experiments were performed in accordance with the Act on Welfare and Management of Animals and the institutional guidelines, and approved by the institutional Committee of Animal Experiments of New Drug Development Research Center Inc. (Hokkaido, Japan).
A total of 20 whole-kidney-section images were used in our experiments. The image sizes were 9, 849 × 10, 944 pixels in average. Each image was from one of four groups: 16-week-old SD rats, 16-week-old SDT rats, 24-weekold SD rats, and 24-week-old SDT rats. Henceforth, for simplicity, we will refer to them as 16SD, 16SDT, 24SD, 24SDT, respectively, each group containing five images. For performance evaluation, we manually annotated every glomerulus in the images. We divided the image set into five subsets: Set A, Set B, Set C, Set D, and Set E. Each subset consists of a 16SD image, a 16SDT image, a 24SD image, and a 24SDT image. For assessment of detection performance, the position of every glomerulus in the images is annotated, and for evaluation of segmentation performance, the areas of the glomeruli in Set A and Set B are located manually using a graphics software.

Divide & conquer dynamic program (DCDP)
Herein To express this idea mathematically, let us define Dynamic program (DP) cannot solve this problem directly owing to the existence of the constraint |p m − p 1 | ≤ ς. To use DP, we consider finding the maximizer of J(p) from a relaxed region S L (N n ), where S L (I) is defined as where the operator + denotes that for any two sets I and J , I The strategy of DCDP is to first find the solution of the relaxed problem, and then check the feasibility: if p 0 ∈ S(N n ), then p 0 is the optimal solution of the original problem (1). If p 0 ∈ S(N n ), then the set N n is divided into I 1 and I 2 (i.e. I 1 ∪ I 2 = N n ), and the following two sub-problems are solved: and Notice that the original feasible region, S(N n ), is the sum of the two regions, S(I 1 ) and S(I 2 ). Therefore, we can take either of the two solutions, p 1 and p 2 , whichever has the larger objective value. DCDP employs a divide and conquer approach that repeatedly applies the above strategy to sub-problems. The basic approach of DCDP is summarized in Algorithm 1. Invoking the function DCDP_Basic(N n ) yields the optimal solution for the original problem. Here, the function (I 1 , I 2 ) := Split(I 0 ) divides the set I 0 into two exclusive non-empty subsets, I 1 and I 2 . The first step p 0 ∈ argmax p∈S L (I 0 ) J(p) can be performed in O(nmς) computational time. An instance of the dynamic program is given in Algorithm 2. Note that p 0 ∈ S(I 0 ) is always ensured if the cardinality of I 0 is one, because the relaxed region is reduced to the unrelaxed region (i.e. S({h}) = S L ({h})). The function DCDP_Basic is invoked, at most, (2n − 1) times. This implies that the computational time in worst cases is O(n 2 mς). As will be shown in the 'Results and discussion' section, we empirically found that the number of invoking the function recursively is much smaller than (2n − 1).

Algorithm 2 O(nmς) Dynamic program for max p ∈S L (I) J(p)
Input: I ⊆ N n . Output: p ∈ argmax p∈S L (I) J(p) 1: Initialize all entries in the n × m matrix Q with −∞. 2: for j ∈ I+[ −ς, +ς] ∩N n do 3: Q(j, 1) := L 1 (j); 4: end for 5: for t = 2, . . . , m do 6: if t < m, then I t := N n else I t := I; 7: for i ∈ I t do 8: In the text below, the pruning steps and the resulting accelerated DCDP algorithm are detailed. Mathematical proof that the DCDP algorithm is guaranteed to obtain an optimal solution is also given. This property is favorable compared to MAP estimation methods-such as LP relaxation-that does not always achieve an optimal solution. For the implementation of Split(I 0 ), we considered the following three schemes: Half Split, Max Split, and Adap Split. These splitting schemes are described at the end of this section.

Pruning
Pruning can accelerate the DCDP algorithm. Consider the case where the lower boundary is , such that Based on this fact, the pruning step is added to obtain Algorithm 3, and we have the following lemma: The following relationships will be used in this proof: where the labels (eqA), (ineqA), and (eqB) are used to distinguish these equalities and the inequality in later descriptions of this proof. The inequality follows from the fact that S(I 0 ) ⊆ S L (I 0 ), while the second equality eqB follows from S(I 0 ) = S(I 1 ) ∪ S(I 2 ).
We will prove the lemma by induction. For the case where card(I 0 ) = 1, observe that S L (I 0 ) = S(I 0 ), implying that If J(I 0 ) = J(p L ) < , then by Rule A in the algorithm, J 0 = −∞ and J 0 < 0 = . On the other hand, if J(p L ) ≥ , then as p L ∈ argmax p∈S(I 0 ) J(p), we have p L ∈ S(I 0 ). Thus, by Rule B, we have p 0 = p L and J 0 = 0 ≥ . Therefore, the lemma is true for card(I 0 ) = 1.
Finally, from Lemma 1, we can derive the following theorem, an important theoretical result of this study: If card(I 2 ) = 0, the entry h (I 0 ) is moved from I 1 to I 2 . A heuristic process swaps I 1 with I 2 if card(I 1 ) > card(I 2 ) is applied.
, and The smallest entry in I 2 is moved to I 1 if card(I 1 ) = 0, and the largest entry in I 1 is moved to I 2 if card(I 2 ) = 0. A swapping heuristic process similar to that used in Max Split is then applied.

Why Adap split is better
Adap Split is expected to be the smartest heuristic process among the three splitting schemes. To support this claim, we illustrate the process of DCDP on a small toy problem with (n, m, ς) = (8, 12, 1), as shown in Fig. 5. The original problem and the relaxed problem are depicted in Fig. 5a and b, respectively. When running DCDP(N 8 , −∞), where I 0 = N 8 , it was observed that p L,0 := argmax p∈S L (I 0 ) J(p) ∈ S(I 0 ). Thereby the set I 0 is divided into I 1 and I 2 to produce two new branches DCDP(I 1 , −∞) and DCDP(I 2 , 1 ) where 1 will be computed via DCDP(I 1 , −∞).
On the other hand, if Half Split is applied, the set I 0 = N 8 is divided into I 1 = {5, 6, 7, 8} and I 2 = {1, 2, 3, 4} ( Fig. 5e and f). In DCDP(I 1 , −∞), the obtained solution of the relaxed problem is p L,1 := argmax p∈S L (I 1 ) J(p) = p L,0 , which is, again, not in S(I 1 ), leading to further branching along this sub-problem (Fig. 5e). In fact, in our experiments discussed in the 'Results and discussion' section, it was observed that both Half Split and Max Split frequently encounter cases where p L,1 = p L,0 or p L,2 = p L,0 . Whenever p L,1 = p L,0 , additional new branches for the divisions of I 1 are produced because p L,1 = p L,0 ∈ S(I 0 ), leading to p L,1 ∈ S(I 1 ) ⊂ S(I 0 ). Similarly, new branches for the divisions of I 2 are also generated when p L,2 = p L,0 .
In brief, the Adap Split performs better because p L,1 = p L,0 or p L,2 = p L,0 is less likely to happen in this scheme, thus resulting in less branches.

Results and discussion
In this section, the detection performance is demonstrated by showing the experimental comparisons between S-HOG and R-HOG [10,11].
As described in the previous section, our method has three stages: pre-screening, segmentation, and classification. Each stage uses its own SVM trained with a hyper-parameter C. In the classification stage, a threshold θ is used to classify an example; if the SVM score is over the threshold θ, the example is predicted as positive, otherwise, negative. For the pre-screening and classification stages, Set A was used for training SVM, and Set B was used for determining the optimal combination of (C, θ). Sets C, D, and E were used for performance evaluation. SVM for the segmentation stage provides us with the boundary likelihood function. The regularization parameter C for the SVM is determined via the holdout method within Set A. Seventy percent of the glomeruli in Set A are randomly selected for training, and the rest are used for validation. The resulting parameter values were (C, θ) = (10, 2) for pre-screening, C = 10 for segmentation, and (C, θ) = (10, −1.5) for classification. Figure 6 illustrates examples of detected glomeruli. In the two images, the candidate glomeruli that passed through pre-screening are depicted with rings that represent the boundaries estimated in the segmentation stage. The numbers printed above the rings are the scores produced by SVM in the classification stage. Candidate glomeruli with SVM scores below θ = −1.5 are excluded from the final detection results. The excluded candidates are depicted with blue rings, and the remaining glomeruli with red rings. It can be observed that non-glomerulus areas are excluded effectively and that true glomeruli are estimated correctly.

Detection performance
For quantitative assessment of detection performance, true positives, false positives, and false negatives have to be defined. True positive glomeruli (TPG) are identified as correctly detected glomeruli, false positive glomeruli (FPG) are wrongly detected glomeruli, and false negative glomeruli (FNG) are the ones that could not be detected.
From the definitions of TPG, FPG, and FNG, we can compute for the three widely used performance measures: F-measure, precision, and recall. Precision is the ratio of TPG to the detected glomeruli (i.e. TPG/(TPG + FPG)), recall is the ratio of TPG to the true glomeruli (i.e. TPG/(TPG + FNG)), and F-measure is the harmonic mean of the Precision and Recall. Figure 7 shows the plots of the F-measure, Precision, and Recall for each testing image. S-HOG achieved an average of 0.866, 0.874, and 0.897 for F-measure, Precision, and Recall, respectively, whereas R-HOG obtained 0.838, 0.777, and 0.911, respectively. While applying detection methods to pathological evaluation, Precision is more important than Recall [11], and in this study, S-HOG achieved considerably higher Precision with a small sacrifice in Recall. A two-sample t-test was performed to assess the statistical differences. While no statistical difference of Recall can be detected (P-value = 3.47 · 10 −1 ), the differences among F-measure and Precision are significant (P-values = 1.34 · 10 −3 and 3.75 · 10 −5 , respectively).

Segmentation performance
Herein, we discuss the performance of the segmentation algorithm. While the main purpose of the proposed method is detection, the proposed DCDP algorithm used for obtaining estimated segmentations may also be applied in some way in studies needing subsequent pathological evaluation [11]. To quantify the accuracy of the estimated areas within the predicted boundaries, 993 annotated glomeruli in Set B were used. True positive area (TPA), false positive area (FPA), and false negative area (FNA) were defined as follows: TPA is the intersection of the true area and estimated area; FPA is the relative complement of the true area in the estimated area; and  FNA is the relative complement of the estimated area in the true area. For each glomerulus and its estimated area, F-measure, Precision, and Recall can be obtained by counting the pixels in the TPA, FPA, and FNA. The histograms of the F-measure, Precision, and Recall are plotted in Fig. 8, where the frequency is normalized so that the integral is one. Among the glomeruli, 90.1 % are estimated to have F-measures more than 0.8, ensuring reliable assessment of the medicinal effect for drug development.
The computational time of the new segmentation algorithm, DCDP, is compared with that of EDP. The two algorithms solve the same optimization problem, and both algorithms always find the same optimal solution. DCDP and EDP are implemented in C++ language, and the runtimes are measured on a Linux machine with Intel(R) Core(TM) i7 CPU and 8-GB memory. First, the number of times when the O(nmς) DP routine was invoked, which we denote by n dp , is counted using the annotated glomeruli in Set B. Figure 9a shows the box-plot of n dp for all methods. While the value of n dp for EDP is always n, the values for DCDP depend on the input images and the splitting schemes, Half Split, Max Split, and Adap Split. For 46.32 % of glomeruli, the optimal solutions are found within the first DP routine (i.e. n dp = 1). The medians of the n dp 's when using Half Split, Max Split, and Adap Split are 5, 3, and 3, respectively. In other words, the medians of the depth of the branching tree for each scheme are 3, 2, and 2, respectively, and the respective 75th percentiles of n dp 's are 11, 7, and 5. For Adap Split, there is no case where n dp is larger than n, whereas the number of glomeruli with n dp > n are 4 (0.40 %) and 16 (1.61 %) for Half Split and Max Split, respectively. This implies that Adap Split is the best heuristic process among the three splitting schemes. As considered in 'Methods' section, Adap Split produces the same solution p L in the branches less frequently when compared to the other schemes. In Half Split and Max Split, the frequency (# of glomeruli) of cases where the solution p L in the top branch appears again in the second branches is 414 and 314, respectively. These numbers are much larger than the frequency obtained by Adap Split, which is only 97. This explains why Adap Split is faster. The actual runtime of each method is depicted in Fig. 9c, where the medians of the computation times are 0.0866, 0.0570, 0.0560, and 0.418 msec, and the 75th percentiles of the computational times are 0.171, 0.117, 0.0856, and 0.426 msec for Half Split, Max Split, Adap Split, and EDP, respectively. As these values are proportional to the n dp 's, the ratios among the runtimes are almost the same as the ratios among the n dp 's. These results conclude that the proposed algorithm DCDP achieves an exact optimal solution much more efficiently than the existing algorithm EDP while solving the same problem, and that the Adap Scheme is the fastest splitting scheme.
The polygon model employed in this study can be reformulated as an MRF where neighboring vertices have an interaction. This perspective allows us to solve problems (1) by means of algorithms for finding a MAP estimation of MRF models. For comparison with DCDP, we examined the MPLP method [26], which is a state-ofthe-art algorithm for MAP estimation of MRF. MPLP is a block coordinate-descent algorithm for minimizing the dual objective of LP relaxation. For our segmentation problem (1), the dual objective is given by where λ := λ(p i , i, i) ∈ R | p i ∈ N n , i ∈ N m ∪ λ(p i+1 , i, i + 1) ∈ R | p i ∈ N n , i ∈ N m is a set of dual variables, and we used p m+1 as the alias of p 1 for simplicity of notation. Both λ(p 1 , 0, 1) and λ(p m+1 , m, m + 1) are aliases for λ(p 1 , m, 1). Each variable block in the block coordinate descent is an edge. Hence, our segmentation problem (1) has m variable blocks, and the update rule for i-th edge is given by (L i+1 (p i+1 ) + λ(p i+1 , i, i + 1)) , and λ(p i+1 , i, i + 1) := − 1 2 (L i+1 (p i+1 ) + λ(p i+1 , i + 1, i + 1)) The time complexity of one iterate is O(nmς), which is equal to that of Algorithm 2, a dynamic program for solving each sub-problem used in DCDP. Typically, larger variable blocks reach the convergence faster in the block coordinate-descent algorithm. It can be seen easily from the update rule of i-th edge that the dual variables of (m/2) odd-numbered edges are updated simultaneously and those of (m/2) even-numbered edges are updated simultaneously. Then, the number of blocks is reduced to two. This algorithm is referred to as MPLP+. We actually implemented both MPLP and MPLP+ in C++ language and applied it to each of the 993 glomeruli images. MPLP and MPLP+ successfully obtained the optimal solutions for all the images, although MPLP and MPLP+ are not guaranteed theoretically to achieve the optimal solution. The number of iterates and the computational times are depicted in Fig. 9b and d, respectively. Although MPLP+ has larger variable blocks than MPLP, it did not significantly improve convergence. Furthermore, it turned out that both methods are too slow to be compared with DCDP.

Conclusions
In this study, a new descriptor, Segmental HOG, was proposed for specific glomeruli detection in microscopy images. The descriptor was based on the boundary of the glomeruli to acquire robustness for variations in intensities, sizes, and shapes. A new segmentation algorithm, DCDP, was developed to locate the boundary of possible glomeruli. Empirical results show significant improvement compared to the state-of-the-art descriptor, Rectangular HOG, for the task of glomerulus detection in microscopy images. Moreover, experimental results reveal that DCDP is much faster than the existing segmentation algorithm EDP.
Several possible uses of the proposed method can be considered. For instance, an appropriate size of the sliding window should be chosen if the proposed method is applied to microscopic images with different resolutions. In addition, while the boundary likeliness function is the same for any direction in the segmentation algorithm, different boundary likeliness functions can be used for detecting other organs depending on the orientation. For the block division of the S-HOG descriptor, 24 blocks were used in this study, as depicted in Fig. 4c, but a different number of blocks with a different division can be used for other applications. Future studies include exploring such extensions in other applications of Segmental HOG.