Dataset and computing environments
We used two datasets for testing. The first is provided by Wang et al. [11], and it consists of genetic labeling chandelier cells imaged by the fMOST system with a resolution of 0.2 × 0.2 × 1 μm3. The second is provided by Zhang et al. [23] and Jiang et al. [24], and it consists of pyramidal neurons at the same resolution. Two mice were used, one for the chandelier cells and the other for the pyramidal neurons. The mice were anesthetized deeply using a 1% solution of sodium pentobarbital and sacrificed by injection of ketamine. The semiautomatic neuron tracing used in the quantitative evaluation was performed with Amira software (version 6.1.1; FEI, Mérignac Cedex, France); the code of this method was written in MATLAB (R2019a, MathWorks) and run on a graphics workstation (Xeon CPU E5–2687 v3).
Workflow of the proposed method
Figure 1 depicts the computational workflow, including its three stages. The skeleton optimization method is the second stage, and the first and third stages are the pre- and postprocessing of the neuronal morphology, respectively.
The first phase is to extract a branch from the digital skeleton of the tree structure (Steps 1–3 in Fig. 1). Specifically, we linearly interpolate the digital skeleton and select the branch to be adjusted. The second stage involves fine-tuning the points derived from the digital skeleton to the centerline of the neuronal fibers from the images (Steps 4–5 in Fig. 1). Specifically, we initially adjust the digital skeleton points that are far from the centerline of the neuronal fibers based on the image grayscale. Subsequently, provided that the neuron fiber shape is smooth and continuous, we describe this physical property with two shape restrictions. One restriction is designed based on the image grayscale: the difference in the signal brightness between the local voxels around the neuronal fiber centerline must be small. The other is designed based on geometry: the angle formed from three continuous points on the neuronal fiber centerline must not be extreme. These two restrictions are designed to finely adjust the digital skeleton points to the neuronal fiber centerline. Subsequently, we join the branches into a tree structure according to the connected relationships, which comprises the third stage (Step 6 in Fig. 1). Finally, the digital skeleton yields a digital neuron morphological file.
With regard to the above three stages, the second stage is a looping structure. See the flowchart in Fig. 1 for details.
Skeleton point adjustment based on grayscale
According to the grayscale distribution of the voxels around the neuronal fibers, a preliminary adjustment of the digital skeleton points that are far away from the neuronal fiber centerline shown in the image is performed. We take advantage of the idea of using the mean shift to find the location of this centerline point. Because of anisotropy in the image stacks, we linearly interpolate the three-dimensional images in the Z direction to obtain images with the same resolution in the X, Y and Z directions. For a skeleton point p0(x0, y0, z0), take an adjustment in the Z direction as an example. We keep the X and Y coordinates unchanged and calculate the convolution of the voxels in the local range centered on p0 and the Gaussian kernel function in the Z direction. The point that has the largest convolution value is regarded as the centerline point p1 of the neuronal fibers from the images.
$$ {p}_1=\arg \underset{p}{\max } Gau\otimes G\left({p}_0\right) $$
(1)
Similarly, the X and Y coordinates of the skeleton point p1 are adjusted to the centerline point to finally obtain the adjusted point p2.
Skeleton point adjustment based on shape restrictions
To describe the tortuous neuronal fiber morphology with more accurate digital skeletons, it is necessary to fine-tune the skeleton points after the preliminary adjustment. The neuronal fibers are smooth and continuous, and this neuronal fiber property is shown in the images. Considering this property, for the point p2, we designed an angle restriction and brightness restriction to judge whether p2 needs further fine adjustment. Specifically, for p2, we determined its parent point pf, the parent point of pf, pff, and the child point pc. The point p2 forms an angle ∠pfp2pc with pf and pc. Because the neuron fibers are smooth, the angle ∠pfp2pc tends to be close to 180°. We set the critical value of the angle αstd as an obtuse angle (135° is chosen in this dataset). If ∠pfp2pc < αstd, p2 is considered to require further adjustment. For the brightness restriction, we compared the signal strength \( {G}_{p_2} \) at point p2 with the signal strength threshold Gstd, thus designing Gstd as follows:
$$ {G}_{std}=\max \left(\left({G}_{p_f}+{G}_{p_2}+{G}_{p_c}\right)/3,\kern0.5em {{G_B}_{ranch}}_{\_ mean}+{G}_{Branch\_ stdev}\right) $$
(2)
Here, \( {G}_{p_f} \) is the signal strength of pf, taking the grayscale value of pf in the same way as for \( {G}_{p_2} \) and \( {G}_{p_c} \). GBranch _ mean and GBranch _ stdev are the mean and standard deviation of the signal strength values of all points on the current branch. If \( {G}_{p_2}>{G}_{std} \), the signal strength of p2 is in accordance with that of the current branch. However, if \( {G}_{p_2}<{G}_{std} \), p2 is identified as a point that requires further adjustment. If its signal strength is small, this is contrary to the fact that the signal strength at the center of the neuronal fiber in images tends to be relatively strong.
According to these restrictions, we set out to find the centerline points of the neuronal fiber from the images. Prior to adjusting p2, because p2 itself may be a biased point, we have to judge whether we need to find a new starting point other than p2 for subsequent computation. Here, if ∠pfp2pc > αstd and \( {G}_{p_2}<{G}_{std} \), then there is only a problem with the brightness of point p2. In this situation, the starting point p2 _ new will be equal to p2. If ∠pfp2pc < αstd, this indicates that there is a serious deviation in the angle pertaining to p2, and such a point is likely to have a problem with both brightness and angle. Therefore, we have to find a new starting point p2 _ new. The assumption here is that \( \left\Vert \overrightarrow{p_2{p}_f}\right\Vert <\left\Vert \overrightarrow{p_2{p}_c}\right\Vert \). In the \( \overrightarrow{p_2{p}_c} \) direction, with p2 as the starting point, we draw \( \left\Vert \overrightarrow{p_2{p}_{rc}}\right\Vert =\left\Vert \overrightarrow{p_2{p}_f}\right\Vert \) and obtain a point prc on \( \overrightarrow{p_2{p}_c} \). The intersection of the vertical bisector of \( \overrightarrow{p_f{p}_{rc}} \) and \( \overrightarrow{p_f{p}_{rc}} \) is defined as the point pmid. In the \( \overrightarrow{p_f{p}_{ff}} \) direction, we draw \( \left\Vert \overrightarrow{p_f{p}_s}\right\Vert =\left\Vert \overrightarrow{p_2{p}_f}\right\Vert \) and obtain a point ps on the extension line of \( \overrightarrow{p_f{p}_{ff}} \) (the topological graph is shown in Figure S1 A of the supplementary materials). Then, p2 _ new is located as:
$$ {p}_{2\_ new}=\left({G}_{p_{mid}}\cdot {p}_{mid}+{G}_{p_2}\cdot {p}_2+{G}_{p_s}\cdot {p}_s\right)/\left({G}_{p_{mid}}+{G}_{p_2}+{G}_{p_s}\right) $$
(3)
Here, \( {G}_{p_{mid}} \) represents the signal strength of pmid, taking the grayscale value of pmid in the same way as for \( {G}_{p_2} \) and \( {G}_{p_s} \). With the starting point p2 _ new, we use the idea of the mean drift to search for the centerline points of the neuronal fiber from the images. The drift direction \( {\overset{\rightharpoonup }{\upsilon}}_2 \) in a spherical region Sp centered on p2 _ new is calculated as follows:
$$ {\overset{\rightharpoonup }{\upsilon}}_2=\sum \limits_{p_i\in {S}_p}{a}_i\cdot {b}_i\cdot \overrightarrow{p_i{p}_{2\_ new}}/\left\Vert \overrightarrow{p_i{p}_{2\_ new}}\right\Vert $$
(4)
where pi is the ith voxel in the spherical region Sp. \( {a}_{p_i} \) is the angle factor of pi, \( {a}_{{\mathrm{p}}_i}=\sin {\theta_{p_i}}^2 \), where \( {\theta}_{p_i} \) is the angle between the vectors \( \overrightarrow{p_f{p}_{rc}} \) and \( \overrightarrow{p_{2\_ new}{p}_i} \), and \( {b}_{p_i} \) is the signal strength factor of pi, \( {b}_{p_i}=\exp \left(-{\left({G}_{p_i}-\mu \right)}^2/2{\sigma}^2\right)-\exp \left(-{\left({G}_{p_{2\_ new}}-\mu \right)}^2/2{\sigma}^2\right) \), where , and \( \sigma =\left|\mu -{G}_{p_{2\_ new}}\right| \).
In the \( {\overset{\rightharpoonup }{\upsilon}}_2 \) direction, we search for the point
$$ {p}_3=\arg \max \left({\omega_1}^{\left\Vert \overrightarrow{p_i{p}_{2\_ new}}\right\Vert }{G}_{std}\sqrt{G_{p_i}/{G}_{std}}/{G}_{p_{2\_ new}}-{\omega}_2\cos \angle {p}_f{p}_{2\_ new}{p}_c\cos \angle {p}_f{p}_i{p}_c\right) $$
(5)
\( {G}_{p_i} \) is the signal strength at voxel pi, taking the grayscale value of pi. \( {G}_{p_{2\_ new}} \) is the signal strength of p2 _ new. This means that ω1 and ω2 are the parameters that can be adjusted according to the image quality. When the parameter ω1 is set to a larger value, this means that we regard the grayscale value as a large influence on the points. However, when ω2 is set to a larger value, this means that we regard the angle deviation of the points as having a greater influence.
Quantitative evaluation on the synthetic dataset
Since the goals of most automatic neuron tracing methods are different from ours, the semimanually traced neuron skeleton, which is regarded as the ground truth in most studies, is one of our targets that needs to be optimized. Therefore, we have to find a ground truth, rather than the semimanually traced results, to evaluate the performance of our methods. Here, we design a three-dimensional synthetic neuron fiber image to evaluate our methods (Fig. 2). First, we choose a helix curve (t∈(π,3π], x = 30 × sin(t) + 60, y = 30 × cos(t) + 60, z = 10 × t) as the centerline of the neuron fiber, which is the ground truth of the traced neuron skeleton. Then, in three-dimensional images, we use continuous voxels to represent this helix curve. Next, we choose the Gaussian function as the point spread function (PSF) of the imaging system and convolve it with the helix curve images. The result simulates the neuron fiber images. Based on these images, we semimanually trace the neuron fibers. Following a previous procedure [8, 11, 19], the semimanually traced neuron skeleton is completed interactively in Amira visualization and data analysis software. The skeleton is checked back-to-back by 3 people. The traced skeleton is regarded as the optimization target. Finally, we use our method to optimize the skeleton.
The noise of nearby neuron fibers and other fibers can affect the quality of the automatically traced neuron skeleton. To evaluate the performance of our method in such cases, we add Gaussian noise with different signal-to-noise ratio levels (Fig. 2c and d). For the case of nearby neurons, we set a straight line as representing another neuron whose distance from the helix curve changes (Fig. 2e and f).
In the research on automatic neuron tracing methods, there are various parameters used to evaluate the traced results from different perspectives [22, 25, 26]. Among these parameters, the significant spatial distance (SSD) is a parameter that can intuitively evaluate the distance between two neuron skeletons, and it is suitable for us to use to quantitatively evaluate the difference between the traced neuron skeleton and the ground truth. The SSD is defined as the average distance of the reciprocal minimal spatial distances between two neuron skeleton nodes. A larger SSD means that these two neuron skeletons are more different from each other. In Peng’s work [22], if the distance between two nodes is less than two pixels, it is difficult to determine the difference visually. As a result, when calculating the SSD, the minimum reciprocal distance is set to 2 pixels. However, in our experiment, the maximum reciprocal distance between the unoptimized skeleton and the ground truth is less than 2 voxels. Therefore, we set the minimum reciprocal distance to 1 voxel. In addition, we also provide the SSDs that the minimum reciprocal distances are set to 0 and 0.5 voxels.
Process for a whole-brain three-dimensional image dataset
We used image blocks from TDat [27] and converted the three-dimensional image dataset, which is 32,266 × 54,600 × 10,730 voxels (more than 30 TB) with a depth of 16 bit/voxel, to the TDat data format. TDat divides a three-dimensional image dataset into many small three-dimensional image blocks. We then split the reconstructed digital skeleton into many fragments according to the range of each TDat image block, and the skeleton fragments in each block were calculated in turn. Finally, the fragments were connected to become a complete digital skeleton. This process was slightly different from the workflow for the local neuronal fibers. In summary, for the locally distributed neuronal fibers, we split the digital skeletons according to the branch structure. However, for the long-projection neuronal fibers, we split the digital skeletons according to the range of the blocks.
Neuronal fiber radius estimation
In a real dataset, it is difficult to obtain the centerline of the neuron fiber images. Thus, we do not have a way to directly evaluate the difference between the traced neuron skeletons and the ground truth. However, according to the traced neuron skeleton, we can calculate the neuron morphology parameters to evaluate the differences in the sizes. Here, we choose the neuron fiber length and neuron fiber radius to evaluate the performance of our method.
The estimation process for the neuron fiber radius is as follows (see Fig. 5): Based on the topology of the digital skeleton, we calculate a cross-sectional plane of a skeleton point. Thereafter, with a series of image processing operations, this plane is used to segment the area of the neuronal fiber tubular structure from the cross-sectional image, and we estimate the neuronal fiber radius. Specifically, the skeleton point p3 and its parent point pf and child point pc can be approximately considered to be in the same circle with a very large radius. Moreover, the normal vector \( {\overset{\rightharpoonup }{\upsilon}}_3 \) of this circle at p3 can be regarded as parallel to \( \overrightarrow{p_f{p}_{rc}} \) (the topological graph is shown in the supplementary materials, Figure S1 C). Subsequently, the cross-sectional plane at p3 is defined by the normal vector \( {\overset{\rightharpoonup }{\upsilon}}_3 \), named Area3. With regard to Area3, we segment the neuronal fiber tubular structure with a series of image processing operations, including linear grayscale transformations, adaptive thresholding [28], open operations, and connected component operations on the cross-sectional image. Finally, we obtain the inner part of the cross-section of the neuronal fiber structure at p3, named Areat. Here, we estimate this segmented result with a circle and estimate the radius according to the formula of a circle \( r=\sqrt{S/\pi } \), where S = Areat.
Statistical analysis
The statistical significance analysis was performed using MATLAB, and statistical comparisons were performed using a two-tailed t-test. All measurements are listed in the form mean ± std. The confidence level was set to 0.05 (*p < 0.05, **p < 0.01, ***p < 0.0001).