Scalable robust graph and feature extraction for arbitrary vessel networks in large volumetric datasets

Background: Recent advances in 3D imaging technologies provide novel insights to researchers and reveal finer detail of the examined specimens, especially in the biomedical domain, but also impose huge challenges regarding scalability for automated analysis algorithms due to rapidly increasing dataset sizes. In particular, existing research on automated vessel network analysis does not always consider the memory requirements of the proposed algorithms and often generates a large number of spurious branches for structures consisting of many voxels. Additionally, these algorithms very often have further restrictions, such as a limitation to tree topologies or a reliance on the properties of specific image modalities. Results: We propose a scalable iterative pipeline (in terms of computational cost, required main memory and robustness) that extracts an annotated abstract graph representation from the foreground segmentation of vessel networks of arbitrary topology and vessel shape. The novel iterative refinement process is controlled by a single, dimensionless, a-priori determinable parameter. Conclusions: Using the proposed pipeline, we are able to analyze the topology of volumes of roughly 1 TB on commodity hardware for the first time. We demonstrate improved robustness in terms of surface noise, vessel shape deviation and anisotropic resolution compared to the state of the art. An implementation of the presented pipeline is publicly available in version 5.1 of the volume rendering and processing engine Voreen.


I. INTRODUCTION
The study of vascular networks using modern 3D imaging techniques has become an increasingly popular topic of interest in biomedical research [1], [2], [3], [4], [5], [6], as 2D slice analysis is limited to a small subset of the available data and cannot capture the 3D structure of vessel networks [7]. At the same time, analysis of 3D datasets by visual inspection is error prone [8]. Thus, for quantifiable results or novel insights into data properties invisible to the human observer, automatic image processing techniques are required. In light of different imaging techniques, modalities and problem domains, the generation of a voxel-wise foreground segmentation can be seen as a sensible intermediate step for the automatic processing of vascular network images (Figure 1). However, the next step of calculating topological, morphological or geometric features, a key requirement for the application in biomedical research and beyond, has not received sufficient attention from the research community [9], which is mostly focused on the segmentation domain [10], [11], [12].
In recent years, improvements in imaging technology and procedures have provided images of increasing resolution (in terms of the number of voxels per unit volume) and thus offer new opportunities for finer grained analysis, but also pose new challenges for image analysis algorithms. The analysis of fine structures down to the level of capillary networks promises great insights. At the same time, it is desirable to analyze large networks at once in order to obtain as much topological information as possible and to avoid artifacts at the boundary of the volume. Modern microscopy hardware generates a single volumetric image file of hundreds of GBs or even TBs (the same order of magnitude as hard drive capacity) per sample, which existing vessel analysis approaches are not able to process. Due to rapid advances in imaging technology and data sizes, this problem extends from commodity hardware to specialized workstations. Consequently, for a vessel network analysis pipeline to be useful now and in the future, it has to be scalable both in terms of memory and runtime, and it should be invariant with regard to increases in image resolution. For application in a wide range of biomedical domains, a pipeline should further not rely on specific imaging conditions, dataset-dependent parameters, a tubular vessel shape, isotropic image resolution or a specific network topology.
In this paper we present a pipeline engineered to fulfill these requirements while extracting the topology, centerlines and edge-associated features from binary volumetric images. This is achieved by our main contribution, a novel iterative refinement approach, and by careful design of all pipeline stages.
In the remainder of this paper we will first discuss related work and then describe the proposed pipeline conceptually. Next, we will elaborate on the aforementioned requirements before describing the individual pipeline stages in more detail, with special attention to these requirements. Finally, we will quantitatively evaluate different aspects of the proposed pipeline with regard to the requirements and conclude our work.

Fig. 1: Typical pipeline for the processing of 3D images of vascular networks. While the research community has mostly focused on the foreground segmentation task (e.g., [11]), this paper presents a widely applicable analysis based on (potentially very large) binary volumes as input.

II. RELATED WORK

Chen et al. [13] use the neighborhood-based voxel classification definitions of [14] for the analysis of hepatic vasculature. They also discuss alternative voxel-based skeleton extraction methods, but note that they are not suitable: distance transformation-based methods [15] do not guarantee connectedness of the extracted skeleton, while Voronoi-skeleton methods [16] are comparatively time consuming. Furthermore, Chen et al. [13] perform extensive analysis to denote a single voxel of the skeleton as a branch point. They extract a graph by following skeleton voxels in a tree search, breaking up cycles in the graph. They do not perform any operations to remove erroneous branches, likely because the effect of noise is negligible for the processed low resolution volumes.
Drechsler and Laura [17] extend [13] by decomposing the generated skeleton into segments, computing a number of properties for each segment and generating a labeled graph structure from the segments. The authors observe that their algorithm is very sensitive to noise in the foreground segmentation, but do not propose any means to reduce this effect.
In [18], Chen et al. present an alternative thinning-based skeletonization method for the analysis of hepatic vasculature as part of the previously published pipeline [13]. They use the voxel classification methods of [14], but present a different algorithm. As a preprocessing step, they suggest hole-filling techniques and morphological operations to remove cavities and irregularities on the surface of the object.
Chothani et al. [19] propose a pipeline for the tracing of neurites from light microscopy image stacks. It comprises foreground segmentation, skeletonization using a voxel coding algorithm [20], tree construction using the scalar field of the skeletonization step, and refinement including pruning based on length as well as separation of branches that appear to be connected due to limited image resolution. The separation of branches in 3D should not be necessary when using higher resolution images. The authors note that in this case, due to the increased volume size, advances in processing capabilities are required.
Almasi et al. [21] present a method for extracting graphs of microvascular networks from fluorescence microscopy image stacks. In contrast to other methods operating on binary images, they use imaging-method-specific information to improve the detection of vascular branches with a disturbed image signal.
Cheng et al. [22] analyze connected 3D structures such as vessel systems and metal foams in binary volumes. They use topological thinning [23] and merge the resulting branching points using an algorithm based on regional maximal spheres and maximal inscribed spheres with regard to the object surface. While their algorithm is linear in complexity, it requires hours of computation time even for volumes of 10^9 voxels (a gigabyte, assuming one byte per voxel).
In [24], the authors use a non-topology-preserving distance field guided voxel thinning algorithm. They construct a graph from an intermediate representation based on voxel neighborhoods, but do not describe an efficient implementation. The thinning step creates holes in irregularly shaped vessels, which is used to detect arteriovenous malformations, but makes it unsuitable for irregularly shaped (e.g., lymphatic) vessels.

Fig. 2: A schematic overview of the proposed pipeline.
Rayburst sampling [25] extracts precise measurements of the radius, volume and surface of spatially complex histopathologic structures (given a centerline representation of a vessel). The method operates on grey value images and detects the surface by sampling along rays originating from the measurement point. For each point, a large but configurable number of rays is emitted to estimate the local morphology.
Finally, as many vessel network analysis pipelines include a skeletonization step, the work of Bates et al. [26] is worthy of note. The authors demonstrate binary vessel segmentation using convolutional recurrent neural networks, including a variant for centerline extraction. However, compared to traditional skeletonization algorithms, the complex decision process of ANNs has disadvantages: no hard guarantees on the connectedness or the thickness of skeletal branches are available (without further processing). Additionally, this approach may have problems with high resolution datasets where the vessels are wider than the field of view of the network, making it impossible to determine the centerline position.
All of the above works either do not mention the processing of very large data samples (and thus likely do not support it) or explicitly mention that large datasets pose a problem for their method. In this work we fill this gap with a widely applicable pipeline that is engineered to handle very large input volumes and is presented in the next section.

III. PIPELINE
The proposed pipeline comprises four stages which are evaluated iteratively (see Figure 2). In the first stage (Skeletonization), the foreground of the binary input volume is reduced to a voxel skeleton using a topological thinning-based algorithm, similar to [17]. We use a modified version of [14] which can be implemented efficiently both in terms of computational complexity and disk access to the out-of-core dataset. Next, we build a graph representation from the voxel skeleton. In contrast to [17], this Topology Extraction stage is implemented as a single sweep over the volume, engineered for memory locality and thus fewer disk accesses. In the Feature Annotation stage, we compute a set of morphological and geometrical properties for all edges of the graph. This stage is subdivided into two steps. First, the Voxel-Branch-Assignment determines for each foreground voxel which branch it is associated with. The Feature Extraction step uses this mapping to efficiently calculate (geometrical [17] and additional morphological) features in a single sweep over the volume. Then, in the Refinement stage, the graph is pruned based on the previously calculated features, using the scale-invariant, dimensionless property bulge size. Since pruning invalidates the Voxel-Branch-Assignment, the first three stages of the pipeline have to be reevaluated. As shown in the evaluation, this step is essential for application on high resolution images. This Extraction-Refinement cycle is repeated until a fixed point is reached. We want to stress that the novel iterative refinement approach is essential to obtain meaningful results from very large datasets, as demonstrated in Section VI.
IV. REQUIREMENTS FOR BROAD BIOMEDICAL APPLICATION

As discussed before, the proposed pipeline, in contrast to previous work, is engineered to fulfill a set of requirements that are essential for a wide range of biomedical applications. As a basis for the discussion in the remainder of this paper, in this section we structure, motivate and expand on the requirements briefly mentioned in the introduction.
Primary Requirements: The method should be scalable with regard to runtime (P1). As all voxels of a volume can be expected to be taken into account by the method, the runtime can be expected to be at least linear in the number of voxels n. In order to be applicable to increasingly large datasets, a runtime of O(n^2) would be unacceptable, but quasilinear algorithms (e.g., in O(n log n)) can still be executed. Beyond computational complexity, special care has to be taken, for example in terms of memory access locality, to avoid an increase in computation time by a large constant factor.
The method should be scalable with regard to memory requirements (P2). We can expect the input datasets to fit onto (commodity) non-volatile storage (disk), but not necessarily into main memory. While the price per capacity of both disks and main memory falls exponentially, the rate of (exponential) decrease is higher for disks than for memory [27], [28]. Consequently, the ratio of average disk size to average RAM size grows over time. The exact relationship is not easy to establish as it changes over time [28]. Here we assume that disk and volatile memory size roughly grow in a relationship of m to m^(2/3) for a disk size of m. Therefore, the required main memory of the method should not exceed O(m^(2/3)).

The method should exhibit invariance with regard to image resolution (P3). For a fixed size specimen, an increase in image resolution results in a dramatic increase in surface noise related artifacts for simple topological thinning-based methods, which is unacceptable for the analysis of large datasets.
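To make the assumed m-to-m^(2/3) disk-to-RAM relationship concrete, the following sketch computes the implied main-memory budget for two illustrative dataset sizes (the concrete numbers are illustrative, not measurements from this work). Note that a slice of a cubic m-byte volume also contains on the order of m^(2/3) voxels, which is why slice-wise processing fits this budget.

```python
def ram_budget_bytes(disk_bytes: int) -> float:
    """Main-memory budget implied by the assumed m -> m^(2/3)
    relationship between disk size and volatile memory size."""
    return disk_bytes ** (2 / 3)

# Illustrative values: a 1 GB volume implies a working set of roughly
# 1 MB, a 1 TB volume one of roughly 100 MB; both fit commodity RAM.
for label, m in [("1 GB", 10**9), ("1 TB", 10**12)]:
    print(f"{label} volume -> budget of {ram_budget_bytes(m) / 1e6:.0f} MB")
```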
Secondary Requirements: From the desire to apply the method in practice we derive a set of further requirements that are not directly related to scalability: The method should be unbiased (S1), i.e., not dependent on a set of parameters that are required to be selected carefully and correctly depending on the input image.
The method should be robust in terms of deviations from the tube shape (S2) of the analyzed network structure. While blood vessels are tubular, lymphatic vasculature, for example, is highly irregular but should still be processed correctly.
The method should handle volumes of anisotropic resolution (S3). Volumetric imaging techniques often create datasets with voxel spacing that differs between coordinate axes. As an example, in light sheet microscopy, datasets are constructed from a series of (2D) slices for which the spacing is independent of the x-y-resolution. Methods operating on the voxel grid of a volume have to take this into account.

Fig. 3: A schematic overview of the data flow in the proposed pipeline. Dashed lines signify an optional dependency on the data for the particular stage. For skeletonization, topological information is not available in the first iteration of the pipeline.
The method should be able to fully analyze networks of arbitrary topology (S4). Existing methods often assume a tree structure and either fail to operate on other topologies or drop information [13]. However, for lymphatic vessels and capillaries, or even larger structures (e.g., the cerebral arterial circle), the assumption of a tree topology is invalid.
The method should be able to analyze images independently of the imaging conditions (S5). Concrete imaging conditions (i.e., the distribution of gray values or fluorescent staining techniques in microscopy imaging) vary between domains. In order to be widely applicable, the method should not be dependent on concrete models of imaging conditions. Our method achieves this by operating solely on binary input data, leveraging the broad range of other research on vessel image segmentation [10], [12].

V. METHOD DETAILS
In this section, we present the algorithmic details of the proposed pipeline with special attention to the requirements for broad biomedical application. Figure 3 provides an overview of the data flow between the individual stages of the pipeline.

A. Skeletonization
Like other vessel network analysis pipelines [13], [17], we use a modified version of the established topological thinning algorithm by Lee et al. [14], which has the advantage that the deletability criteria for any voxel can be evaluated by considering only its 26-neighborhood; this is advantageous when operating on very large volumes. Furthermore, we are not aware of other approaches that solve the inherent problems of analyzing large datasets. In the original formulation, the algorithm iteratively removes voxels that lie on the surface and whose deletion does not change the topology of the object, in 6 (static) subiterations, until no more voxels can be removed.
For an efficient implementation, further considerations are necessary. We model the surface of the object explicitly as a sequence of voxel positions, which is initialized by scanning the volume before the first iteration step. In subsequent subiterations, we build the active surface (voxels that can potentially be deleted in the next iteration) by retaining voxels from the previous active surface that are not considered during the current subiteration and by adding all foreground voxels in the 26-neighborhood of a voxel after its deletion. If a voxel was considered, but not deleted, during the current subiteration, it is removed from the active surface even though it is still part of the surface of the object. If one of its neighbors is deleted, it is re-added to the active surface and reconsidered for deletion in the following iteration. This implementation has a runtime of O(n) in the number of voxels in the volume (P1): in each subiteration, only voxels in the active surface are considered. As a voxel is either deleted entirely or removed from the active surface after each iteration (i.e., 6 subiterations) and only added again once one of its 26 neighbors is deleted, it is considered for deletion a constant number of times. In order to fulfill requirement P2, the volume is stored on disk and dynamically mapped to memory using operating system capabilities. Constant runtime improvements could be observed using a compressed (2 bits per voxel) representation and storing the volume as 32x32x32 (8 KB) blocks in linear memory, thus reducing disk access by exploiting the spatial locality of surface voxels. Active surfaces are stored on disk as sorted sequences of (linearized) voxel positions. During a subiteration, the previous active surface is read from front to back. Simultaneously, the new active surface is built slice-by-slice in memory by collecting positions, sorting them and removing duplicates before writing to disk, requiring O(m^(2/3)) main memory (P2). To account for volumes with anisotropic resolution, we keep track of the real-world depth of the voxel layers deleted for each direction and select the direction with the lowest total depth as the direction for the next subiteration. This way, the speed of voxel deletion is equalized on average for highly anisotropic volumes (S3). If the last subiterations of all 6 directions (which may have happened out of order) did not delete any voxels, the active surface is empty and the skeletonization is complete.
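The active-surface bookkeeping described above can be sketched as follows. Note that this is a minimal in-memory sketch: the deletability test below is a crude placeholder (keep endpoints, require a free 6-neighbor) rather than the topology-preserving simple-point criterion of Lee et al. [14], and the directional subiterations, compression and on-disk storage are omitted.

```python
from itertools import product

def neighbors26(p):
    x, y, z = p
    return [(x + dx, y + dy, z + dz)
            for dx, dy, dz in product((-1, 0, 1), repeat=3)
            if (dx, dy, dz) != (0, 0, 0)]

def neighbors6(p):
    x, y, z = p
    return [(x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
            (x, y - 1, z), (x, y, z + 1), (x, y, z - 1)]

def thin(foreground):
    """Active-surface thinning sketch over a set of (x, y, z) voxels."""
    fg = set(foreground)

    def deletable(p):
        # Placeholder criterion: surface voxel that is not an endpoint.
        fg_nb = sum(1 for q in neighbors26(p) if q in fg)
        on_surface = any(q not in fg for q in neighbors6(p))
        return on_surface and fg_nb >= 2

    # Initial active surface: all voxels with a background 6-neighbor.
    active = {p for p in fg if any(q not in fg for q in neighbors6(p))}
    while active:
        next_active = set()
        for p in sorted(active):  # deterministic processing order
            if p in fg and deletable(p):
                fg.remove(p)
                # Re-add foreground 26-neighbors of a deleted voxel.
                next_active |= {q for q in neighbors26(p) if q in fg}
        active = next_active
    return fg

slab = {(x, y, 0) for x in range(5) for y in range(5)}
skeleton = thin(slab)  # a thinned, non-empty subset of the input
```

The loop terminates because a pass either deletes at least one voxel (the foreground shrinks monotonically) or re-adds nothing, leaving the active surface empty.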

B. Topology Extraction
Previous work [13], [24] does not describe an efficient implementation of topology extraction for large input datasets. For example, Chen et al. [13] perform a tree search following skeleton voxels through the volume. This is unsuitable for large volumes as it either requires keeping the complete volume in memory or involves very frequent random accesses to the hard disk with high latency. Additionally, their method breaks loops in the graph and always creates a tree topology.
In contrast, the topology extraction in the proposed pipeline extracts the complete centerline graph in a single pass over the volume using a modified version of the streaming connected component finding algorithm by Isenburg and Shewchuk [29]. Instead of all foreground components, we consider the three skeleton voxel classes (regular (2 neighbors), end (< 2 neighbors), and branch (> 2 neighbors) [13]) separately and extract components for each. Individual connected components of end or branch voxels make up nodes in the graph. Connected components of regular voxels form lines within the volume and have (except for the separately handled case of closed loops) exactly two end points. Unlike [13], we do not force the position of nodes to be defined by a single voxel and instead use the barycenter of the connected region of voxels. Using the two end points of a segment of regular voxels, we can find the nodes that are connected by the path of regular voxels and thus construct an edge. To perform this node-edge-matching process efficiently for large graphs, for each node voxel (i.e., an end or branch voxel) we insert a reference to the original node into a (static, on-disk) k-d tree. Thus, for each run of regular voxels we can efficiently query the positions of node voxels that are 26-neighbors of its tips, and thus link node and edge data structures. The extracted nodes and edges (including centerlines) form the Proto-Vesselgraph; they are stored in files on disk and dynamically mapped to memory. Furthermore, the algorithm of [29] (and this modification) can be shown to only require O(m^(2/3)) main memory (P2). Moreover, we extract the whole graph in a single pass over the volume in O(n log n) time (P1) and do not make any assumptions about the topology of the extracted network (S4).

Fig. 4: If the centerline of a small vessel (blue) is closer to the surface of a larger vessel (yellow) than its own centerline, some voxels of the larger vessel will be assigned to the smaller vessel (a). After the initial Voronoi Mapping, connected foreground components of the same ID are remapped using IDs from seed points of branches (b). Unlabeled regions are identified (c) and flood-filled from labeled voxels (d).
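For illustration, the three voxel classes can be computed locally from the 26-neighborhood of each skeleton voxel. The following is a minimal in-memory sketch; the streaming, out-of-core component extraction is omitted.

```python
def classify(skeleton):
    """Classify skeleton voxels by their number of foreground
    26-neighbors: end (< 2), regular (== 2), branch (> 2), cf. [13]."""
    skel = set(skeleton)

    def degree(p):
        x, y, z = p
        return sum((x + dx, y + dy, z + dz) in skel
                   for dx in (-1, 0, 1)
                   for dy in (-1, 0, 1)
                   for dz in (-1, 0, 1)
                   if (dx, dy, dz) != (0, 0, 0))

    classes = {}
    for p in skel:
        d = degree(p)
        classes[p] = ("end" if d < 2 else
                      "regular" if d == 2 else
                      "branch")
    return classes
```

Connected runs of "regular" voxels then form the edges, while connected components of "end" and "branch" voxels form the nodes.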
Since the individual branch voxels necessarily lie on voxel positions of the original volume, they resemble a ragged centerline. This both artificially increases the line length and produces a larger number of ambiguous cases in the Voxel-Branch-Assignment. We therefore smooth all centerlines of the Proto-Vesselgraph using local Bézier curves.

C. Voxel-Branch Assignment
Drechsler and Laura [17] calculate the volume property of edges by assigning voxels of the foreground segmentation to the nearest centerline point. As illustrated in Figure 4a, this creates incorrect results in some cases. While these errors are potentially not as severe for the calculated volume, morphological properties based on the radius of the vessel can be heavily affected. In order to correctly calculate edge-associated properties for the previously created Proto-Vesselgraph structure, we therefore first create a mapping between the voxels of the foreground segmentation and the edges of the Proto-Vesselgraph. This process is performed in 4 steps that are described below and illustrated in Figure 4.
1) Centerline Voronoi Mapping: As a first approximation of the edge ID map, we find the nearest centerline point for each voxel of the foreground segmentation. We accelerate the search using a static on-disk k-d tree of all centerline points of all edges of the Proto-Vesselgraph, each annotated with the corresponding edge ID. As we iterate over all n foreground voxels and perform an O(log n) search in the k-d tree for each, the total runtime is within O(n log n), as is the construction of the k-d tree (P1). We iterate over the whole volume in 32x32x32 blocks of voxels for spatial locality of lookups, thus reducing the need to randomly reload parts of the k-d tree from disk.
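This first approximation can be sketched with a brute-force nearest-neighbor search; the pipeline replaces the linear scan below with a static on-disk k-d tree to reach O(n log n) total cost.

```python
def voronoi_edge_ids(foreground, centerline_points):
    """Assign each foreground voxel the edge ID of its nearest
    centerline point.

    foreground: iterable of (x, y, z) voxels.
    centerline_points: list of ((x, y, z), edge_id) pairs."""
    def d2(a, b):  # squared Euclidean distance, sufficient for argmin
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return {v: min(centerline_points, key=lambda cp: d2(v, cp[0]))[1]
            for v in foreground}
```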
2) Connected Component Remapping: As illustrated in Figure 4a, matching voxels via minimal Euclidean distance does not yield a proper voxel-wise vessel segment map in all cases. When two vessels with highly dissimilar radii lie close to each other, some voxels of the larger vessel are ascribed to the smaller vessel. These cut-off regions have to be identified and remapped. We perform a modified connected component analysis [29] on the generated edge ID volume in which we consider two voxels mergeable if they have the same ID. Then we map the connected component IDs back to the edge IDs using a table constructed by sampling the result of the connected component analysis at one centerline point of each edge.
3) Cut-off Region Identification: Cut-off regions do not have an associated edge ID and can therefore be identified using another pass of connected component analysis [29] in which only voxels with undefined IDs are considered. In a single pass over the whole volume, we collect the axis-aligned bounding boxes of each cut-off region (see Figure 4c).
4) Cut-off Region Flooding: The previously identified cut-off regions are relabeled by flooding all voxels without edge IDs starting from adjacent voxels with valid edge IDs. For the simple case of a region cut off from a single vessel section, this fills the region with the section's edge ID. In the more complicated case of a cut-off region adjacent to multiple different vessel sections, the resulting labeling approximates the L1-Voronoi-cells spanned from the surfaces of the adjacent regions. A side effect of this procedure is that foreground regions that do not resemble (complete) vessels at the boundary of the volume (and thus do not contain any centerline voxels) are not flooded with valid IDs in this step and are ignored in subsequent steps.
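The flooding step can be sketched as a multi-source breadth-first fill; breadth-first visit order is what yields the approximate L1-Voronoi behavior for regions adjacent to several labeled sections. Shown in 2D for brevity; the subvolume collection and on-disk structures are omitted.

```python
from collections import deque

def n4(p):
    x, y = p  # 2D for brevity; the pipeline operates on 3D volumes
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def flood_cutoff(ids):
    """ids: dict voxel -> edge ID, with None marking cut-off voxels.
    Flood unlabeled voxels breadth-first from labeled neighbors."""
    out = dict(ids)
    frontier = deque(p for p in out if out[p] is not None)
    while frontier:
        p = frontier.popleft()
        for q in n4(p):
            if out.get(q, 0) is None:  # inside volume and unlabeled
                out[q] = out[p]
                frontier.append(q)
    return out
```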
For efficiency, all identified disconnected regions are processed separately as cuboidal subvolumes, which are collected during a single pass over the ID volume. The currently valid bounding boxes of cut-off regions are organized in interval trees. For the z-axis, a single tree references all bounding boxes. For each slice, the active regions can be queried to construct an interval tree for the y-axis. For each line in the slice, an interval tree for the x-axis can be constructed using the y-axis tree. For each voxel in the line, all active regions can be queried from the x-axis tree so that the current voxel value can be written to the corresponding subvolumes. After this collection step, each subvolume is flooded separately. Similar to the skeletonization step, we use the concepts of an active surface, slice-wise processing and memory mapping of files on disk to guarantee linear runtime (P1) and an O(m^(2/3)) memory requirement (P2). The results are written back to the ID volume in a process similar to the collection of the subvolumes. The number of disconnected regions is bounded by n. Consequently, the construction of the interval trees and the querying for each voxel are within O(n log n) (P1). All subvolumes and interval trees can be stored on disk and mapped dynamically to memory and therefore do not require additional main memory (P2).

D. Feature Extraction
Using the Proto-Vesselgraph and the generated edge ID volume, we can compute the same edge properties as Drechsler and Laura [17]. Some geometric features can be computed directly from the Proto-Vesselgraph. These include length (the arc-length of the centerline), distance (the Euclidean distance of the connected nodes) and straightness (distance/length). Drechsler and Laura [17] propose curveness (length/distance), but as the distance of any vessel segment is guaranteed to be smaller than its arc-length, we argue that straightness (with a guaranteed domain of (0, 1]) is preferable. For other features, we collect information from the edge ID volume and the Proto-Vesselgraph. For each foreground voxel in the edge ID volume, we find the corresponding edge and query the closest of its centerline point(s). For efficient lookups, the centerline points are organized in a (static, on-disk) k-d tree for each edge. We equally distribute and store the volume occupied by the foreground voxels alongside the corresponding centerline point(s). Surface voxels are assigned to the closest centerline point, for which the minimum, maximum and average surface distance is thus computed.
The values collected for individual centerline points can be aggregated to compute further per-edge features: for each of the surface distance-based values (minimum, maximum, average, roundness := minimum/maximum) we compute the mean and standard deviation, totalling 8 additional morphological features not discussed in [17]. Compared to [25], this provides a relatively coarse understanding of the local vessel morphology, but it is a good approximation that can be computed efficiently. Finally, the per-voxel volume can be accumulated to compute the total volume occupied by each vessel segment in the vessel graph. The average cross-section of a vessel is obtained by dividing the volume by the length [17].
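Once the per-voxel values have been collected, the per-edge features above reduce to simple formulas. A sketch with hypothetical inputs (the argument names are illustrative, not the pipeline's actual data structures):

```python
import math

def edge_features(centerline, node_a, node_b, min_r, max_r, volume):
    """centerline: ordered list of 3D points; min_r/max_r: minimum and
    maximum surface distance collected along the edge; volume: the
    accumulated per-voxel volume of the edge."""
    length = sum(math.dist(p, q)
                 for p, q in zip(centerline, centerline[1:]))
    distance = math.dist(node_a, node_b)
    return {
        "length": length,
        "distance": distance,
        "straightness": distance / length,     # distance <= length
        "roundness": min_r / max_r,
        "avg_cross_section": volume / length,  # volume / length [17]
    }
```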
In total, the number of centerline points is obviously bounded by n, so that the cost of building and querying the k-d trees is within O(n log n). Additionally, the label volume is accessed in 32x32x32 chunks, improving the memory locality of searches in the k-d trees (P1). All components of the Proto-Vesselgraph, the k-d trees and the created graph are stored on disk so that the main memory requirements are met (P2).

E. Refinement
Skeletonization algorithms tend to produce a number of small spurious branches, especially for vessels with irregular surfaces. We therefore propose to refine the generated vessel graph by pruning spurious branches. A simple, but scale-dependent, way of defining deletability is to set a global minimum length [19]. However, this is problematic if vessels of different scales are present in a single dataset. Instead, we propose bulge size as a scale-independent, dimensionless measure (S1). Intuitively, the bulge size measures how far a bump, bulge or branch has to extend from a parent vessel in order to be considered a separate vessel. This size is expressed relative to the radius of the parent vessel and its own, making it scale-independent. More formally, the bulge size is an edge feature that is computed during the feature extraction (Section V-D) and is only defined for bulging edges, i.e., edges that connect a leaf node (degree 1) and a branching point (degree > 2).
For all centerline points of an edge, we decide during the feature extraction whether they are within a branching region (inner points) or not (outer points). If a point has associated surface points, none of which are adjacent to surface voxels of another edge, it is considered an outer point. An inner point is characterized by either not having any associated surface voxels or by having a surface voxel that is a 26-neighbor of a point that belongs to another edge. The inner length of a node is the arc length defined by the run of centerline points starting from that node and ending at the last inner voxel. This corresponds to the length of the piece of centerline that lies within a branching area. The tip radius is defined as the minimum distance to the surface measured from the centerline point closest to the node. Without loss of generality, let n_b and n_e be the branching point node and the leaf node, respectively. This definition provides a dimensionless measure of shape that performs well even for corner cases and can be computed efficiently during the feature extraction stage of the pipeline. The calculation and applicability are illustrated in Figure 5. The pruning of spurious branches is a node-oriented, iterative approach outlined in Algorithm 1. A pruning step iterates over all nodes in the graph and collects deletable branches. All deletable edges, and the nodes that were only connected to those edges, are then removed from the graph. Additionally, two edges that share a common node of degree 2 are merged by connecting their centerlines and recomputing the properties from the attributes of the points in the combined centerlines. The pruning step is repeated until a fixed point is reached.
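The pruning loop can be sketched as follows (cf. Algorithm 1), assuming the bulge sizes have already been computed during feature extraction; the merging of edges at degree-2 nodes and the recomputation of edge properties are omitted for brevity.

```python
def prune(edges, t):
    """edges: list of (node_a, node_b, bulge_size) tuples, where
    bulge_size is None for non-bulging edges. Repeatedly remove bulging
    edges (connecting a leaf node and a branching point) whose bulge
    size falls below the threshold t, until a fixed point is reached."""
    edges = list(edges)
    while True:
        degree = {}
        for a, b, _ in edges:
            degree[a] = degree.get(a, 0) + 1
            degree[b] = degree.get(b, 0) + 1

        def bulging(a, b, size):
            return size is not None and (
                (degree[a] == 1 and degree[b] > 2) or
                (degree[b] == 1 and degree[a] > 2))

        kept = [e for e in edges if not (bulging(*e) and e[2] < t)]
        if len(kept) == len(edges):  # fixed point reached
            return kept
        edges = kept
```

Note that node degrees are recomputed each pass, so an edge that only becomes bulging after its neighbors were pruned is handled in a later pass.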
While the refined graph does not retain any signs of deleted edges, the simultaneously computed centerlines and edge properties are not preserved by the refinement procedure. This happens because the volume regions associated with now-deleted branches do not contribute to the properties of the remaining branches until the ID map and the features are recomputed. While this affects the bulge size, the pruning scheme will not erroneously delete branches, since the missing regions (lowered radii) cause the bulge size to be overestimated.

F. Iterative Scheme
In order to ensure that centerlines and edge properties match the refined graph, we employ an iterative scheme where the previously extracted and refined graph is fed back into the first stage of the pipeline to improve the generated results.
We modify the skeletonization algorithm to force it to generate a voxel skeleton with a topology that matches the refined graph of the previous iteration. All voxels of nodes with degree 1 in the graph are set to be fixed foreground during the skeletonization of the input volume. This means that they are never considered for deletion. These voxels mark the beginning/end points of lines to be extracted during the skeletonization. The skeletonization algorithm is further modified to not preserve voxels at the end of voxel lines (non-line voxels [14]). The resulting voxel skeleton connects all previously extracted end voxels with medially positioned lines.
The iteration can be stopped if it reaches a fixed point. This is the case if two consecutive iterations generate the same graph. As the number of edges never increases, the check can be reduced to a simple integer comparison. Optionally, an upper limit for the number of iterations can be specified.
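The outer loop with this fixed-point check can be sketched as below; `refine` is a placeholder for one full pass of skeletonization, graph extraction, and pruning, and the graph is reduced to its edge list for illustration.

```python
def iterate_until_fixed_point(volume, refine, max_iterations=None):
    """Feed the refined graph back into the pipeline until the result is stable."""
    graph, prev_edges, i = None, None, 0
    while max_iterations is None or i < max_iterations:
        graph = refine(volume, graph)
        # Since refinement never adds edges, two consecutive iterations
        # produce the same graph iff their edge counts are equal.
        if prev_edges == len(graph):
            break
        prev_edges = len(graph)
        i += 1
    return graph
```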

VI. EVALUATION AND DISCUSSION
In this section we evaluate and discuss the proposed pipeline in terms of the primary and secondary requirements for biomedical application (Section IV). All experiments were conducted on a consumer-grade PC (AMD Ryzen 7 2700X).

A. Runtime Scalability
The demonstration of scalability requires some way of modifying a scale parameter for a given dataset without changing its other characteristics. One possibility would be to use an already large dataset and scale it down or clip it to a smaller region. However, besides the problem of the availability of a very large volume (representing the output of future microscopes), this could cause artifacts and loss of detail due to downsampling, making the comparison of generated graphs very difficult. Instead, we use a small (real-world) volume and artificially increase its size using two strategies (see Figure 6): The resample strategy: resampling the volume with a larger resolution, using nearest-neighbor sampling to avoid changing the topology near very thin connections in the original volume.
The mirror strategy: repeating the volume in each dimension until the desired (integer) scale is achieved. In order to generate a mostly connected network, the (2i+1)-th copies in each dimension are mirrored.
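Both strategies can be sketched with NumPy; the function names are illustrative, not the paper's implementation.

```python
import numpy as np

def resample(vol, scale):
    """Resample strategy: nearest-neighbor upsampling by repeating every
    voxel `scale` times along each axis (topology-preserving)."""
    for axis in range(vol.ndim):
        vol = np.repeat(vol, scale, axis=axis)
    return vol

def mirror(vol, scale):
    """Mirror strategy: tile the volume `scale` times per axis, mirroring
    every second copy so that adjacent copies stay connected."""
    for axis in range(vol.ndim):
        flipped = np.flip(vol, axis=axis)
        copies = [vol if i % 2 == 0 else flipped for i in range(scale)]
        vol = np.concatenate(copies, axis=axis)
    return vol
```

Mirroring the odd-indexed copies makes the boundary slices of adjacent copies identical, so foreground structures touching the volume border remain connected across copies.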
In a real-world scenario, improved acquisition technology would result in a combination of both an increasing (voxel) size of previously visible vessels and the revelation of previously undetected finer networks of larger complexity. When not specified otherwise, the dataset Lymphatic 1 (Figure 17a, 135×160×213 voxels) was used for evaluation, as it has a non-tree-like topology, anisotropic voxel resolution (32 µm × 32 µm × 16 µm) and an irregular vessel shape. In the following, scale denotes the factor by which each dimension of the dataset was multiplied using one of the above strategies.
Figure 7 shows the average runtime of a single refinement iteration of the pipeline depending on the number of voxels for the mirror and resample strategies in a log-log plot. As shown, the runtime is only slightly worse than linear, as should be expected in light of the derived runtime complexity of O(n log n). Furthermore, it should be noted that the increase in slope around 10^11 voxels coincides with the exhaustion of main memory and thus a larger number of disk accesses (cf. Figure 11).
Figure 8 once again shows the average iteration runtime, but using a linear scale and with details of the individual steps of the pipeline. It can be observed that the runtime of all steps increases roughly linearly with respect to the number of voxels. Here, it becomes apparent that for the resample strategy an iteration appears to take slightly more time than with the mirror strategy. This appears to be mostly due to more time spent in the Voxel-Branch-Assignment and Feature Extraction steps, likely because larger vessels result in more time spent searching the k-d-trees in both steps. The total runtime of the pipeline, however, depends not only on a single refinement iteration, but also on the number of iterations required to compute the final result. The number of iterations required to reach a fixed point for an increasing number of voxels is depicted in Figure 9 for both scaling strategies. While the total number of iterations using the mirror strategy can be expected to be constant for scale 4 and higher (since scale 8 corresponds to 8 copies of the dataset at scale 4), the number of iterations remains at 4 independent of the scale in this experiment.

Fig. 9: The total number of iterations required to reach a fixed point for the resample and mirror strategies.

While we do not derive a maximum for the number of iterations for the resample strategy, we observe that the required number of iterations increases only moderately with the number of voxels in the experiment. While we acknowledge that the total processing time of more than one week for a 1 TB dataset is certainly far from interactive use, this does not pose a problem in biomedical research, since the time required for the preparation, image acquisition and postprocessing of a sample is of the same magnitude, and the presented pipeline is, to the best of our knowledge, the first to make the analysis of such samples possible. Furthermore, as our pipeline can be considered unbiased (see Section VI-D), interactive parameter tuning is not required.
Figure 10 shows that the total number of nodes in the graph rapidly decreases even for higher scales. The final number of nodes for all scales is very close to that at scale 1, but increases slightly for higher scales. Section VI-C and Figure 12 demonstrate that the resulting graphs are indeed similar. Thus, if constant runtime (for different datasets of a given scale) is required, a relatively low maximum number of iterations can be specified to obtain a good approximation of running until a fixed point is reached. At the same time, Figure 10 shows that a single refinement step is not sufficient to obtain meaningful results for very large datasets, as a large number of spurious branches remains. This underlines the necessity of the presented iterative refinement approach.

B. Main Memory Scalability
In the description of the pipeline we have argued that all steps of the pipeline allocate O(m^(2/3)) of memory. For that reason, and because the total allocated heap memory is difficult to measure efficiently, we demonstrate how the implementation of the presented pipeline behaves with regard to the resident set size (RSS), which specifies the portion of the memory map of a process that is actually held in main memory. This excludes allocated memory that has been swapped out to disk, but includes (sections of) files that have been mapped into memory. Figure 11 shows the maximum RSS of the process performing the graph extraction for different problem sizes (number of voxels in the volume). It can be observed that the RSS increases for larger problem sizes, but does not exceed a limit near the total available main memory of the test machine (32 GiB). This demonstrates the aforementioned property of the pipeline: it is scalable in terms of memory, but uses all of the available memory resources. Notably, the mirror strategy reaches the 32 GiB limit earlier than the resample strategy. This may be due to the fact that, by repeating the volume m times in each dimension, the total number of centerline points in the graph (and thus the memory required to store it) increases roughly by a factor of m^3, while for the resample strategy the number of centerline points is (roughly) multiplied by m.
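The intuition behind the O(m^(2/3)) bound can be illustrated with a slice-wise processing pattern: for a roughly cubic volume of m voxels, a single z-slice holds about m^(2/3) voxels, and each pass keeps only one slice (plus constant-size state) resident. This sketch is ours, not the Voreen implementation; in the actual pipeline slices would be streamed from disk (e.g. via memory mapping) rather than taken from an in-memory array.

```python
import numpy as np

def count_foreground_slicewise(volume):
    """Count foreground voxels while holding only one z-slice at a time."""
    count = 0
    for z in range(volume.shape[0]):
        slab = volume[z]                      # one m^(2/3)-sized slice
        count += int(np.count_nonzero(slab))  # O(1) state carried across slices
    return count
```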

C. Image Resolution Invariance
As a qualitative evaluation, Figure 12 demonstrates the effect of increased image resolution on current skeletonization/graph extraction approaches and how the iterative refinement approach solves this problem. For a 16-fold resolution increase along each of the coordinate axes (using the resample strategy), the number of branches and nodes increases sharply (Figure 12b) compared to the final graph extracted from the original volume (a 60-fold increase for the depicted example; Figure 12a). After 5 refinement iterations, the number of branches is reduced (Figure 12c) to a number comparable to the simple case (scale 1, Figure 12a). As the increase in resolution allows for finer-grained skeletonization, the graphs still differ in some details, but most of the structure is the same. This is confirmed by biomedical domain experts who consider the refined version appropriate for further analysis, but reject the graph without refinement.
Figure 13 shows how an increase in resolution (using the resample strategy) affects the extracted graph. Since no ground truth graph is available for the dataset, we compare all available graphs with the graphs extracted using the lowest and highest scale, respectively. When using the graph of scale 1 as a template, the edge match ratio drops relatively sharply for increasing, but still small, scales. For higher scales the difference in similarity is not as pronounced. An explanation could be that for low scales (and thus low image resolutions), skeletonization artifacts more heavily influence the final result. Small bulges with a bulge size near the specified threshold cannot be treated with as much precision as necessary and are thus deleted although they should not be, and vice versa. This hypothesis is supported by the fact that, when choosing the graph extracted from the highest-scale volume as the template for comparison, the graphs for most scales exhibit a much higher similarity.

Fig. 13: A similarity comparison between graphs of different scales using the edge match ratio [30] and the graphs extracted from the lowest- and highest-scale volumes as template graphs.
As an additional way of evaluating the image resolution invariance (P3), we want to focus on one of the primary problems of increased image resolution: the increase in surface noise at the individual voxel level. For this we generate 10 datasets using VascuSynth [31], a tool for generating volumetric images of vascular trees as well as the corresponding ground truth segmentations and tree hierarchies used here. Before graph extraction we perturb the surface of the binary volumes by iteratively selecting a random foreground or background surface voxel and flipping its value if this does not change the topology of the object. The surface noise level is defined as the number of voxel value changes divided by the total number of surface voxels. An example of a volume with applied surface noise is presented in Figure 14b. For each of the volumes, four variants with different random noise seeds were evaluated. As shown in Figure 15, the edge match ratio [30], i.e., the ratio of edges that can successfully be matched between the extracted and the ground truth graph, rapidly decreases when not using iterative refinement, while the refinement approach presented in this paper manages to reproduce the expected graph even for high noise levels, with only slight decreases in similarity. Note that we chose a bulge size of 1.5 for the iterative refinement, which is unusually low for blood vasculature. However, VascuSynth generates some branches that are very short compared to the radius of the parent vessel, and in some cases are entirely enclosed by it, so that they are only visible as very shallow bumps (or not visible at all) in the generated volume (see the highlighted region in Figure 14a as an example). For real-world applications this is likely negligible. Instead, higher robustness is likely preferable, necessitating a higher bulge size, as discussed in the following section.

Fig. 15: The edge match ratio [30] between the ground truth graph and the graph extracted from the corresponding volume [31] with and without iterative refinement. Prior to the graph extraction, salt-and-pepper noise is added to the surface (x-axis). For each noise level, minimum, maximum and average of 10 datasets with 4 surface noise seeds are shown.
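The perturbation procedure can be sketched as follows. This is a simplified version: the paper's procedure additionally rejects flips that would change the object's topology (a simple-point test omitted here), and this sketch uses wrap-around boundary handling via np.roll for brevity.

```python
import numpy as np

def surface_voxels(vol):
    """Indices of voxels that have at least one 6-neighbor of the other class."""
    surf = np.zeros(vol.shape, dtype=bool)
    for axis in range(vol.ndim):
        for shift in (1, -1):
            surf |= np.roll(vol, shift, axis=axis) != vol
    return np.argwhere(surf)

def perturb_surface(vol, noise_level, rng):
    """Flip noise_level * (#surface voxels) randomly chosen surface voxels."""
    vol = vol.copy()
    surf = surface_voxels(vol)
    # noise level = (#voxel value changes) / (#surface voxels)
    n_flips = int(noise_level * len(surf))
    for i in rng.choice(len(surf), size=n_flips, replace=False):
        x, y, z = surf[i]
        vol[x, y, z] ^= 1  # flip foreground <-> background
    return vol
```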

D. Influence of the bulge size Parameter
As explained in Section V-E, the parameter of the proposed method can be selected a priori based on the application and the desired outcome of the pipeline. The quality of the result therefore does not directly depend on the parameter. Hence, we consider our method unbiased (S1). Nevertheless, in Figure 16 we present an overview of the influence of different bulge size values on the result of the pipeline, using a test dataset with low-profile, but variable-depth bulges. As can be seen, a very low bulge size results in separate branches even for very small surface features. Increasing the parameter gradually reduces the complexity of the extracted topology. For real-world applications, the bulge size should be set a priori based on knowledge of the dataset. For example, lymphatic vessels (especially in the pathological case, illustrated in Figure 17b, but even in healthy individuals, see Figure 17a) often have comparatively short branches, but also suffer from irregular surface features, often near branching points. We therefore suggest a bulge size of 1.5, where the edge case corresponds to nearly spherical, but slightly elongated, bulges. On the other hand, (healthy) blood vasculature often exhibits a smooth surface, round diameter, and clearly defined branches, so that a bulge size of 3.0 or higher can be chosen confidently.

E. Anisotropic Resolution
The third secondary requirement states that a pipeline should be able to operate on volumes with anisotropic resolution (S3). In order to evaluate our proposed pipeline in terms of that property, we use a volume with known ground truth and isotropic voxel spacing. We then artificially create datasets with different, anisotropic resolutions by resampling the dataset in specific dimensions. As an example, confocal microscopes, due to their anisotropic point spread function, generate volumes with a comparatively low z-resolution, which is why in this experiment we chose to leave the z-axis resolution intact and increase the resolution in x- and y-direction (akin to the resample strategy). In total, we evaluated 10 datasets with resolution scale differences of 1, 2, 4, 8, and 16. We compared the extracted topology using the graph matching method and edge match ratio similarity measure proposed in [30], and the quality of the extracted centerline geometry using the NetMets framework [32]. In the case of NetMets, the average radius of all edges multiplied by 10 was chosen as the parameter σ.

Fig. 17: Some exemplary real-world datasets used for evaluation, rendered as the surface of the foreground. It is apparent that the surface and topology of the real-world lymphatic vessel datasets are much more complex than those of the synthetic blood vessel datasets (see Figure 14).
The results are shown in Figure 18. Although the ground truth is not matched perfectly and some drift between scales is noticeable (see the discussion about ground truth quality above), in general no trend towards a positive or negative impact on the evaluated score can be observed based on the level of anisotropy. Furthermore, it should be noted that the dynamic axis selection discussed in Section V-A indeed improves the result for larger anisotropy, although for smaller levels there is next to no difference between both variants. We therefore conclude that the proposed pipeline is in fact able to process data of even highly anisotropic resolution.

F. Robustness against shape deviations
For the evaluation of the robustness of the pipeline, especially with regard to deviations from the tubular shape (S2) of the analyzed vasculature, we use the robustness test introduced in [30]. Additionally, the FNR and FPR values (false negative/positive rate) of [32] were used to measure the geometric error within the framework of [30]. The (real-world) lymphatic and synthetic blood vessel datasets were rotated and resampled (with doubled resolution in each axis) in 10° steps around the z-axis, processed using the proposed pipeline, and compared to the template graph. The minimum similarity over all steps denotes the robustness index. For the synthetic datasets, the ground truth graph provided by the tool was used as a template, and the results thus also measure the accuracy of the method. For real-world datasets, the generation of an annotated ground truth graph is virtually impossible [30] and thus only the robustness can be evaluated. We compare the results without refinement to the graphs extracted using a bulge size of 1.5 and iteration until reaching a fixed point. The aggregated results in Table I show that for all properties the refinement procedure results in significantly higher robustness for all datasets. As should be expected, the total robustness for the much more complex real-world lymphatic vessel datasets is overall lower than for the (simpler) synthetic datasets. However, it is noteworthy that even when applied to complex real-world datasets, the proposed pipeline achieves a higher score than the simple, but commonly applied, approach without refinement does on simple, synthetic datasets.

Fig. 18: A demonstration of how anisotropy affects the topological (Edge Match Ratio [30]) and geometrical (Centerline [32]) structure of the extracted graph. We compare the ground truth graph of a volume generated by VascuSynth [31] to the graph extracted using the proposed pipeline after increasing the x- and y-resolution by a specified factor (but leaving the z-axis resolution unchanged) and resampling the volume. Minimum, average and maximum for 10 datasets are shown. As can be seen, anisotropic resolution does not strongly affect the pipeline and the dynamic axis selection is advantageous.
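The aggregation into a robustness index can be sketched as below; `extract_similarity` is a placeholder for a full pipeline run on the rotated volume followed by edge matching against the template, not the actual implementation of [30].

```python
def robustness_index(volume, template, extract_similarity):
    """Minimum similarity over 36 rotations of the volume in 10-degree steps."""
    return min(extract_similarity(volume, template, angle)
               for angle in range(0, 360, 10))
```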

G. Remaining Secondary Requirements
The fourth secondary requirement (S4, the ability to process volumes of arbitrary topology) directly follows from how the topology is extracted from the binary skeletonized volume (Section V-B) and, in particular, is not restricted to a tree topology, in contrast to other methods [17]. Furthermore, the pipeline is independent of the imaging conditions (S5) by operating on an existing foreground segmentation.

H. Limitations
The proposed pipeline, like all topological thinning-based methods, by definition suffers from distortions in the binary input image that cause changes to the topology, i.e., cavities in foreground objects and loops on the boundary. While these features are not expected in the actual vessel structure, imaging artifacts, noise or problems with the segmentation procedure can still produce such artifacts in practice. However, careful post-processing of the segmentation can mitigate these effects: cavities in foreground objects can reliably be removed using a modified variant of [29] operating on the background, without labeling, but removing objects smaller than a specified size. The threshold can typically be chosen very liberally, e.g., as a percentage of the volume diagonal. Removal of loops on the surface is more difficult, but in our experience a (binary) median filter performs well. The filter size should be chosen so as not to disturb the smallest vessels in the image, but still be able to cover surface loops.
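A possible realization of this post-processing is sketched below using SciPy; the function names and threshold choices are ours and remain application dependent, and unlike the streaming variant of [29] this sketch labels the whole background in memory.

```python
import numpy as np
from scipy import ndimage

def fill_cavities(vol, max_cavity_size):
    """Fill cavities: background components smaller than max_cavity_size."""
    labels, n = ndimage.label(vol == 0)  # connected background components
    sizes = np.bincount(labels.ravel())
    filled = vol.copy()
    for comp in range(1, n + 1):
        if sizes[comp] < max_cavity_size:
            filled[labels == comp] = 1
    return filled

def remove_surface_loops(vol, filter_size):
    """Binary median filter; size should cover loops but spare thin vessels."""
    return ndimage.median_filter(vol, size=filter_size)
```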

VII. CONCLUSION
Analyzing ultra-large datasets with scalable algorithms is still a huge challenge. We have presented a pipeline for extracting the topology and various features of vessel networks from large three-dimensional images. We were able to show that our method fulfills all requirements identified in Section IV. Our main contributions, a novel iterative refinement approach and careful engineering of all pipeline stages, allowed us to demonstrate the scalability and thus the applicability to very large datasets, e.g., those generated by modern microscopes (primary requirements). At the same time, our pipeline can be applied to a wide range of problem domains due to its robustness, unbiased nature, and lack of assumptions about the topology and morphology of the analyzed vessel systems (secondary requirements).
We will now apply the proposed pipeline and continue previous work [4] at a previously unachievable level of detail, which will hopefully lead to new biomedical discoveries.
In the future, we would like to explore how even more specific image features (cf. [25]) can be integrated into the pipeline without compromising its scalability. Additionally, the current version of the pipeline is entirely single-threaded. While it is possible to use the available resources of modern hardware more efficiently by simultaneously processing multiple datasets in separate processes, it would be desirable to accelerate a single run using multiple processor cores or even GPUs. However, this is a challenging task: While approaches to parallel skeletonization exist, they pose problems with regard to guarantees about the mediality of the centerline and reproducibility in general. Furthermore, at least in its current formulation, the connected-component analysis [29], which is used in variations in several places in this paper, is inherently sequential. Offloading work to the GPU requires even more attention to detail with regard to memory management. Furthermore, we would like to explore automatic segmentation approaches for vessel structures in large volumetric datasets, with a special focus on (irregular) lymphatic vessel systems, to facilitate the usability of the presented pipeline even further.
The presented pipeline is part of version 5.1 of Voreen [33], a widely used [34], open-source volume processing and rendering framework. All scripts used in the evaluation are publicly available online 2 to facilitate reproducing the presented results.

Algorithm 1 (excerpt):
17: Delete edges in D and remove orphaned nodes
18: Remove nodes with degree 2 and merge edges
19: until D = {}    ▷ Iterate until reaching a fixed point

Fig. 6: An exemplary demonstration of the resample (b) and mirror (c) strategies for increasing the volume size of a (2D) dataset (a) to scale 2, i.e., doubling the volume size in each dimension. For both strategies, the foreground (grey pixels) still forms a plausible foreground segmentation of a vessel network.

Fig. 7: Average iteration runtime of the pipeline for the resample and mirror strategies in a log-log plot. The functions c·n log n and c·n are shown as visual guides. c was chosen so that both guides match the mirror plot at the first data point.

Fig. 8: Average iteration runtime on a linear scale, with details of the individual steps of the pipeline.

Fig. 10: The total number of nodes in the graph after different numbers of iterations for the various scales.

Fig. 11: Maximum resident set size (RSS) of the graph extraction process for different problem sizes (number of voxels in the volume) for the resample and mirror strategies.
Fig. 12: A demonstration of the effect of increased image resolution on the intermediate result and how the refinement iterations mitigate this effect, using dataset Lymphatic 1: an increased image resolution greatly increases the number of erroneous branches (in this example, the total number of branches increases roughly 60-fold). After 5 refinement iterations, the resulting graph is very similar to the graph extracted from the original volume without artificially increased resolution.

Fig. 14: The foreground of one of the synthetic vessel datasets used for evaluation without (a) and with maximal noise (b) applied. In (a), a very small bump that is categorized as a separate branch according to the ground truth is highlighted.

Fig. 16: A demonstration of how different parameter values affect the result of the graph extraction pipeline. In the example, very low values allow even very minor bumps (such as the cow's teats in the model) to be considered separate branches. Increasing values gradually remove further branches, such as the belly, tail, and features of the head, until finally, for a very large value, the graph consists of a single edge.

TABLE I: Results of the robustness test [30] using 36 rotations around the z-axis for each respective dataset. G_property denotes the GERoMe index for the given property. N_FNR and N_FPR use the same perturbation procedure, but measure the geometric error [32] of the calculated centerlines and therefore show the aggregated maximum value. Refined graphs were extracted using a bulge size of 1.5 and iteration until reaching a fixed point. Relative score differences above 10% are highlighted in bold.