Volume 13 Supplement 8
Highlights of the 1st IEEE Symposium on Biological Data Visualization (BioVis 2011)
NetMets: software for quantifying and visualizing errors in biological network segmentation
 David Mayerich^{1}Email author,
 Chris Bjornsson^{2},
 Jonathan Taylor^{2} and
 Badrinath Roysam^{3}
DOI: 10.1186/1471210513S8S7
© Mayerich et al.; licensee BioMed Central Ltd. 2012
Published: 18 May 2012
Abstract
One of the major goals in biomedical image processing is accurate segmentation of networks embedded in volumetric data sets. Biological networks are composed of a meshwork of thin filaments that span large volumes of tissue. Examples of these structures include neurons and microvasculature, which can take the form of both hierarchical trees and fully connected networks, depending on the imaging modality and resolution. Network function depends on both the geometric structure and connectivity. Therefore, there is considerable demand for algorithms that segment biological networks embedded in threedimensional data. While a large number of tracking and segmentation algorithms have been published, most of these do not generalize well across data sets. One of the major reasons for the lack of generalpurpose algorithms is the limited availability of metrics that can be used to quantitatively compare their effectiveness against a preconstructed groundtruth. In this paper, we propose a robust metric for measuring and visualizing the differences between network models. Our algorithm takes into account both geometry and connectivity to measure network similarity. These metrics are then mapped back onto an explicit model for visualization.
Introduction
Threedimensional biomedical data sets often contain complex anatomical structures that are difficult to segment and reconstruct. Of particular interest are filament networks embedded in volumetric data. Examples of these include vascular and neuronal networks. With increased use of highthroughput imaging, there has been significant interest in fast and accurate segmentation algorithms for large data sets. However, segmentation of filament networks in microscopy data sets continues to be a difficult problem. While there has been an effort to distribute tracking algorithms both commercially and as open source through software packages such as the Farsight Toolkit http://www.farsighttoolkit.org, most algorithms are optimized for specific data sets and imaging modalities. In fact, funding initiatives like the DIADEM Challenge [1] have been designed to motivate researchers to create generalized segmentation algorithms that work across several data sets.
Previous work
There is an extensive body of published work on filament segmentation in biomedical data sets. A review of vessel extraction techniques for vascular trees is given by Kirbas and Quek [2] and a review of neuronal tracing methods is available by Donohue and Ascoli [3]. Semiautomated techniques are often used in commercial software and recent methods have been proposed to significantly speed up skeletonization based on user input [4]. Thinning algorithms are often used for segmentation of highcontrast images [5–7]. While these generally rely on local or global thresholds, recent methods have been proposed that are threshold independent [8] and more robust to surface perturbations [9]. Filament tracking techniques have been developed for lowcontrast data, such as neurons in confocal data sets, and rely on the intensity gradient across the filament crosssection [10–12]. Several techniques have also been proposed for making segmentation more robust. These include multihypothesis tracking [13] and postprocessing [14, 15].
While new and improved segmentation techniques are proposed yearly, there are few methods for quantitatively comparing results to an established ground truth. The DIADEM metric [16, 17] and Path2Path [18] are the only quantitative technique that we are aware of specifically designed to compare neurons. However, we also consider other metrics and measurements that could be applied to this problem. In the following sections, we describe some alternatives for evaluating differences in explicit networks. We place a particular focus on metrics previously developed for validation in neuronal and vascular tree segmentation. These potential metrics are organized into two groups: geometric methods and topological methods.
Geometric methods
The most basic approach for comparing an explicit representation is the use of standard geometry metrics, such as those used for validation in surface segmentation and surface simplification (levelofdetail). An overview of these methods as they are applied to geometric models is provided by Luebke [19]. While these metrics are designed to evaluate twodimensional surfaces, applying them to interconnected networks of onedimensional centerlines is straightforward. Two of the most common examples of these geometric techniques are the mean squared error (MSE) and the Housdorff distance. In addition, the Path2Path algorithm [18] was recently proposed as a geometric approach specifically designed for comparing similarity between neurons.
We first consider the mean squared error, which is computed by averaging the square of minimum distances from one geometric model A to another geometric model B. In the ideal case A = B and therefore MSE(A, B) = 0. One disadvantage of this measurement is that it is not commutative, meaning that generally MSE(A, B) ≠ MSE(B, A). The Hausdorff distance addresses this issue by computing a mutual metric that is the maximum of minimum distances between A and B. This metric is commutative, however the metric weighting is based solely on the largest distance between A and B. This can easily result in ignoring more relevant errors that occur near the ground truth. This also makes the Hausdorff distance, and to a lesser degree the MSE, sensitive to spurious geometric components that may appear some distance from the ground truth model. Finally, both of these metrics are insensitive to errors in connectivity, which can exist independently to the network shape.
The Path2Path metric was proposed as a method for comparing the geometric characteristics of a neuron in order to facilitate queries into online neuronal databases. This metric forgoes the standard graph representation of a neuronal tree in favor of a collection of geometric paths that extend from the root node to the end of each neuronal process.
The neurons are then compared by determining the amount of energy required to optimally morph the set of paths in A to match the set of paths in B. Since the connectivity is discarded, Path2Path is essentially a geometric measure. However, since many paths in the resulting set overlap, errors are difficult to spatially localize.
The geometry metric proposed in this paper is based on a similar principle to the MSE. The lack of commutativity is addressed by computing a bidirectional measurement, which is incorporated into the geometric FPR and FNR. The distance sensitivity is eliminated by scaling the error between A and B by an inverse nonlinear Gaussian function that approaches unity at a relatively short distance from the model. In addition, this allows for errors to be spatially localized and the resulting metric can be visualized within a narrow constant range of 0[1].
Topological methods
Recent methods proposed for validating treelike segmentations of vasculature and neurons are based on topological approaches. These methods leverage the hierarchical structure of the model in order to quantify segmentation error. Unlike geometric approaches, these methods account for connectivity and can also incorporate some basic geometric information. The two methods that we address are the constrained Tree Edit Distance (TED) [20] and the DIADEM Metric [16].
The constrained TED provides a metric that identifies the number of edits that must be performed on a given model A in order for it to topologically match a second model B. These edits take the form of node insertions and deletions. This method has been proposed as a technique for quantifying the difference between two neuronal models [21]. However, the TED is not geometrically specific, since the metric depends only on branch position within the hierarchy relative to a root node and is independent of the spatial position and shape of structures in the tree.
The DIADEM Metric incorporates geometric characteristics of the two models in order to better map branches in the test case model to corresponding geometry in the ground truth. Branch points and end points, for example, are mapped between the models A and B using a proximity query. This constraint makes the topological analysis significantly more efficient while allowing a direct mapping between fibers (edges) in A and B. In addition, the path length for each fiber can then be directly compared in order to modify the metric based on geometric deviations of the fibers from the ground truth.
The connectivity metric that we propose relies on mapping of branch points and end points between the ground truth and test case, which is similar to the initial approach taken by the DIADEM metric. We then use a graph traversal method to evaluate connectivity locally [22], which allows our algorithm to compare nonhierarchical and interconnected networks. By removing the dependence on a hierarchical network model, our algorithm is also robust to connectivity errors that include gaps in the network. In addition, the NetMets software allows the error to be localized and visualized on a rendered image of the ground truth model (Figure 2b). Finally, we demonstrate that our algorithm can detect the deformation of a single fiber in the test case (Figure 2c). Since the topology of the network is still correct, this error is undetected using the constrained TED. Since the fiber length is maintained, this geometric error is also undetectible using the DIADEM metric. These errors are easily located and visualized by identifying mapped edges with a high geometric error in NetMets.
Proposed methods
The method that we propose compares both the geometry and connectivity of two interconnected networks. Based on a single parameter σ, defining the sensitivity of the metric, our algorithm returns four normalized values characterizing the degree of similarity between two input networks. These metrics are then mapped onto the original input models so that differences between the networks can be visualized. In all cases presented here, the parameter σ is set to the mean fiber radius, however other values can be used. Higher values of σ result in a decrease in the detected error (FNR and FPR).
In the following sections, we describe the input to our proposed algorithm and define the terms used to process network models.
Input models
The most common format for storing traced neurons is the SWC file. SWC files are supported by popular network simulation programs, including NEURON [23] and NETMORPH [24]. In addition, SWC is the most common format found in online neuron libraries used for connectomics research, including visualization [25] and simulation [26, 27]. An SWC file represents neurons as a sequence of 3D points, along with their parent point. All points extend from the root node of the neuron, which is generally the cell body. While the SWC file is sufficient for representing trees, closed loops cannot be formed. Our algorithm supports treelike structures loaded using the SWC format as well as treelike and closedloop models represented using the Alias/Wavefront OBJ file format.
Terminology
When describing connectivity operations, we use the following terminology:

A node is a junction where multiple fibers connect, or where a single fiber terminates.

An edge is a filament that links two nodes.

A point is a threedimensional position that lies on the network skeleton.
Note that a fiber can consist of multiple points that describe its geometric shape. While these points are used to evaluate the network geometry, a fiber is represented topologically by a single edge (Figure 1). In addition, we recognize two primary connectivity errors:

Gaps in fibers.

Excess edges forming loops or spurs.
Note that a spur connected to correctly segmented geometry is identified as an error even though it is not strictly a topological change.
Overview
The metric proposed in this paper provides false positive and false negative rates for network geometry and connectivity. The proposed geometry metric integrates a weighted distance function along all curves in a network. We show that this can be evaluated efficiently in $O\left(\frac{L}{\sigma}\text{log}\frac{L}{\sigma}\right)$ time, where L is the length of all fibers in the network and σ is a sensitivity parameter.
We measure connectivity differences by using geometric information to map between nodes and edges in both networks. We then find a set of edges and nodes common to the ground truth and test case. Excess features are then used to quantify the connectivity differences between the two networks.
Geometry
In this section, we first identify the fundamental problem with applying the standard geometry metrics described previously to network segmentations. We then describe the method used by NetMets to address these issues. Error metrics such as MSE and the Hausdorff distance provide a global measure of model similarity, which is ideal when constructing a mesh based on a source model. However, these techniques are not robust for fibrous models, and often apply excessive penalties for relatively small errors. Consider a test case T that perfectly matches the ground truth GT for a neuron, except for a small length of fiber some distance ε from the cell. When measuring the mean L1distance from T to GT, the networks will appear identical. However, computing the Hausdorff distance will result in a value of ≈ ε. Measuring the mean L 1distance from GT to T also provides a result of ≈ ε where L is the length of the spurious segment. In both cases, changing the distance ε significantly affects the error, even though the distance of the spurious fiber is likely irrelevant to improving the segmentation algorithm used. A more intuitive metric would scale some constant value by the length L of the detected segment. However, global application of a distance threshold minimizes the impact of errors that occur close to the network, such as oscillations and gaps in fibers. Given two networks N_{1} and N_{2}, our proposed algorithm estimates the ratio of the length of fiber in N_{1} that has no correspondence in N_{2} to the total fiber length in N_{1}. This estimate is computed by placing an implicit Gaussian envelope around N_{2} and integrating along the set of curves representing fibers in N_{1}. In order to quantify both missed fibers and false positives, we perform a bidirectional measurement, comparing N_{1} to N_{2} as well as comparing N_{2} to N_{1}.
In the following sections, we describe common geometric errors encountered in network segmentation. We then describe the theory behind our proposed measurement as well as implementation details and methods for improving accuracy.
Common geometry errors
Errors in segmentation consist of both undetected and spurious fibers as well as deformations in fibers. Thinning algorithms [5] are sensitive to variations in the fiber surface, resulting in spurs that are not present in the groundtruth. This becomes more prominent when highfrequency surface features are present, such as dendritic spines in images of neurons. Since many segmentation algorithms require thresholding as a preprocessing step, noise and other artifacts can create spurious loops or gaps in the geometry.
Tracking methods [10, 12] are more robust to many of these errors, however they rely on seed points for fiber detection. Incorrect placement of seed points can cause entire fibers to be missed. In addition, tracking algorithms are more stable, often causing false fibers to be significantly longer. This makes them more difficult to locate and remove using postprocessing. Finally, variations in the image can cause segmented fibers to oscillate and deviate from the corresponding fibers in the ground truth.
Geometry metric
where N_{1}∩N_{2} is the set of fibers common to both networks and integration refers to the length of fiber in the corresponding set. Since this metric does not consider the topological structure of each network, it may be helpful to think of N_{1} and N_{2} as the sets of all points that lie on the curves representing the corresponding network. However, it is impractical to evaluate the statement N_{1}∩N_{2} for an explicit model since it is extremely unlikely that fibers in N_{1} and N_{2} will precisely overlap.
where d(x, N_{2}) is the distance between x and the closest point in N_{2} and σ is a sensitivity parameter. Intuitively, this is the equivalent of placing a Gaussian envelope around N_{2} and weighting the points in N_{1} based on their position within this field. The value σ is the standard deviation of this Gaussian envelope.
where G_{ FNR } is the false negative rate, G_{ FPR } is the false positive rate, and N_{ GT } and N_{ T } are the groundtruth and testcases respectively.
Note that Equation 3 can be determined for subsets of the network. The metric value for a single point is used for visualization while integration along a single fiber is used to determine weights for the connectivity metric.
Implementation
M is evaluated by determining a set of points that lie on each network and resolving the distance function using a nearestneighbor search. The explicit models are resampled at intervals of at most εσ, where ε is a parameter describing the degree of accuracy of the geometry measurement. Resampling is performed using linear interpolation, since this is the standard for representing neuronal models. However higherorder interpolants can be used by resampling the network model as a preprocessing step.
The distance function d(x, N_{2}) is evaluated using a nearestneighbor search. The sample points for N_{2} are stored in a kdtree [28, 29] and successive queries for all x ∈ N_{1} are used to determine the distance to the closest point on N_{2}. This distance is then used to evaluate the geometry metric (Equation 3).
Accuracy
The accuracy of the geometry metric is of particular interest since the distance function d(x, N) (Equation 3) is determined using a nearestneighbor search, and therefore dependent on the spacing between sample points. In addition, the distance field is scaled by a nonlinear Gaussian function. While this reduces the potential error at a distance, the error for small values of d is weighted more heavily. We show that the error in M can be tightly bound using a gridbased sampling technique to determine points on N_{1} and N_{2}.
for all ε ≤ 1. This upper error bound occurs at a distance of $\frac{\epsilon \sigma}{2}$ (Figure 3) and the resulting error in M decreases as d(x, N) → 0. By using a value of $\epsilon =\frac{1}{10}$, the largest error ${E}_{M}<\frac{1}{1000}$.
Connectivity
Because of this complexity, we inform our connectivity metric using key pieces of geometric information. In particular, our algorithm creates a mapping between detected nodes in the testcase to those in the groundtruth. In the ideal case where all nodes in the ground truth are detected, comparing connectivity is a trivial matter of finding edges that are present in both the groundtruth and testcase. However, this is insufficient when nodes are present in only one network. In particular, undetected or falsely detected nodes cause edges to become subdivided or result in topological changes (Figure 4). This removes the onetoone correspondence of edges between the groundtruth and testcase. Our proposed metric estimates the number of undetected and falsely detected edges and vertices. In addition, we create a mapping between edge sequences in both networks, allowing interactive visualization of detected paths between nodes.
Common connectivity errors
Common connectivity errors include additional edges and gaps. In the case of thinning algorithms, these edges are often due to highfrequency noise which form loops or spines on existing fibers. Many segmentation methods produce small gaps in fibers, forming multiple discontinuous segments in place of a single continuous fiber. These errors are difficult to detect geometrically, since disconnected points can occupy the same spatial position in the geometric model. The purpose of the proposed connectivity metric is to give equal weight to graph edges, independent of the length of the associated fibers.
Connectivity metric
where FN is the number of edges in the groundtruth that are not represented in the test case, FP is the number of edges in the testcase that do not exist in the groundtruth, and TP is the total number of correctly detected edges.
Graph initialization
Each node is initialized with a threedimensional coordinate from the explicit model. Each node v ∈ V_{ T } ∪ V_{ GT } is then assigned a color value based on the geometric positions of nodes in the groundtruth. The vertex color is a unique identifier that links nodes in both graphs that correspond to the same geometric feature in the original data set. A negative color value indicates that a node exists in one graph, but not in the other. To clarify, nodes that exist in both graphs are assigned a color value C(v) ≥ 0 and are referred to as colored nodes. Nodes that exist in only one graph are uncolored and have a color value of C(v) = 1.
The color of each vertex is a unique identifier indicating the nearest node in G_{ GT }. A negative value is assigned if there are no nodes within a distance of σ. This color value provides the basis for comparing connectivity between the two networks.
where e is the corresponding fiber length and M(e) is the geometry metric for the associated fiber. The value for M(e) is determined by integrating the geometry metric along a single fiber.
Core connectivity
We define the core connectivity of a graph G = {V, E} as the graph G_{ c } = {V_{ c }, E_{ c }}, where V_{ c } includes all nodes v ∈ V where C(v) ≥ 0 and E_{ c } represents paths between elements of V_{ c } consisting of only uncolored nodes. Alternatively, given a graph G with colored nodes, we place an edge e = (v_{ i }, v_{ j }) in G_{ c } if there is a corresponding path in G between ${\widehat{v}}_{i}$ and ${\widehat{v}}_{j}$, where $C\left({v}_{i}\right)=C\left({\widehat{v}}_{i}\right)$ and $C\left({v}_{j}\right)=C\left({\widehat{v}}_{j}\right)$, that contains only uncolored nodes.
Comparison
Implementation
Graph coloring is performed using a nearestneighbor search, similar to the method described in Section . This search requires a maximum O(N log N) time, where N is the number of nodes in the largest graph. Computing the local neighborhood is incorporated into the minimum path algorithm by ending the current search iteration when a node with C(v) ≥ 0 is discovered. Therefore, no paths passing through a colored node are considered.
The shortestpath algorithm has a time complexity of $O\left(\widehat{E}+\widehat{V}\text{log}\widehat{V}\right)$[32], where $\widehat{E}$ and $\widehat{V}$ are the number of edges and nodes in the local neighborhood. This search must be computed for every node in a graph, resulting in a time complexity of $O\left(N\widehat{V}\text{log}\widehat{V}\right)$. Biological networks are believed to be scalefree [33], therefore the number of edges per node is expected to be small. A poor segmentation can produce a network with a significant number of nodes, resulting in a timeconsuming analysis, therefore this complexity must be considered. In practice, however, the local neighborhood tends to be small. For these cases, $\widehat{V}\text{log}\widehat{V}$ can be considered constant.
Visualization
The geometry and connectivity metrics proposed in this paper provide a global measure for comparing interconnected networks. However, one of the principle advantages of the proposed algorithm is the ability to localize geometric and connectivity errors. If properly visualized, this can allow developers to quickly identify cases where segmentation algorithms fail and provide insight into improving algorithms. In this section, we describe techniques that we have employed to visualize the differences between networks.
Several methods have been proposed for visualizing fiber structures, particularly in the field of diffusion tensor MRI. The most common methods use streamlines and stream tubes [34]. Methods for visualizing networks by applying orientation filters [35] have been proposed. In addition, selective visualization of volumetric data [36] has been used to render networks with similar structure to those described in this paper. The rendering methods that we use are inspired by recent techniques for rendering highdimensional functions using threedimensional connected lattices [37].
Color mapping
Divergent color mapping (Figure 8d) provides an intuitive ordering from cool to warm colors and can be used with shading, making it useful for surfacemapping on threedimensional models. In addition, recent work by Borkin et al. [39] has shown that diverging color schemes significantly improve the interpretation of scalar data on tubelike structures when compared to rainbow color mapping. The NetMets software supports all four color mapping methods, with the bluered divergent color scheme as the default.
Geometry
We visualize geometric differences by mapping the geometry metric directly onto the explicit representation of the original models. The global geometry metric described earlier is evaluated by integrating along the curves representing fibers in the network model N_{1}. The resulting value provides a very general measure of the average distance between N_{1} and another network N_{2}. For visualization, the value at each point on N_{1} is used to highlight the specific differences in geometry between N_{1} and N_{2}.
Connectivity
The concept of network connectivity is significantly more abstract, since a onetoone correspondence between fibers in N_{1} and N_{2} often does not exist. For example, fibers that are subdivided (Figure 4) can result in a single fiber in N_{1} being mapped to multiple fibers in N_{2}. This can also result in multiple fibers in N_{1} overlapping when mapped to N_{2}. In the case of spurious or undetected fibers, a mapping between N_{1} and N_{2} does not exist.
We use several methods to visualize errors in connectivity. First of all, undetected or spurious nodes are rendered as red spheres, while detected nodes are gray. This simple strategy is used to visualize regions where connectivity errors are frequently made. Where a mapping exists between edges in the ground truth and test case, corresponding edges are colorcoded. This is shown for both hierarchical (Figure 9) and interconnected (Figure 10) proxy data. The mapping is based on common edges found in the core connectivity graphs for each network (Figure 6). Edges that did not exist in the core connectivity graph, or were later removed, are rendered in gray.
Results
In this section, we demonstrate how the NetMets software can be used to compare explicit interconnected networks in several cases relevant to current research needs. We first show how NetMets can be used evaluate the performance of an automated segmentation algorithm on a data set distributed as part of the DIADEM Challenge. We then evaluate the performance of the same algorithm on fluorescence microscopy data. Next, we show that NetMets can be useful for comparing different manual tracings of the same network structure. Finally, we demonstrate how the bidirectional measurement used by our proposed metric algorithm can be useful in evaluating segmentation effectiveness when only an incomplete ground truth is available.
Evaluating segmentation algorithms
One of the primary motivations for this work is to provide a quantitative method for evaluating the performance of segmentation algorithms as well as an intuitive visualization approach for identifying where segmentation errors arise in the data. We show how NetMets is suited for this task by performing automated segmentation of two data sets. The first data set is a brightfield microscopy image of a series of cerebellar climbing fibers. This data set is distributed through the DIADEM Challenge [40] (Dataset 1) and is available online http://www.diademchallenge.org. The model used as the ground truth was created for the DIADEM Challenge and is distributed with the data. The second data set is mouse brain tissue imaged using a confocal microscope. The data set contains a network of astrocytes in the neighborhood of a blood vessel. The ground truth was manually constructed using Neuromantic http://www.reading.ac.uk/neuromantic/.
Automated segmentation was performed using ridge detection, followed by dilation and topologypreserving thinning [5] to produce an implicit skeleton. Voxels composing the skeleton were then explicitly connected and smoothed to perform the final model. While superior algorithms have been described [41–45], a basic algorithm is useful in this case since our goal is to demonstrate how errors can be quantified and localized.
Segmentation of the cerebellar climbing fibers data set produces a reasonable model of the prominent geometric features (Figure 11). The results of the metric indicate that approximately 18% of the ground truth geometry is missed by the segmentation while approximately 72% of the ground truth connections are undetected. Close inspection shows that missed regions correspond to short fibers (Figure 11d) or fibers with a close proximity to detected fibers (Figure 11f). The large number for falsenegative connectivity corresponds to the significant number of small fibers missed by the automated algorithm (Figure 11h). Since this data set is directly supported by the DIADEM Metric (D = 1) and the resulting model is treelike, we can evaluate the DIADEM Metric as M = 0.315. For our comparison, we use a standard deviation of σ = 10 pixels. Note the additional information provided by NetMets and the ability to visualize the errors on the network models.
Comparing manual segmentations
While a significant amount of current research in the area of neuronal segmentation is directed toward fullyautomated reconstruction, building an accurate groundtruth can also be a difficult problem. This is particularly true for extremely dense and complex data sets, where manual segmentation is a timeconsuming process that introduces fatigue in the experts producing the desired model. Recent work by Helmstaedter et al. [46] demonstrate a method for overcoming this problem by using multiple experts to trace a single data set. The models for each expert can then be used to produce a more accurate groundtruth.
Subgraph comparison
With the development of new highthroughput imaging methods, the size and complexity of data sets can make it impractical to construct a complete ground truth. One possible solution to this problem is to manually label small subsets of the raw data. However, thorough validation of a large data set would require manual labeling of several small subsets to provide a statistically viable sample size. This often makes manual tracing more complex by introducing artificial fiber terminations at the boundaries of these subsets. Since current tools like Neuromantic allow semiautomated tracing, it is often easier to manually trace long fibers than to start and terminate several small ones. It would therefore be convenient to create a ground truth that represents a subset of complete fibers in the data set. However, this would cause properly segmented fibers to be incorrectly labeled as falsepositives when there is no corresponding segmentation in the groundtruth. One of the advantages of using a bidirectional measurement like the one we have proposed is that this case can be, to some extent, recognized and corrected.
Conclusion and future work
In this paper, we propose robust methods for quantifying and visualizing differences in interconnected fiber networks. This work is motivated by the need to validate segmentation algorithms for interconnected networks in biomedical imaging. As biomedical data sets increase in size and complexity, qualitative comparison has become insufficient to address this issue. The techniques that we propose build on the quantification principles introduced by the DIADEM Challenge [16] and serve as a basis for building visualization tools that extend both quantitative and qualitative validation to research on largescale biological networks.
Current advances in highthroughput imaging are motivating research into robust and generalized segmentation algorithms, which are particularly useful in the field of connectomics [26]. Opportunities exist for finding common segmentation errors across a large network. In particular, correlating connectivity and geometric errors with features in the original data set could help train segmentation algorithms. In addition, our algorithm requires a qualified groundtruth in order to perform the comparison, which is difficult to create for large data sets. We demonstrate that subsets of the ground truth can be used to estimate the effectiveness of a segmentation algorithm, however this is based on culling higherror fibers from the test case. This can result in the exclusion of fibers that are inaccurately segmented, resulting in overestimation of the algorithm's performance. This is something that may be addressed with more complex culling algorithms based on other fiber features such as length and connectivity.
In addition, more accurate edge mapping between the ground truth and test cases would be useful for visualization, since one of the more common errors we have found in our tracing examples are fibers that terminate early, putting them out of range of the corresponding ground truth end node. While this is taken into account in the geometry metric, it can provide for confusing visualization when exploring the connectivity graph.
Finally, previous methods such as the DIADEM Metric provide advantages that may improve the effectiveness of our proposed algorithms. In particular, the use of fiber length as a geometric measurement can capture errors that are not recognized by our algorithm, such as erroneous fibers that are in close proximity to actual geometry. The NetMets software is available online as open source at http://www.davidmayerich.net/software.
List of abbreviations used
 MSE:

Mean Squared Error
 DIADEM:

Digital Reconstruction of Axonal and Dendritic Morphology
 FPR:

False Positive Rate
 FNR:

False Negative Rate
 TED:

Tree Edit Distance
 FN:

False Negative
 FP:

False Positive
 KESM:

KnifeEdge Scanning Microscopy.
Declarations
Acknowledgements
This work was funded in part by the Beckman Institute for Advanced Science and Technology and the Center for Biotechnology and Interdisciplinary Studies.
This article has been published as part of BMC Bioinformatics Volume 13 Supplement 8, 2012: Highlights of the 1st IEEE Symposium on Biological Data Visualization (BioVis 2011). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S8.
Authors’ Affiliations
References
 Mann A: Teams battle for neuron prize. Nature 2010, 467(7312):143. 10.1038/467143aView ArticlePubMedGoogle Scholar
 Kirbas C, Quek F: A review of vessel extraction techniques and algorithms. ACM Computing Surveys 2004, 36: 81–121. 10.1145/1031120.1031121View ArticleGoogle Scholar
 Donohue D, Ascoli G: Automated reconstruction of neuronal morphology: an overview. Brain research reviews 2010, 67(1–2):94–102.PubMed CentralView ArticlePubMedGoogle Scholar
 Abeysinghe SS, Ju T: Interactive skeletonization of intensity volumes. The Visual Computer 2009, 25(5–7):627–635. 10.1007/s0037100903255View ArticleGoogle Scholar
 He X, Kischell E, Rioult M, Holmes TJ: ThreeDimensional Thinning Algorithm that Peels the Outmost Layer with Application to Neuron Tracing. Journal of ComputerAssisted Microscopy 1998, 10(3):123–135. 10.1023/A:1023437403657View ArticleGoogle Scholar
 Niki N, Kawata Y, Satoh H, Kumazaki T: 3D imaging of blood vessels using Xray rotational angiographic system. Journal of ComputerAssisted Microscopy 1993, 3: 1873–1877.Google Scholar
 Tozaki T, Kawata Y, Niki N, Ohmatsu H, Eguchi K, Moriyama N: Threedimensional analysis of lung areas using thin slice [CT] images. Journal of ComputerAssisted Microscopy 1996, 2709: 1–11.Google Scholar
 Ju T, Warren J, Eichele G, Thaller C, Chiu W, Carson J: A geometric database for gene expression data. Proceedings of the 2003 Eurographics/ACM SIGGRAPH symposium on Geometry processing 2003, 166–176.Google Scholar
 Liu L, Chambers EW, Letscher D, Ju T: A simple and robust thinning algorithm on cell complexes. Computer Graphics Forum 2010, 29(7):2253–2260. 10.1111/j.14678659.2010.01814.xView ArticleGoogle Scholar
 AlKofahi K, Lasek S, Szarowski D, Pace C, Nagy G, Turner J, Roysam B: Rapid Automated ThreeDimensional Tracing of Neurons From Confocal Image Stacks. IEEE Transactions on Information Technology in Biomedicine 2002, 6: 171–186. 10.1109/TITB.2002.1006304View ArticlePubMedGoogle Scholar
 AlKofahi Y, Lassoued W, Lee W, Roysam B: Improved Automatic Detection and Segmentation of Cell Nuclei in Histopahtology Images. IEEE Transactions on Biomedical Engineering (in press) 2010.Google Scholar
 Mayerich D, Keyser J: Hardware Accelerated Segmentation of Complex Volumetric Filament Networks. IEEE Transactions on Visualization and Computer Graphics 2009, 15(4):670–681.View ArticlePubMedGoogle Scholar
 Friman O, Hindennach M, Kühnel C, Peitgen H: Multiple hypothesis template tracking of small 3D vessel structures. Medical Image Analysis 2010, 14(2):160–171. 10.1016/j.media.2009.12.003View ArticlePubMedGoogle Scholar
 Mayerich D, Kwon J, Choe Y, Abbott L, Keyser J: Constructing High Resolution Microvascular Models. Third Workshop on Microscopic Image Analysis with Applications in Biology 2008.Google Scholar
 Kaufhold J, Tsai PS, Blinder P, Kleinfeld D: Threshold relaxation is an effective means to connect gaps in 3D images of complex microvascular networks. Third Workshop on Microscopic Image Analysis with Applications in Biology 2008.Google Scholar
 Gillette TA, Brown KM, Ascoli GA: The DIADEM Metric: Comparing Multiple Reconstructions of the Same Neuron. Neuroinformatics 2011. [http://www.springerlink.com/content/n77666586741q811/]Google Scholar
 Brown KM, Barrionuevo G, Canty AJ, Paola V, Hirsch JA, Jefferis GSXE, Lu J, Snippe M, Sugihara I, Ascoli GA: The DIADEM Data Sets: Representative Light Microscopy Images of Neuronal Morphology to Advance Automation of Digital Reconstructions. Neuroinformatics 2011. [http://www.springerlink.com/content/47q03g2r25405n24/]Google Scholar
 Basu S, Condron B, Acton S: Path2Path: Hierarchical PathBased Analysis for Neuron Matching. IEEE Symposium on Biomedical Imaging 2011, 996–999.Google Scholar
 Luebke DP: Level of detail for 3D graphics. Morgan Kaufmann; 2003.Google Scholar
 Zhang K: A constrained edit distance between unordered labeled trees. Algorithmica 1996, 15(3):205–222. 10.1007/BF01975866View ArticleGoogle Scholar
 Heumann H, Wittum G: The TreeEditDistance, a Measure for Quantifying Neuronal Morphology. Neuroinformatics 2009, 7(3):179–190. 10.1007/s1202100990514View ArticlePubMedGoogle Scholar
 Mayerich D, Bjornsson C, Taylor J, Roysam B: Metrics for Comparing Explicit Representations of Interconnected Biological Networks. IEEE Symposium on Biological Data Visualization 2011, 79–86.Google Scholar
 Carnevale NT, Hines ML: The NEURON Book. 1st edition. Cambridge University Press; 2009.Google Scholar
 Koene RA, Tijms B, Hees P, Postma F, Ridder A, Ramakers GJA, Pelt J, Ooyen A: NETMORPH: A Framework for the Stochastic Generation of Large Scale Neuronal Networks With Realistic Neuron Morphologies. Neuroinformatics 2009, 7(3):195–210. 10.1007/s1202100990523View ArticlePubMedGoogle Scholar
 Lasserre S, Hernando J, Hill S, Schuermann F, Miguel Anasagasti P, Jaoud E GA, Markram H: A Neuron Membrane Mesh Representation for Visualization of Electrophysiological Simulations. IEEE Transactions on Visualization and Computer Graphics 2011. [PMID: 21383404] [http://www.ncbi.nlm.nih.gov/pubmed/21383404] [PMID: 21383404]Google Scholar
 Sporns O, Tononi G, Kötter R: The Human Connectome: A Structural Description of the Human Brain. PLoS Computational Biology 2005., 1(4):
 Markram H: The blue brain project. Nature Reviews Neuroscience 2006, 7(2):153–159. 10.1038/nrn1848View ArticlePubMedGoogle Scholar
 Friedman JH, Bentley JL, Finkel RA: An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software (TOMS) 1977, 3(3):209–226. 10.1145/355744.355745View ArticleGoogle Scholar
 Bentley JL: Kd trees for semidynamic point sets. Proceedings of the sixth annual symposium on Computational geometry 1990, 187–197.View ArticleGoogle Scholar
 Cook SA: The complexity of theoremproving procedures. Proceedings of the third annual ACM symposium on Theory of computing 1971, 151–158.Google Scholar
 Dijkstra EW: A note on two problems in connexion with graphs. Numerische Mathematik 1959, 1: 269–271. 10.1007/BF01386390View ArticleGoogle Scholar
 Fredman M, Tarjan R: Fibonacci Heaps And Their Uses In Improved Network Optimization Algorithms. Foundations of Computer Science, Annual IEEE Symposium on, Volume 0, Los Alamitos, CA, USA: IEEE Computer Society 1984, 338–346.Google Scholar
 Barabási A, Albert R: Emergence of Scaling in Random Networks. Science 1999, 286(5439):509–512. 10.1126/science.286.5439.509View ArticlePubMedGoogle Scholar
 Stoll C, Gumhold S, Seidel H: Visualization with stylized line primitives. Proceedings of IEEE Visualization 2005, 695–702.Google Scholar
 Melek Z, Mayerich DM, Yuksel C, Keyser J: Visualization of Fibrous and Threadlike Data. IEEE Transactions on Visualization and Computer Graphics 2006, 12: 1165–1172.View ArticlePubMedGoogle Scholar
 Mayerich D, Keyser J: Visualization of Cellular and Microvascular Relationships. IEEE Transactions on Visualization and Computer Graphics 2008, 14(6):1611–1618.View ArticlePubMedGoogle Scholar
 Gerber S, Bremer P, Pascucci V, Whitaker R: Visual exploration of high dimensional scalar functions. Visualization and Computer Graphics, IEEE Transactions on 2010, 16(6):1271–1280.View ArticleGoogle Scholar
 Borland D, Taylor II R: Rainbow color map (still) considered harmful. IEEE computer graphics and applications 2007, 14–17.Google Scholar
 Borkin M, Gajos K, Peters A, Mitsouras D, Melchionna S, Rybicki F, Feldman C, Pfister H: Evaluation of Artery Visualizations for Heart Disease Diagnosis. Visualization and Computer Graphics, IEEE Transactions on 2011, 17(12):2479–2488.View ArticleGoogle Scholar
 Gillette TA, Brown KM, Svoboda K, Liu Y, Ascoli GA: DIADEMchallenge.Org: A Compendium of Resources Fostering the Continuous Development of Automated Neuronal Reconstruction. Neuroinformatics 2011. [http://www.springerlink.com/content/2903453q0w74553w/]Google Scholar
 Wang Y, Narayanaswamy A, Tsai C, Roysam B: A broadly applicable 3D neuron tracing method based on opencurve snake. Neuroinformatics 2011, 9(2–3):193–217. [PMID: 21399937] [PMID: 21399937] 10.1007/s1202101191105View ArticlePubMedGoogle Scholar
 Chothani P, Mehta V, Stepanyants A: Automated Tracing of Neurites from Light Microscopy Stacks of Images. Neuroinformatics 2011, 9(2–3):263–278. 10.1007/s1202101191212PubMed CentralView ArticlePubMedGoogle Scholar
 Zhao T, Xie J, Amat F, Clack N, Ahammad P, Peng H, Long F, Myers E: Automated reconstruction of neuronal morphology based on local geometrical and global structural models. Neuroinformatics 2011, 9(2–3):247–261. [PMID: 21547564] [PMID: 21547564] 10.1007/s1202101191203PubMed CentralView ArticlePubMedGoogle Scholar
 Turetken E, González G, Blum C, Fua P: Automated Reconstruction of Dendritic and Axonal Trees by Global Optimization with Geometric Priors. Neuroinformatics 2011, 9(2–3):279–302. 10.1007/s1202101191221View ArticlePubMedGoogle Scholar
 Bas E, Erdogmus D: Principal curves as skeletons of tubular objects: locally characterizing the structures of axons. Neuroinformatics 2011, 9(2–3):181–191. [PMID: 21336847] [PMID: 21336847] 10.1007/s1202101191052View ArticlePubMedGoogle Scholar
 Helmstaedter M, Briggman KL, Denk W: Highaccuracy neurite reconstruction for highthroughput neuroanatomy. Nat Neurosci 2011, 14(8):1081–1088. 10.1038/nn.2868View ArticlePubMedGoogle Scholar
 Mayerich D, Kwon J, Sung C, Abbott L, Keyser J, Choe Y: Fast macroscale transmission imaging of microvascular networks using KESM. Biomedical Optics Express 2011, 2(10):2888–2896. 10.1364/BOE.2.002888PubMed CentralView ArticlePubMedGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.