Statistical analysis of dendritic spine distributions in rat hippocampal cultures
 Aruna Jammalamadaka^{1}Email author,
 Sourav Banerjee^{2},
 Bangalore S Manjunath^{1} and
 Kenneth S Kosik^{2}Email author
DOI: 10.1186/1471210514287
© Jammalamadaka et al.; licensee BioMed Central Ltd. 2013
Received: 7 February 2013
Accepted: 16 September 2013
Published: 2 October 2013
Abstract
Background
Dendritic spines serve as key computational structures in brain plasticity. Much remains to be learned about their spatial and temporal distribution among neurons. Our aim in this study was to perform exploratory analyses based on the population distributions of dendritic spines with regard to their morphological characteristics and period of growth in dissociated hippocampal neurons. We fit a loglinear model to the contingency table of spine features such as spine type and distance from the soma to first determine which features were important in modeling the spines, as well as the relationships between such features. A multinomial logistic regression was then used to predict the spine types using the features suggested by the loglinear model, along with neighboring spine information. Finally, an important variant of Ripley’s Kfunction applicable to linear networks was used to study the spatial distribution of spines along dendrites.
Results
Our study indicated that in the culture system, (i) dendritic spine densities were "completely spatially random", (ii) spine type and distance from the soma were independent quantities, and most importantly, (iii) spines had a tendency to cluster with other spines of the same type.
Conclusions
Although these results may vary with other systems, our primary contribution is the set of statistical tools for morphological modeling of spines which can be used to assess neuronal cultures following gene manipulation such as RNAi, and to study induced pluripotent stem cells differentiated to neurons.
Keywords
Dendritic spines Rat hippocampal culture Linear network Kfunction Morphological modeling Spatial statistics Point processes Neuronal growthBackground
Spines are protrusions that occur on the dendrites of most mammalian neurons. They contain the postsynaptic apparatus and have a role in learning and memory storage. Spine distribution is a critically important question for multiple reasons. Changes in spine distributions and shape have been linked to neurological disorders such as Fragile X syndrome [1]. Spine distributions determine the extent to which the neuropil will be electrically sampled, i.e. dense distributions will sample the neural connectivity map more fully [2]. Furthermore, the nature of optimal sampling is unknown and likely depends on the surrounding anatomy and the total information content available to dendrites. Because pruning takes place during development in an activity dependent manner, spine distributions may reflect activity within neural circuits. Distributions of spine types are biologically important because the electrical properties of spines, such as the spine neck resistance, promote nonlinear dendritic processing and associated forms of plasticity and storage [3] to enhance the computational capabilities of neurons.
The shapes and types of dendritic spines contribute to synaptic plasticity. Because neighboring spines on the same short segment of dendrite can express a full range of structural dimensions, individual spines might act as separate computational units [4]. Nevertheless, the dendrite acts in a coordinated manner and thus the spatiotemporal distributions of different spine types is likely to be significant. Little is known about this population level organization of dendritic spines. Our aim was to perform an exploratory analysis of neuronal data from different time periods during the growth of rat dissociated hippocampal neurons, a wellestablished model system [5]. The observations here pertain only to the culture system and not necessarily to in vivo settings although the analytical tools used here could be adapted to in vivo analyses.
By quantifying populations of dendritic spines with automated tools at a global level, we were able to perform a much larger and more comprehensive analysis than most previous studies. Many studies only analyze a small region of interest on the largest dendrites, for example the 5075 μ m closest to the soma [6], or 10 μm segments [7], making it easier to measure manually the spine type counts and dimensions. Other works determine spine lengths and widths by manually drawing a line along the maximal length and measuring the length of that line [8], and therefore are only able to analyze a few neurons and a few hundred spines at a time.
In this study we determined the ratios of spine types along the dendrites as a function of time in culture, clustering or repulsion of spines in space, and how best to model spine type distributions. A model that fits the spatial distribution of spine types in healthy cultured neurons would be useful to assess neuronal cultures following gene manipulation such as RNAi and to study features of induced pluripotent stem cells differentiated to neurons.
LogLinear Models (LLM) and Multinomial Logistic Regressions (MLR) are two basic and essential statistical methods, and have an extensive history of being used in biological studies. However, these tools have not been used thus far in the analysis of spine distributions. We fit a loglinear model (LLM) to the contingency table of spine features to determine the dependence between spine types (mushroom, thin, and stubby), distance from the cell body along the dendrite (in micrometers), the branch order of the dendritic segment on which it lies (primary, secondary, tertiary, etc.), and the day in vitro (DIV) on which it was imaged. Once we determined which of these attributes contributed to the overall dendritic spine model, we then asked whether these attributes can predict the occurrence of spines and of spine types. To answer this question we used a Multinomial Logistic Regression (MLR) model, which predicted the spine type, using the attributes that were found to be important through the LLM and associated contingency tables.
Finally, to understand how the dendritic spine density varied over the length of the neuron or whether the appearance of spines was completely spatially random i.e uniformly distributed over the neurites, we made use of spatial point processes. Spatial point processes have been used before in biological studies to model the locations of entire neurons [911], locations of ants nests [12] or xylem conduits [13]. There have also been other more adhoc methods created to study the number of "clustered spines" on each dendritic segment in monkey brains, where a cluster is defined as a group of 3 or more spines [14] however we believe our use of the linear network Kfunction [15] is the first work to analyze the locations of dendritic spines and their clustering properties in such a principled manner. Our analysis indicated that the density of spines is generally completely spatially random (CSR) over the dendritic length probably due to the absence of instructive directional signals found an in vivo setting in which spine distributions are unlikely to be CSR.
Methods
Cell imaging
Dissociated hippocampal neurons from embryonic rat brains (E18) were plated onto polyllysine coated coverslips. Once neurons adhered to the coverslip, they were placed facedown on glial cells grown in vitro for 15 days. These neurons were a primary neuronal culture system, and no cell line was used. Neurons were grown for specific time periods up to 21 days in a neuronal medium containing B27. This coculture of neurons and glia mimic the physiological conditions of neuronal growth and development in mammalian brain [5]. Work with the neuronal cultures was approved by the UCSB animal care committee.
To fill the neuronal processes including dendritic spines Green Fluorescent Protein (GFP) was expressed from a plasmid containing the betaactin promoter (CAGGFP) [16]. Of this plasmid, 2 μ g was transfected into each coverslip containing about 50,000 neurons (including about 20% glial cells). Transfection was performed as described in the manufacturer’s protocol (Lipofectamine 2000 from Invitrogen) with minor changes. The transfection mix and neurons were incubated for two hours to avoid toxicity caused by lipo2000. Following transfection, coverslips were flipped back onto the glial dish, where they were originally cultured. GFPactin transfected into the neurons at DIV4 (Day In Vitro) and neurons were studied at three time points DIV7, 14 and 21. These time points survey the maturation period over which synapses and spines emerge [17]. Note that these were not the same neurons studied over time, but each time point represents a different population of neurons which were grown in culture up until the point of imaging. In this way our analysis represents a study at the population level. At each time point the number of images taken per plate depended on the transfection efficiency of that plate. On average approximately 1% of cells were transfected. The plating density was set so that neurons were relatively isolated in order to capture one neuron per image. An Olympus FluoView laser scanning confocal microscope was used. Image slices were 2048 by 2048 pixels at 154n m per pixel resolution. There were 733 zslices per stack depending on the depth of the neuron, taken at 200n m steps. This means that the stacks were 315.39 μm × 315.39 μm × 1.46.6 μm. The z dimension slices were used to capture each depth level at the optimal focus, however we cannot claim to have accurate volumetric information at this resolution. A 40X oil objective lens with no optical zoom was used. Numerical Aperture (NA) was 1.3, and illumination conditions were kept constant. Deconvolution of the raw data before processing was not necessary because the images were clear enough to manually annotate the neuron traces and manually edit all the spine detections and types as described in the following section. We performed three biological replicates, the results of which are detailed below.
Although there are other higher resolution, full volume methods, the analysis of this data is broadly applicable to imaged neurons in other systems [5]. We attempted to capture the entire neuron in each image, however because of limits in available imaging techniques we found that this does not always happen. In the cases where dendrites were truncated at the end of the image plane we assumed that the proportion of spines in the missing data was similar to what had already been observed, and therefore the resulting distributions did not change. We verified this assumption visually by taking tiled mosaics of a few neurons imaged in their entirety from each DIV and checking that the branch orders, distances to soma and spine type counts were unchanged as compared to those of the same DIV. There was an observed increase in the dendritic length truncated by the image plane as the DIV increased. However in our particular analyses the methods used, such as the LogLinear Model and Multinomial Logistic Regression, were focused on trends between spine characteristics such as distance to soma and type and these trends are innately unaffected by the truncation of dendrites given the above assumption. In addition, spatial point process analyses such as the linear network Kfunction always include the specification of an observation window [18], which in our case was the image plane. We verified (see Results and discussion section) that the overall spine density and the density of each spine type did not vary with distance from the soma so that we could assume spine density at the ends of the dendrites which were truncated was similar to the dendritic length which was observed. We recognize that we cannot see the proximity of labeled cells to other neurons which haven’t taken up the GFP labeling. These unlabeled neighboring neurons may cause some difference in spine distributions which we cannot quantify. For this reason we have attempted to quantify our biological findings statistically over entire experiments and DIV time points instead of by individual neurons, although in certain cases showing results from individual randomly sampled neurons was necessary.
Neuronal reconstruction
There exist many automated methods for studying neuronal growth and morphometry and therefore we present a brief review of available software for tracing dendrites and detecting and classifying spines. In particular, NeuronJ [19] is a widely used software; however it is only semi automatic and one must click several points to trace each neurite. The labeling is done manually and the statistics output only include lengths of neurites and not spine data. HCAVision [20] is a costly software with similar goals, however the parameters of the neurite tracing are set manually with a sliding bar and thus results require much handtuning. In addition, it is also focused on tracing neurites as opposed to spine analysis. For a full review of existing methods and softwares for neuron tracing and spine detection see [21]. We found NeuronStudio [2224] to be the most userfriendly, and for this reason we used it to annotate dendrites and spines for this analysis.
Relevant spine attributes output from the NeuronStudio software include branch order (BO), type (stubby, mushroom or thin), distance to soma along dendrite (SD), length (tip of spine to dendrite) and width at widest point (head diameter or HD). However since NeuronStudio uses the length and width of the spines to determine the spine type, we chose to make use of spine type and discard the other 2 measurements. NeuronStudio uses centrifugal labeling for branch orders, meaning it starts at 1 at the cell body and moves outwards, incrementing at every yshaped bifurcation regardless of the diameter of the daughter branches. Note that the entire image stack with zdimension information was loaded into NeuronStudio for the spine classification, and that the software has interpolation algorithms to estimate the spine type in 3D. For spine detection the default cutoffs were used, i.e. a required spine height between 0.23 μm, a maximum spine width of 3 μm, a minimum stubby size of 10 voxels (at the imaging resolution given above), a minimum nonstubby size of 5 voxels, and automatic zsmear compensation. For spine classification, the default settings were also used, i.e. a headtoneck ratio threshold of 1.1 μm, an aspect ratio (spine heighttowidth) threshold of 2.5 μm and a minimum mushroom head size of 0.35 μm. NeuronStudio delineates spine types by these 3 thresholds. It is generally known that mushroom spines have a large head and a narrow neck, thin spines have a small head and a narrow neck, and stubby spines display no obvious subdivision in head and neck. If the headtoneck ratio is above the threshold and the minimum mushroom head size is met, the spine is considered mushroom. If both the headtoneck and aspect ratios are lower than the respective thresholds then the spine is considered stubby. The remaining cases result in thin spines. For further information on NeuronStudio reconstruction, detection, and spine classification algorithms please refer to [22, 23]. In addition to the spine information, a trace file is output which labels the cell body, branch points and end points of the dendrites. The trace provides a skeletonization, or centerline, of the dendrite which we used to compute the linear network distances in the following analyses.
Loglinear model as a tool for exploring important features and their dependencies
To find the most influential attributes with regard to prediction and spatiotemporal modeling of spines we fit a loglinear model to the feature data, which is a type of generalized linear model [26]. The cooccurrence frequencies of the features in question are essentially a large multidimensional contingency table of counts. The standard linear models assume that data is normally distributed around a certain mean, which means that the observations can take any real value, positive, negative, integer or fractional. Loglinear models, on the other hand, assume that data is intrinsically nonnegative, typically counts that could be Poisson distributed, and allow us to model the association and interaction patterns among categorical variables. The attributes under consideration are BO, Type, SD and DIV. Again, since the type of spine was quite directly dependent on the length and the head diameter of the spine, we left these latter variables out of the modeling.
In order to analyze the data using a loglinear model, the various features must be in a categorical form or discretized. In an exploratory analysis such as this, one does not know what dependencies among features to expect; however we would like to note that these dependencies were not lost in the discretization process since trends in increasing and decreasing feature values would be preserved. To ensure that there were a reasonable number of observations at the higher branch orders, we pooled BO values of 5 or higher into a single category called "higherorder branches". We created a categorical variable to represent the continuous variable soma distance (SD) where categories were determined using the 4 quartiles of the SD spine data pooled over all 3 experiments. Specifically, SD values of less than 65.65 μm were classified into the first group, from this value to less than 108.99 μm the second, from this to less than 157.04 μm the third, and the rest (less than the most distal spine which lay at 413.25 μm from the cell body) fell in the fourth group. Binning the observed data for the continuous variables is the best way to get a general feel for how these quantities relate to each other. After this postprocessing of the data we arrived at 5 categories of branch order, 4 categories of soma distance, 3 spine types (mushroom, stubby, and thin), and 3 DIVs (7, 14, and 21 days).
is the maximized value of the likelihood function for the estimated Poisson model. In the above equations $\mathbf{x}={x}_{1},\dots ,{x}_{N}\in {\mathbb{R}}^{4}$ are the input vectors, θ = θ_{1}…θ_{ k } are the parameter values (one per term in eqn. 1), and $\mathbf{y}={y}_{1},\dots ,{y}_{N}\in \mathbb{R}$ is the output. The AIC is a commonly used goodnessoffit measure for a model given the observed data. Adding or subtracting terms, whether they be main effects, pairwise interactions, or up to 3way interactions between attributes, will change the AIC value for the model. A lower AIC criterion indicates a better fit to the data and therefore a better model. To compute the stepwise fit we used the R function 'step’. For more information on the stepwise fit algorithm as well as the AIC criterion we ask that the readers refer to the 'step’ function reference ([27], Chapter 6). We ran both of these LLM fitting procedures for all 3 experiments separately expecting to find general agreement between coefficients of the corresponding models created.
Multinomial logistic regression to predict spine type from neighbor types
In order to predict spine type we first determined which attributes contributed most to spine type prediction. Given the complexity of the multidimensional LLM and the various interactions and conditional frequencies that would impinge on this issue, we decided to determine these attributes by analyzing 2way contingency tables for spine type vs. SD, BO, DIV, as well as the spine types of the 3 nearest neighbors. This analysis helped us pick attributes that would be useful as the predictors in the multinomial logistic regression (MLR) [28] explained below.
When the response variable of a regression takes binary values "Logistic Regression" is used. This is an approach which uses a linear combination of the predictor variables to predict the logodds of a success (the "logit" of the probability). Since our response variable was spine type and it can take 3 values (mushroom, stubby or thin), we needed to use a "Multinomial Logistic Regression" (MLR) which attempts to model the probability of any of multiple possible outcomes. We did not use the attributes SD or BO as predictors variables since the results of both the LLM analysis and 2way contingency tables mentioned above told us that these quantities were not as relevant for spine type prediction. Therefore our model consisted of spine type as the output variable and the DIV, 1st, 2nd and 3rd nearest neighbor type along the dendrite as the predictor variables. We tried using only 1 or 2 nearest neighbors, however the results proved inconclusive because the prediction probabilities for each of the 3 types were predominantly close to 1/3. If we used more than the 3 nearest neighbors we sometimes ended up spanning a segment of dendrite which we did not consider to be "local", so we decided that 3 nearest neighbors provide the most useful information in the case of this study.
Suppose the output variable categories are denoted by 0,1,2 corresponding to mushroom, stubby or thin spines, with 0 being the reference category. If y_{ i } denotes the observed outcome of the output variable (spine type), and X_{ i } is the corresponding vector of the 3 neighbor types and DIV for the ith observation, one regression is run for the logit probability of each category k, with β_{ k } representing the vector of regression coefficients in the kth regression (eqns. 4,5). This is done for all but the reference category, whose probability is then obtained by subtracting all other probabilities from one (eqn. 6). Note that because the predictor variables were spine types, which were nominal as opposed to ordinal variables, the predictor variables X_{ i } must be represented with a "dummy coding". This means each neighbor type was represented by 2 predictor variables, where (1,0) corresponded to mushroom type, (0,1) corresponded to stubby type and (0,0) corresponded to thin type. This does not need to be done for the output variable y. With the addition of the DIV, which does not have to be dummy coded since it is an ordinal variable, this made each X_{ i } vector of length 7.
so that the beta coefficients represent the change in the log odds of the dependent variable being in a particular category with respect to the reference category i.e. the thin type, for a unit change of the corresponding independent variable. To check if the models created from all three experiments were in agreement, we ran the MLR separately for each experiment.
To satisfy one of the major assumptions of this analysis, namely that the data must be a set of independent observations, we took 200 randomly sampled spines of each type from each experiment (600 spines per experiment total) to use for the parameter estimation. We chose to select equal proportions of each spine type in order to remove any bias in the model towards the less frequent thin spines, and 200 was the largest number we could justify using since there were only 649 thin spines in experiment 3. We verified that these randomly sampled spines did not lie within 10 μm of the image border so that we were fairly certain their nearest neighbors did not fall outside of the image plane. Note that due to the tortuosity of the dendritic structure this did not mean that our sample was necessarily biased towards spines which were proximal to the soma. We did not verify explicitly that the sampled spines were not neighbors of each other, since we assumed that the variation captured by the random sampling was enough to ensure some level of independence. The idea was to aim for an independent set of observations which represented the entire "population" of spines in that experiment. To be clear we used all 30,285 spines for the LLM model and Kfunction analysis, only the MLR model required random sampling since we were using neighbor information which would have been redundant if we considered every spine.
To verify that the prediction of spine type provided by the MLR was better than what we would get purely by their relative abundance i.e. without neighboring spine type information, we computed something similar to a "Bayes Factor" [30]. Bayes factor is a method of choosing between two models on the basis of the observed data. In our case, the first prediction model was simply the prior global probability of finding a given spine type based on its frequency in the particular experiment under consideration. The second model was the MLR prediction model using the neighbor type information. We computed P(Y = iX)/P(Y = i) and reasoned that values considerably larger than one indicated the neighboring spine type information was helpful in the prediction of the central spine type.
Linear network Kfunction as a tool for testing spatial point patterns
Originally proposed by Ripley in 1981 [31], the purpose of the Kfunction is to estimate whether or not there is clustering or repulsion present in a given spatial point process. The common null hypothesis is that the points within the observation window are distributed as a homogeneous Poisson process, which is also termed "completely spatially random" or CSR. This means that the density of points does not vary depending on the spatial parameters i.e. x and y in the 2D Euclidean case, or the location along the dendritic network in our case. In order to determine if this is a valid null hypothesis for our data, we created QQ plots [32] for individual dendrites which compared the quantiles of the SD values of observed spines to the theoretical quantiles for the CSR case. If the two distributions (observed and CSR) being compared were similar, the points in the QQ plot would approximately lie on the line y = x. In order to create the theoretical quantiles it is necessary to know the values of SD at any location on the given network, not just at the spine locations. Once we have this we can partition the network into epsilon small segments and assign each segment a value 1 if it contains a spine and 0 otherwise based on the CSR assumptions. We did this using code provided to us by Adrian Baddeley and Gopal Nair at the Commonwealth Scientific and Industrial Research Organization (CSIRO), Australia.
where I stands for the indicator function, and d_{ ij } stands for the Euclidean distance between two points p_{ i } and p_{ j }. In the above equation, we see that the expectation is normalized by 1/λ since $\lambda =\frac{N}{A}$, so we infer that theoretically K(t) = π t^{2} implies spatial independence of points, or a CSR point process. Therefore, if K(t) is the theoretical CSR value of the function and $\hat{K}(t)$ is the observed function, then $\hat{K}(t)>K(t)$ implies clustering between points and $\hat{K}(t)<K(t)$ implies repulsion. It is possible to extend this function to multitype point patterns (i.e. to find clustering or repulsion between specific spine types) or to higher dimensional data (i.e. spacetime, or 3D Euclidean space).
where m(i,d_{ ij }) is the number of points of L lying at the exact distance t away from the point i measured by the shortest path. That is, the contribution to the function from each pair of points (i,j) is weighted by the reciprocal of the number of points that are situated at the same distance from i as j is. As a result, the theoretical CSR case is simply K(t) = t for all 0 ≤ t < T. This enables direct comparison of tvalues across dendrites, as we will see in the results section.
Simulations and qvalues
One method for obtaining a distribution of d proposed by Diggle [36] is to bootstrap the residuals, or differences between the observed and theoretical values. However a more heuristic and intuitive way is to simulate the CSR case for each dendrite, compute the Kfunction for each of these simulations, and find the simulated distribution of our test statistic. We then found the pvalue of the observed difference d from this simulated distribution.
Specifically, we carried out 1000 CSR simulations for each dendrite by placing uniform points on a line [0,ℓ_{ T }], and mapping them to that specific dendrite’s linear network structure. The number of points simulated per dendrite equaled the number of observed spines for that dendrite, thus preserving the overall density λ. This means the same number of spines that existed on each dendrite were randomly placed along the linear network specific to that dendrite. We used these simulations to obtain 1000 values of the summary statistic, say d[i]. Then the pvalue for each dendrite was simply the proportion of simulated values that fell above the observed or experimental value of d, i.e. the rank of this d within the 1000 values of d[i], or nrank/(nsim + 1).
This pvalue approach is similar to the test which rejects the null hypothesis if the graph of the observed Kfunction lies outside the "pointwise simulation envelope" at any value of t. A simulation envelope is essentially a graphical measure of how far a function can deviate from the theoretical value without being considered significant at a given level. As mentioned above in our case the envelope is calculated by first creating the 1000 CSR simulations of a point pattern on a given dendritic network with the same observed network intensity, then calculating the linear Kfunction for each of these 1000 simulations. To perform a twosided significance test at the 10% level, the 5% and 95% percentiles are then calculated based off the 50 lowest and 50 highest linear Kfunction values per tvalue, hence the term "pointwise". Plotting these values as a function of t gives one a visual idea of the spread that is produced by chance mechanisms alone. If the observed Kfunction for a given tvalue does not fall outside these percentiles, it is considered insignificant for that tvalue at the 10% significance level. We make use of the R package 'spatstat’ [18] for obtaining the pointwise simulation envelope.
Controlling the overall FDR, or expected proportion of incorrectly rejected null hypotheses termed "false discoveries", is a statistical method commonly used in multiple hypothesis testing which increases the statistical power of each test. What is more general and useful however, is a testspecific FDR measure. This essentially allows us to look at all possible significance thresholds at once, as well as provide each test with a measure of significance that can be easily interpreted. This is accomplished by calculating an analogue of the pvalue for each test called a "qvalue" [38]. A pvalue of 0.05 implies that 5% of all tests will result in false positives, whereas a qvalue of 0.05 implies that 5% of significant tests will result in false positives. Since the latter is clearly a far smaller quantity, qvalues generally indicate fewer significant tests than pvalues for a given significance threshold and provide a far more accurate indication of the level of false positives in the case of multiple hypothesis testing. For qvalue estimation we used the qvalue package available from [39].
Results and discussion
Data analyzed
Number of neurons collected per experiment and DIV
EXP  DIV7  DIV14  DIV21 

1  8  9  7 
2  10  10  10 
3  7  7  7 
Number of each type of spine per experiment
EXP  Mushroom  Stubby  Thin 

1  4035  3224  1915 
2  5400  6619  2570 
3  2388  3485  649 
We believe that the large experimental variation between spine type proportions and counts in each experiment was a positive result because this meant that statistical agreement across all 3 experiments relating to spine type clustering and density estimation carries heavier weight than if the 3 experiments were more uniform in these quantities, or if we had pooled data from all 3 experiments together. Also, if all 3 experiments were unusually homogeneous there could be a possibility that it is a result of our specific culturing, imaging or spine extraction methods rather than a true representation of the underlying biological process. The various biological systems to which these techniques will be applied will certainly have this type of variability.
Spine type is independent of distance from soma
EXP 1 stepwise final model: freq ∼ div + type + bo + sd + bo ·sd + div ·bo + div ·type + div ·sd + type ·bo + type ·sd, AIC = 1557.05
Df  Deviance  AIC  

None  530.4  1557.1  
Omit type ·sd term  6  545.0  1559.6 
Omit type ·bo term  8  558.8  1569.4 
Omit div ·sd term  6  569.6  1584.2 
Omit div ·type term  4  648.0  1666.6 
Omit div ·bo term  8  1324.1  2334.7 
Omit bo ·sd term  12  4142.4  5145.0 
EXP 2 stepwise final model: freq ∼ div + type + bo + sd + bo ·sd + div ·bo + div ·sd + div ·type, AIC = 1243.13
Df  Deviance  AIC  

None  470.2  1243.1  
Add type ·sd term  6  461.3  1246.3 
Add type ·bo term  8  465.5  1254.4 
Omit div ·type term  4  610.4  1375.3 
Omit div ·sd term  6  696.0  1456.9 
Omit div ·bo term  8  906.5  1663.5 
Omit bo ·sd term  12  5208.2  5957.1 
EXP 3 stepwise final model: freq ∼ div + type + bo + sd + bo ·sd + div ·sd + div ·type + div ·bo + type ·sd + type ·bo, AIC = 1441.29
Df  Deviance  AIC  

None  482.24  1441.3  
Omit type ·bo term  8  522.95  1466.0 
Omit type ·sd term  6  542.08  1489.1 
Omit div ·bo term  8  606.34  1549.4 
Omit div ·type term  4  630.62  1581.7 
Omit div ·sd term  6  715.38  1662.4 
Omit bo ·sd term  12  2825.69  3760.7 
Despite the fact that they were included in the final stepwise fit model for experiments 1 and 3, the AIC values in Tables 3, 4 and 5 show that in all 3 experiments the interaction between spine type and soma distance ("type ·sd") as well as spine type and branch order ("type ·bo") were the least important in modeling the overall frequency table of occurrences. This implies that the correlation between these quantities was not very high, therefore we reason that it was not necessary to use either SD or BO to predict the spine type in the MLR created in the following section. We also noticed that the term marking the interaction between BO and SD was the most important pairwise term in all stepwise fit models. It is expected that BO and SD are correlated because both necessarily increase as we move away from the cell body. Indeed, running a 2way Chisquare test on the contingency table of the discretized versions of these variables showed us high dependence, χ^{2}(d f = 12, N = 30285) = 11635.19,p<0.0001. We also saw a high level of dependence between DIV and SD (χ^{2}(d f = 6, N = 30285) = 681.76, p < 0.0001) and between DIV and BO (χ^{2}(d f = 8, N = 30285) = 1604.75, p < 0.0001). This was intuitive as well since we expect both BO and SD to generally increase with DIV.
It is possible that the Type vs. SD relationship could have also been estimated using a Sholltype analysis ([42]) where we count the number of each type occurring within concentric circles from the soma and verify that it is constant, however this would not necessarily produce the same results. The crucial difference between our approach and the Sholl approach is that in our approach the "distance from soma measures" the actual distance along the centerline of the dendrite instead of the radial distance from the cell center. This is especially important for dendrites with high tortuosity (which we find prevalent in our data), since the radial distance in those cases will not correspond to the dendritic distance from the cell body. Many studies of cultured neurons use Sholl analysis, however they use it in its original form for counting dendritic intersections and do not comment on the relation to spine density or type. To our knowledge this is the first study to quantify the spine density vs. distance to the soma in dissociated neuronal cultures.
Threeway and 4way interactions are generally known to be weak (not as explanatory as the main effects and 2nd order interactions) and difficult to interpret, however in the interest of exploring all possibilities we computed the maximum likelihood fit using all 4 attributes as well as a stepwise fit model which allows for 3way interactions between attributes. The table presented in Additional file 1 results from the LLM which models all possible interactions of all 4 attributes, i.e. up to the fourth order. The coefficients presented in the table are those which were significant at the 0.1% level, and the corresponding pvalues are shown in the last column. The table contains the interactions which were more important to the model, and shows that of these interactions only one (highlighted in green) between type and either BO or SD, was shown as being significant over all experiments. This verifies once again that neither BO nor SD were highly correlated with the spine type. In addition to this, the stepwise fit models in Additional file 2 show that if we did allow 3rd order interactions, the strongest 3rd order correlation over all experiments was that of DIV, SD and BO, again affirming that all 3 of these quantities should intuitively increase together.
Spines tend to cluster with other spines of the same type
Chisquare results for spine type vs. other attributes
EXP1, N = 9174  EXP2, N = 14589  EXP3, N = 6522  

Type ·SD, d f = 6  χ^{2} = 9.13,p = 0.1665  χ^{2} = 33.64,p < 0.0001  χ^{2} = 25.08,p = 0.0003302 
Type ·BO, d f = 8  χ^{2} = 29.02,p = 0.0003147  χ^{2} = 12.39,p = 0.1348  χ^{2} = 26.53,p = 0.0008516 
Type ·DIV, d f = 4  χ^{2} = 119.78,p < 0.0001  χ^{2} = 358.25,p < 0.0001  χ^{2} = 139.28,p < 0.0001 
Type ·N1, d f = 4  χ^{2} = 225.93,p < 0.0001  χ^{2} = 212.87,p < 0.0001  χ^{2} = 246.74,p < 0.0001 
Type ·N2, d f = 4  χ^{2} = 163.67,p < 0.0001  χ^{2} = 226.31,p < 0.0001  χ^{2} = 127.91,p < 0.0001 
Type ·N3, d f = 4  χ^{2} = 90.33,p < 0.0001  χ^{2} = 153.11,p < 0.0001  χ^{2} = 131.96,p < 0.0001 
MLR beta coefficients for all 3 experiments
EXP1  (Intercept)  N1Var1  N1Var2  N2Var1  N2Var2  N3Var1  N3Var2  DIV 

Stubby  0.06  0.04  0.47  0.52  0.10  0.09  0.25  0.01 
Thin  1.05  0.57  0.34  0.84  0.57  0.23  0.32  0.00 
EXP2  (Intercept)  N1Var1  N1Var2  N2Var1  N2Var2  N3Var1  N3Var2  DIV 
Stubby  0.08  0.03  0.67  0.14  0.05  0.20  0.09  0.02 
Thin  0.25  0.76  0.17  0.61  0.37  0.06  0.05  0.02 
EXP3  (Intercept)  N1Var1  N1Var2  N2Var1  N2Var2  N3Var1  N3Var2  DIV 
Stubby  0.36  0.24  0.33  0.14  0.19  0.03  0.30  0.01 
Thin  0.35  0.66  0.58  0.33  0.28  0.25  0.33  0.02 
Prediction Probabilities: N1 = mushroom, N2 = mushroom, N3 = mushroom
DIV7  EXP  P(mushroom)  P(stubby)  P(thin) 

1  0.45*  0.30  0.25  
2  0.51*  0.35  0.13  
3  0.54*  0.27  0.20  
DIV14  EXP  P(mushroom)  P(stubby)  P(thin) 
1  0.45*  0.28  0.26  
2  0.55*  0.33  0.12  
3  0.54*  0.28  0.18  
DIV21  EXP  P(mushroom)  P(stubby)  P(thin) 
1  0.46*  0.27  0.27  
2  0.59*  0.30  0.11  
3  0.54*  0.30  0.16 
Prediction probabilities: N1 = stubby, N2 = stubby, N3 = stubby
DIV7  EXP  P(mushroom)  P(stubby)  P(thin) 

1  0.24  0.55*  0.21  
2  0.30  0.52*  0.18  
3  0.32  0.55*  0.12  
DIV14  EXP  P(mushroom)  P(stubby)  P(thin) 
1  0.25  0.53*  0.22  
2  0.33  0.50*  0.17  
3  0.32  0.58*  0.11  
DIV21  EXP  P(mushroom)  P(stubby)  P(thin) 
1  0.26  0.51*  0.23  
2  0.37  0.47*  0.16  
3  0.31  0.60*  0.09 
Prediction Probabilities: N1 = thin, N2 = thin, N3 = thin
DIV7  EXP  P(mushroom)  P(stubby)  P(thin) 

1  0.20  0.20  0.60*  
2  0.33  0.31  0.36*  
3  0.33  0.25  0.42*  
DIV14  EXP  P(mushroom)  P(stubby)  P(thin) 
1  0.20  0.19  0.61*  
2  0.37*  0.30  0.34  
3  0.34  0.27  0.39*  
DIV21  EXP  P(mushroom)  P(stubby)  P(thin) 
1  0.20  0.17  0.62*  
2  0.41*  0.28  0.32  
3  0.34  0.30  0.36* 
Bayes factors
BF(mushroom): N1 = mushroom, N2 = mushroom, N3 = mushroom  

EXP  DIV7  DIV14  DIV21 
1  1.02  1.03  1.05 
2  1.39  1.49  1.60 
3  1.47  1.47  1.47 
BF(stubby): N1 = stubby, N2 = stubby, N3 = stubby  
EXP  DIV7  DIV14  DIV21 
1  1.56  1.50  1.44 
2  1.15  1.10  1.05 
3  1.03  1.08  1.12 
BF(thin): N1 = thin, N2 = thin, N3 = thin  
EXP  DIV7  DIV14  DIV21 
1  2.85  2.91  2.98 
2  2.04  1.92  1.79 
3  4.22  3.87  3.54 
Dendritic spine densities are completely spatially random
Conclusions
The models used in this work allow spatial prediction of spine types, which has not previously been studied. The conclusions presented here relate to qualities of neurons in dissociated culture. We acknowledge that some of these results will most likely not hold for in vivo settings due to neuronal interactions not modeled here, but maintain that the statistical methods used here will be useful and easily applicable. Specifically, we found here that spine type and density are not dependent on the distance from the cell body, and these observations are likely to change for in vitro slices or microinjection of fixed brain tissue.
We also note that we chose not to deconvolve our data because of its high contrast. We acknowledge that this choice may have precluded the image analysis software from detecting some stubby spines among the halo of the bright dendrites, but we do not feel this significantly impacted our results. As a partial compensation for this effect we used NeuronStudio’s inbuilt automatic zsmear compensation, and for more details on this we refer the reader to [22, 23].
Although in this study the spine distributions seemed to be completely spatially random it is possible that we will find studies using different neuronal types and treatments where this is not true. In these cases, where spine density may vary with distance from the cell body, it would be interesting to test for inhomogeneous patterns of points such as the hard core Strauss Process used in [43]. We could also place an exponentially decaying function to model the interaction between spine types within a certain radius or experiment with other pairwise interaction functions such as those used by Diggle, Gates and Stibbard [44] or Diggle and Gratton in [45].
We find it an interesting result that spines were not spatially clustered when type was disregarded, as shown by the linear network Kfunction analysis, however spine types do tend to group together as shown by the MLR analysis. We would like to note that these results are not contradictory because they are in fact measuring different quantities. The MLR results tells us that, regardless of their densities along the dendrite, if we have a spine which is of a given type, its 3 nearest neighbors are likely to be of the same type. The Kfunction, on the other hand, tells us that regardless of type the spines’ locations along the dendritic network are spatially random. These two results provide complementary information and together could aid us in future modeling tasks such as simulation of neuronal growth. For example, we could first place spines uniformly along the dendritic network, and then decide the types of those spines based on the type of information given by the MLR model. As future work we plan to analyze the network cross Kfunction [15] of the dendritic network, which models the spine distribution as a multitype point process and therefore provides information about repulsion and clustering of each spine type with each other spine type, modeling both density and type simultaneously.
Generally previous studies such as [4649] have relied on physiology or biochemical markers to validate their neuronal properties. The quantitative morphological features described here provide an additional phenotypic dimension for these analyses. Likewise these approaches can be applied to phenotypic analyses of neuronal cultures following overexpression or suppression of specific genes to capture their effect on a complex phenotype. As mentioned in the Introduction section, the only other study we are aware of which analyzes clustering of dendritic spines in monkey brains is [14]. The authors of this work study the number of "clustered spines" on each dendritic segment, where a cluster is defined as a group of 3 or more spines. The method used here defines clustering as a statistically significant positive deviation in the linear Kfunction from the theoretical value of the spatially random linear Kfunction. We believe our method to be more principled and our results easier to interpret than those of [14] due to the more formal statistical definition of clustering.
We chose to use dissociated hippocampal cultures because they are widely used and they allow us to perform an indepth and automated analysis with larger spine populations than most previous studies. These approaches will be important in assessing features of neurons derived from human induced pluripotent stem cells which have so far not been characterized by detailed morphological features. Our paper utilized a highly simplified neuronal culture system to develop the statistical and computational tools for more advanced in vivo studies needed to address the aforementioned bigger biological questions. Our overall hypothesis was that we can utilize imaging and statistical analyses to capture features of spine distributions that can be used for testing hypotheses in invivo settings. Indeed, we have been conservative about hypotheses and findings concerning spine type clustering because any conclusions we might reach on the specifics of spine distribution would be limited to the neuronal culture system we studied.
Availability of supporting data
All the image stacks and NeuronStudio annotation files supporting the results of this article are available in the BISQUE repository, http://bisque.ece.ucsb.edu/client_service/view?resource=http://bisque.ece.ucsb.edu/data_service/dataset/2653471.
Abbreviations
 DIV:

Day in vitro
 LLM:

Loglinear model
 SD:

Distance from the soma to the spine along the centerline of the dendrite
 BO:

Branch order of the dendritic segment on which a spine lies
 N1, N2, N3:

The spine types of the 1st, 2nd, and 3rd nearest neighbors along the dendrite, respectively
 MLR:

Multinomial logistic regression
 CSR:

Completely spatially random
 FDR:

False discovery rate
 MAD:

Maximum absolute deviation.
Declarations
Acknowledgements
We would like to acknowledge Sruti Aiyaswamy for her help with the neuron annotation, and the StatLab at UCSB for checking over the statistical analyses included here. This work was supported by the Larry L. Hillblom Foundation, the Tau Consortium and an NSF award III0808772.
Authors’ Affiliations
References
 Irwin S, Patel B, Idupulapati M, Harris J, Crisostomo R, Larsen B, Kooy F, Willems P, Cras P, Kozlowski P, et al: Abnormal dendritic spine characteristics in the temporal and visual cortices of patients with fragileX syndrome: A quantitative examination. Am J Med Genet. 2001, 98 (2): 161167. 10.1002/10968628(20010115)98:2<161::AIDAJMG1025>3.0.CO;2B.View ArticlePubMed
 Yuste R: Dendritic spines and distributed circuits. Neuron. 2011, 71 (5): 772781. 10.1016/j.neuron.2011.07.024.PubMed CentralView ArticlePubMed
 Harnett M, Makara J, Spruston N, Kath W, Magee J: Synaptic amplification by dendritic spines enhances input cooperativity. Nature. 2012, 491: 599602. 10.1038/nature11554.PubMed CentralView ArticlePubMed
 Harris K, Kater S: Dendritic spines: cellular specializations imparting both stability and flexibility to synaptic function. Annu Rev Neurosci. 1994, 17: 341371. 10.1146/annurev.ne.17.030194.002013.View ArticlePubMed
 Kaech S, Banker G: Culturing hippocampal neurons. Nature Protoc. 2007, 1 (5): 24062415.View Article
 Mukai J, Dhilla A, Drew L, Stark K, Cao L, MacDermott A, Karayiorgou M, Gogos J: Palmitoylationdependent neurodevelopmental deficits in a mouse model of 22q11 microdeletion. Nat Neurosci. 2008, 11 (11): 13021310. 10.1038/nn.2204.PubMed CentralView ArticlePubMed
 Cheetham C, Hammond M, McFarlane R, Finnerty G: Altered sensory experience induces targeted rewiring of local excitatory connections in mature neocortex. J Neurosci. 2008, 28 (37): 92499260. 10.1523/JNEUROSCI.297408.2008.PubMed CentralView ArticlePubMed
 Schratt G, Tuebing F, Nigh E, Kane C, Sabatini M, Kiebler M, Greenberg M: A brainspecific microRNA regulates dendritic spine development. Nature. 2006, 439 (7074): 283289. 10.1038/nature04367.View ArticlePubMed
 Mamaghani MJ, Andersson M, Krieger P: Spatial point pattern analysis of neurons using Ripley’s Kfunction in 3D. Front Neuroinformatics. 2010, 4 (0): 110.
 Bell M, Grunwald G: Mixed models for the analysis of replicated spatial point patterns. Biostatistics. 2004, 5 (4): 633648. 10.1093/biostatistics/kxh014.View ArticlePubMed
 Millet L, Collens M, Perry G, Bashir R: Pattern analysis and spatial distribution of neurons in culture. Integr Biol. 2011, 3 (12): 11671178. 10.1039/c1ib00054c.View Article
 Harkness R, Isham V: A bivariate spatial point pattern of ants’ nests. Appl Stat. 1983, 32 (3): 293303. 10.2307/2347952.View Article
 Mencuccini M, MartinezVilalta J, Piñol J, Loepfe L, Burnat M, Alvarez X, Camacho J, Gil D: A quantitative and statistically robust method for the determination of xylem conduit spatial distribution. Am J Bot. 2010, 97 (8): 12471259. 10.3732/ajb.0900289.View ArticlePubMed
 Yadav A, Gao Y, Rodriguez A, Dickstein D, Wearne S, Luebke J, Hof P, Weaver C: Morphologic evidence for spatially clustered spines in apical dendrites of monkey neocortical pyramidal cells. J Comp Neurol. 2012, 520: 28882902. 10.1002/cne.23070.PubMed CentralView ArticlePubMed
 Okabe A, Yamada I: The Kfunction method on a network and its computational implementation. Geogr Anal. 2001, 33 (3): 271290.View Article
 Zhao C, Teng E, Summers R, Ming G, Gage F: Distinct morphological stages of dentate granule neuron maturation in the adult mouse hippocampus. J Neurosci. 2006, 26: 311. 10.1523/JNEUROSCI.364805.2006.View ArticlePubMed
 Banker G, Goslin K: Culturing Nerve Cells. 1998, Cambridge, MA USA: MIT press
 Baddeley A, Turner R: Spatstat: an R package for analyzing spatial point patterns. J Stat Softw. 2005, 12 (6): 142. [http://www.jstatsoft.org, ISSN: 15487660],View Article
 Meijering E, Jacob M, Sarria JCF, Steiner P, Hirling H, Unser M: Design and validation of a tool for neurite tracing and analysis in fluorescence microscopy images. Cytom Part A. 2004, 58 (2): 167176.View Article
 Vallotton P, Lagerstrom R, Sun C, Buckley M, Wang D, Silva MD, Tan SS, Gunnersen J: Automated analysis of neurite branching in cultured cortical neurons using HCAvision. Cytom Part A. 2007, 71 (10): 889895.View Article
 Meijering E: Neuron tracing in perspective. Cytom Part A. 2010, 77 (7): 693704.View Article
 Rodriguez A, Ehlenberger D, Dickstein D, Hof P, Wearne S: Automated threedimensional detection and shape classification of dendritic spines from fluorescence microscopy images. PLoS ONE. 2008, 3 (4): e199710.1371/journal.pone.0001997. doi:10.1371/journal.pone.0001997.PubMed CentralView ArticlePubMed
 Wearne S, Rodriguez A, Ehlenberger D, Rocher A, Hendersion S, Hof P: New Techniques for imaging, digitization and analysis of threedimensional neural morphology on multiple scales. Neuroscience. 2005, 136: 661680. 10.1016/j.neuroscience.2005.05.053.View ArticlePubMed
 Dumitriu D, Rodriguez A, Morrison J: Highthroughput, detailed, cellspecific neuroanatomy of dendritic spines using microinjection and confocal microscopy. Nat Protoc. 2011, 6 (9): 13911411. 10.1038/nprot.2011.389.PubMed CentralView ArticlePubMed
 The DIADEM Scientific Challenge. http://diademchallenge.org/. [Accessed: 30/09/2012].
 McCullagh P, Nelder J: Generalized Linear Models. 1989, London: Chapman & Hall/CRCView Article
 Chambers J, Hastie T, et al: Statistical Models in S. 1992, London: Chapman & Hall
 Hilbe J: Logistic Regression Models. 2009, London: CRC Press
 Venables WN, Ripley BD: Modern Applied Statistics With S, fourth edition. 2002, New York: Springer, http://www.stats.ox.ac.uk/pub/MASS4. [ISBN 0387954570].View Article
 Kass R, Raftery A: Bayes factors. J Am Stat Assoc. 1995, 90 (430): 773795. 10.1080/01621459.1995.10476572.View Article
 Ripley B: Spatial Statistics, Volume 24. 1981, New York, NY, USA: Wiley Online LibraryView Article
 Wilk M, Gnanadesikan R: Probability plotting methods for the analysis for the analysis of data. Biometrika. 1968, 55: 117.PubMed
 Govindarajan A, Israely I, Huang SY, Tonegawa S: The dendritic branch is the preferred integrative unit for protein synthesisdependent LTP. Neuron. 2011, 69: 132146. 10.1016/j.neuron.2010.12.008.PubMed CentralView ArticlePubMed
 Harvey CD, Svoboda K: Locally dynamic synaptic learning rules in pyramidal neuron dendrites. Nature. 2007, 450 (7173): 11951200. 10.1038/nature06416.PubMed CentralView ArticlePubMed
 Ang Q, Baddeley A, Nair G: Geometrically corrected second order analysis of events on a linear network, with applications to ecology and criminology. Scand J Stat. 2011, 39: 591617.View Article
 Diggle PJ: Statistical Analysis of Spatial Point Patterns. 2003, New York: Oxford University Press Inc.
 Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodological). 1995, 57 (1): 289300.
 Storey J: A direct approach to false discovery rates. J R Stat Soc: Ser B (Statistical Methodology). 2002, 64 (3): 479498. 10.1111/14679868.00346.View Article
 Dabney A, Storey JD, Warnes GR: qvalue: Qvalue Estimation for False Discovery Rate Control. http://bioconductor.org/packages/2.12/bioc/html/qvalue.html [R package version 1.28.0].
 Kvilekval K, Fedorov D, Obara B, Singh A, Manjunath B: Bisque: a platform for bioimage analysis and management. Bioinformatics. 2010, 26 (4): 544552. 10.1093/bioinformatics/btp699. http://vision.ece.ucsb.edu/publications/kvilekval_Bioinformatics_2010.pdf,View ArticlePubMed
 Ascoli G, Donohue D, Halavi M: NeuroMorpho. Org: a central resource for neuronal morphologies. J Neurosci. 2007, 27 (35): 92479251. 10.1523/JNEUROSCI.205507.2007.View ArticlePubMed
 Sholl D: Dendritic organization in the neurons of the visual and motor cortices of the cat. J Anat. 1953, 87 (Pt 4): 387PubMed CentralPubMed
 Baddeley A, Turner R: Practical maximum pseudolikelihood for spatial point patterns. Aust N Z J Stat. 2000, 42 (3): 283322. 10.1111/1467842X.00128.View Article
 Diggle P, Gates D, Stibbard A: A nonparametric estimator for pairwiseinteraction point processes. Biometrika. 1987, 74 (4): 763770. 10.1093/biomet/74.4.763.View Article
 Diggle P, Gratton R: Monte Carlo methods of inference for implicit statistical models. J R Stat Soc Ser B (Methodological). 1984, 46 (2): 193227.
 Qiang L, Fujita R, Yamashita T, Angulo S, Rhinn H, Rhee D, Doege C, Chau L, Aubry L, Vanti W, et al: Directed conversion of Alzheimer’s disease patient skin fibroblasts into functional neurons. Cell. 2011, 146 (3): 359371. 10.1016/j.cell.2011.07.007.PubMed CentralView ArticlePubMed
 Brennand K, Simone A, Jou J, GelboinBurkhart C, Tran N, Sangar S, Li Y, Mu Y, Chen G, Yu D, et al: Modelling schizophrenia using human induced pluripotent stem cells. Nature. 2011, 473 (7346): 221225. 10.1038/nature09915.PubMed CentralView ArticlePubMed
 Egawa N, Kitaoka S, Tsukita K, Naitoh M, Takahashi K, Yamamoto T, Adachi F, Kondo T, Okita K, Asaka I, et al: Drug screening for ALS using patientspecific induced pluripotent stem cells. Sci Transl Med. 2012, 4 (145): 145ra104145ra104. 10.1126/scitranslmed.3004052.PubMed
 Marchetto M, Gage F: Modeling brain disease in a dish: really?. Cell Stem Cell. 2012, 10 (6): 642645. 10.1016/j.stem.2012.05.008.View ArticlePubMed
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.