 Research article
 Open Access
 Published:
Statistical analysis of dendritic spine distributions in rat hippocampal cultures
BMC Bioinformatics volume 14, Article number: 287 (2013)
Abstract
Background
Dendritic spines serve as key computational structures in brain plasticity. Much remains to be learned about their spatial and temporal distribution among neurons. Our aim in this study was to perform exploratory analyses based on the population distributions of dendritic spines with regard to their morphological characteristics and period of growth in dissociated hippocampal neurons. We fit a loglinear model to the contingency table of spine features such as spine type and distance from the soma to first determine which features were important in modeling the spines, as well as the relationships between such features. A multinomial logistic regression was then used to predict the spine types using the features suggested by the loglinear model, along with neighboring spine information. Finally, an important variant of Ripley’s Kfunction applicable to linear networks was used to study the spatial distribution of spines along dendrites.
Results
Our study indicated that in the culture system, (i) dendritic spine densities were "completely spatially random", (ii) spine type and distance from the soma were independent quantities, and most importantly, (iii) spines had a tendency to cluster with other spines of the same type.
Conclusions
Although these results may vary with other systems, our primary contribution is the set of statistical tools for morphological modeling of spines which can be used to assess neuronal cultures following gene manipulation such as RNAi, and to study induced pluripotent stem cells differentiated to neurons.
Background
Spines are protrusions that occur on the dendrites of most mammalian neurons. They contain the postsynaptic apparatus and have a role in learning and memory storage. Spine distribution is a critically important question for multiple reasons. Changes in spine distributions and shape have been linked to neurological disorders such as Fragile X syndrome [1]. Spine distributions determine the extent to which the neuropil will be electrically sampled, i.e. dense distributions will sample the neural connectivity map more fully [2]. Furthermore, the nature of optimal sampling is unknown and likely depends on the surrounding anatomy and the total information content available to dendrites. Because pruning takes place during development in an activity dependent manner, spine distributions may reflect activity within neural circuits. Distributions of spine types are biologically important because the electrical properties of spines, such as the spine neck resistance, promote nonlinear dendritic processing and associated forms of plasticity and storage [3] to enhance the computational capabilities of neurons.
The shapes and types of dendritic spines contribute to synaptic plasticity. Because neighboring spines on the same short segment of dendrite can express a full range of structural dimensions, individual spines might act as separate computational units [4]. Nevertheless, the dendrite acts in a coordinated manner and thus the spatiotemporal distributions of different spine types is likely to be significant. Little is known about this population level organization of dendritic spines. Our aim was to perform an exploratory analysis of neuronal data from different time periods during the growth of rat dissociated hippocampal neurons, a wellestablished model system [5]. The observations here pertain only to the culture system and not necessarily to in vivo settings although the analytical tools used here could be adapted to in vivo analyses.
By quantifying populations of dendritic spines with automated tools at a global level, we were able to perform a much larger and more comprehensive analysis than most previous studies. Many studies only analyze a small region of interest on the largest dendrites, for example the 5075 μ m closest to the soma [6], or 10 μm segments [7], making it easier to measure manually the spine type counts and dimensions. Other works determine spine lengths and widths by manually drawing a line along the maximal length and measuring the length of that line [8], and therefore are only able to analyze a few neurons and a few hundred spines at a time.
In this study we determined the ratios of spine types along the dendrites as a function of time in culture, clustering or repulsion of spines in space, and how best to model spine type distributions. A model that fits the spatial distribution of spine types in healthy cultured neurons would be useful to assess neuronal cultures following gene manipulation such as RNAi and to study features of induced pluripotent stem cells differentiated to neurons.
LogLinear Models (LLM) and Multinomial Logistic Regressions (MLR) are two basic and essential statistical methods, and have an extensive history of being used in biological studies. However, these tools have not been used thus far in the analysis of spine distributions. We fit a loglinear model (LLM) to the contingency table of spine features to determine the dependence between spine types (mushroom, thin, and stubby), distance from the cell body along the dendrite (in micrometers), the branch order of the dendritic segment on which it lies (primary, secondary, tertiary, etc.), and the day in vitro (DIV) on which it was imaged. Once we determined which of these attributes contributed to the overall dendritic spine model, we then asked whether these attributes can predict the occurrence of spines and of spine types. To answer this question we used a Multinomial Logistic Regression (MLR) model, which predicted the spine type, using the attributes that were found to be important through the LLM and associated contingency tables.
Finally, to understand how the dendritic spine density varied over the length of the neuron or whether the appearance of spines was completely spatially random i.e uniformly distributed over the neurites, we made use of spatial point processes. Spatial point processes have been used before in biological studies to model the locations of entire neurons [911], locations of ants nests [12] or xylem conduits [13]. There have also been other more adhoc methods created to study the number of "clustered spines" on each dendritic segment in monkey brains, where a cluster is defined as a group of 3 or more spines [14] however we believe our use of the linear network Kfunction [15] is the first work to analyze the locations of dendritic spines and their clustering properties in such a principled manner. Our analysis indicated that the density of spines is generally completely spatially random (CSR) over the dendritic length probably due to the absence of instructive directional signals found an in vivo setting in which spine distributions are unlikely to be CSR.
Methods
Cell imaging
Dissociated hippocampal neurons from embryonic rat brains (E18) were plated onto polyllysine coated coverslips. Once neurons adhered to the coverslip, they were placed facedown on glial cells grown in vitro for 15 days. These neurons were a primary neuronal culture system, and no cell line was used. Neurons were grown for specific time periods up to 21 days in a neuronal medium containing B27. This coculture of neurons and glia mimic the physiological conditions of neuronal growth and development in mammalian brain [5]. Work with the neuronal cultures was approved by the UCSB animal care committee.
To fill the neuronal processes including dendritic spines Green Fluorescent Protein (GFP) was expressed from a plasmid containing the betaactin promoter (CAGGFP) [16]. Of this plasmid, 2 μ g was transfected into each coverslip containing about 50,000 neurons (including about 20% glial cells). Transfection was performed as described in the manufacturer’s protocol (Lipofectamine 2000 from Invitrogen) with minor changes. The transfection mix and neurons were incubated for two hours to avoid toxicity caused by lipo2000. Following transfection, coverslips were flipped back onto the glial dish, where they were originally cultured. GFPactin transfected into the neurons at DIV4 (Day In Vitro) and neurons were studied at three time points DIV7, 14 and 21. These time points survey the maturation period over which synapses and spines emerge [17]. Note that these were not the same neurons studied over time, but each time point represents a different population of neurons which were grown in culture up until the point of imaging. In this way our analysis represents a study at the population level. At each time point the number of images taken per plate depended on the transfection efficiency of that plate. On average approximately 1% of cells were transfected. The plating density was set so that neurons were relatively isolated in order to capture one neuron per image. An Olympus FluoView laser scanning confocal microscope was used. Image slices were 2048 by 2048 pixels at 154n m per pixel resolution. There were 733 zslices per stack depending on the depth of the neuron, taken at 200n m steps. This means that the stacks were 315.39 μm × 315.39 μm × 1.46.6 μm. The z dimension slices were used to capture each depth level at the optimal focus, however we cannot claim to have accurate volumetric information at this resolution. A 40X oil objective lens with no optical zoom was used. Numerical Aperture (NA) was 1.3, and illumination conditions were kept constant. Deconvolution of the raw data before processing was not necessary because the images were clear enough to manually annotate the neuron traces and manually edit all the spine detections and types as described in the following section. We performed three biological replicates, the results of which are detailed below.
Although there are other higher resolution, full volume methods, the analysis of this data is broadly applicable to imaged neurons in other systems [5]. We attempted to capture the entire neuron in each image, however because of limits in available imaging techniques we found that this does not always happen. In the cases where dendrites were truncated at the end of the image plane we assumed that the proportion of spines in the missing data was similar to what had already been observed, and therefore the resulting distributions did not change. We verified this assumption visually by taking tiled mosaics of a few neurons imaged in their entirety from each DIV and checking that the branch orders, distances to soma and spine type counts were unchanged as compared to those of the same DIV. There was an observed increase in the dendritic length truncated by the image plane as the DIV increased. However in our particular analyses the methods used, such as the LogLinear Model and Multinomial Logistic Regression, were focused on trends between spine characteristics such as distance to soma and type and these trends are innately unaffected by the truncation of dendrites given the above assumption. In addition, spatial point process analyses such as the linear network Kfunction always include the specification of an observation window [18], which in our case was the image plane. We verified (see Results and discussion section) that the overall spine density and the density of each spine type did not vary with distance from the soma so that we could assume spine density at the ends of the dendrites which were truncated was similar to the dendritic length which was observed. We recognize that we cannot see the proximity of labeled cells to other neurons which haven’t taken up the GFP labeling. These unlabeled neighboring neurons may cause some difference in spine distributions which we cannot quantify. For this reason we have attempted to quantify our biological findings statistically over entire experiments and DIV time points instead of by individual neurons, although in certain cases showing results from individual randomly sampled neurons was necessary.
Neuronal reconstruction
There exist many automated methods for studying neuronal growth and morphometry and therefore we present a brief review of available software for tracing dendrites and detecting and classifying spines. In particular, NeuronJ [19] is a widely used software; however it is only semi automatic and one must click several points to trace each neurite. The labeling is done manually and the statistics output only include lengths of neurites and not spine data. HCAVision [20] is a costly software with similar goals, however the parameters of the neurite tracing are set manually with a sliding bar and thus results require much handtuning. In addition, it is also focused on tracing neurites as opposed to spine analysis. For a full review of existing methods and softwares for neuron tracing and spine detection see [21]. We found NeuronStudio [2224] to be the most userfriendly, and for this reason we used it to annotate dendrites and spines for this analysis.
Despite the abundance of automated softwares, neuronal reconstructions are still largely performed by hand [25] and this is is especially essential for a study like this one, where the traversed distance of the dendrites and number of spines and their shapes were analyzed in such detail. Using automated reconstruction algorithms on raw data is prone to both false positive and false negative detections of spines, as well as misleading spine shape measurements. In cases where neurites from neighboring neurons enter into an image (e.g. Figure 1 panes B and C), NeuronStudio often incorrectly traces these neurites as belonging to the neuron of interest. For this reason we manually traced each dendritic branch and soma of each neuron, ran NeuronStudio’s automated spine detection/classification algorithm and then manually inspected and verified each spine’s location and type. The verification and tracing were done by the primary author and an undergraduate biology student working in the Kosik Lab (see Acknowledgements). They were both familiar with dendrite and spine morphology and the resulting annotations from each were crosschecked by the other.
Relevant spine attributes output from the NeuronStudio software include branch order (BO), type (stubby, mushroom or thin), distance to soma along dendrite (SD), length (tip of spine to dendrite) and width at widest point (head diameter or HD). However since NeuronStudio uses the length and width of the spines to determine the spine type, we chose to make use of spine type and discard the other 2 measurements. NeuronStudio uses centrifugal labeling for branch orders, meaning it starts at 1 at the cell body and moves outwards, incrementing at every yshaped bifurcation regardless of the diameter of the daughter branches. Note that the entire image stack with zdimension information was loaded into NeuronStudio for the spine classification, and that the software has interpolation algorithms to estimate the spine type in 3D. For spine detection the default cutoffs were used, i.e. a required spine height between 0.23 μm, a maximum spine width of 3 μm, a minimum stubby size of 10 voxels (at the imaging resolution given above), a minimum nonstubby size of 5 voxels, and automatic zsmear compensation. For spine classification, the default settings were also used, i.e. a headtoneck ratio threshold of 1.1 μm, an aspect ratio (spine heighttowidth) threshold of 2.5 μm and a minimum mushroom head size of 0.35 μm. NeuronStudio delineates spine types by these 3 thresholds. It is generally known that mushroom spines have a large head and a narrow neck, thin spines have a small head and a narrow neck, and stubby spines display no obvious subdivision in head and neck. If the headtoneck ratio is above the threshold and the minimum mushroom head size is met, the spine is considered mushroom. If both the headtoneck and aspect ratios are lower than the respective thresholds then the spine is considered stubby. The remaining cases result in thin spines. For further information on NeuronStudio reconstruction, detection, and spine classification algorithms please refer to [22, 23]. In addition to the spine information, a trace file is output which labels the cell body, branch points and end points of the dendrites. The trace provides a skeletonization, or centerline, of the dendrite which we used to compute the linear network distances in the following analyses.
Loglinear model as a tool for exploring important features and their dependencies
To find the most influential attributes with regard to prediction and spatiotemporal modeling of spines we fit a loglinear model to the feature data, which is a type of generalized linear model [26]. The cooccurrence frequencies of the features in question are essentially a large multidimensional contingency table of counts. The standard linear models assume that data is normally distributed around a certain mean, which means that the observations can take any real value, positive, negative, integer or fractional. Loglinear models, on the other hand, assume that data is intrinsically nonnegative, typically counts that could be Poisson distributed, and allow us to model the association and interaction patterns among categorical variables. The attributes under consideration are BO, Type, SD and DIV. Again, since the type of spine was quite directly dependent on the length and the head diameter of the spine, we left these latter variables out of the modeling.
In order to analyze the data using a loglinear model, the various features must be in a categorical form or discretized. In an exploratory analysis such as this, one does not know what dependencies among features to expect; however we would like to note that these dependencies were not lost in the discretization process since trends in increasing and decreasing feature values would be preserved. To ensure that there were a reasonable number of observations at the higher branch orders, we pooled BO values of 5 or higher into a single category called "higherorder branches". We created a categorical variable to represent the continuous variable soma distance (SD) where categories were determined using the 4 quartiles of the SD spine data pooled over all 3 experiments. Specifically, SD values of less than 65.65 μm were classified into the first group, from this value to less than 108.99 μm the second, from this to less than 157.04 μm the third, and the rest (less than the most distal spine which lay at 413.25 μm from the cell body) fell in the fourth group. Binning the observed data for the continuous variables is the best way to get a general feel for how these quantities relate to each other. After this postprocessing of the data we arrived at 5 categories of branch order, 4 categories of soma distance, 3 spine types (mushroom, stubby, and thin), and 3 DIVs (7, 14, and 21 days).
Using the observed frequencies for the aforementioned attributes, we created a fourway contingency table and fit the model using the 'glm’ function in the R package 'stats’. The table of the frequency of occurrences of the four attributes was modeled as Poisson with each entry being a simple count of the cooccurrences of that bin. We called this count f_{ ijkl } with each of the subscripts i,j,k,l corresponding to a different attribute. The method uses the link function y_{ ijkl } = l o g(f_{ ijkl }), and treats the model as a regular linear model. Each entry y_{ ijkl } is modeled by a combination of coefficients: the intercept, plus main effects, plus every combination of interactions between these four attributes, as shown below.
We estimated this full interaction model using the leastsquares maximumlikelihood approach. We also used a stepwise fit algorithm, which begins with a model that includes only the constant term, and at each step chooses whether or not to add one additional term. The algorithm begins with the main effects then tries each possible 2way interaction, aiming to minimize the Akaike Information Criterion (AIC). The AIC is defined as
where k is the number of parameters i.e the total number of coefficients being estimated, and
is the maximized value of the likelihood function for the estimated Poisson model. In the above equations $\mathbf{x}={x}_{1},\dots ,{x}_{N}\in {\mathbb{R}}^{4}$ are the input vectors, θ = θ_{1}…θ_{ k } are the parameter values (one per term in eqn. 1), and $\mathbf{y}={y}_{1},\dots ,{y}_{N}\in \mathbb{R}$ is the output. The AIC is a commonly used goodnessoffit measure for a model given the observed data. Adding or subtracting terms, whether they be main effects, pairwise interactions, or up to 3way interactions between attributes, will change the AIC value for the model. A lower AIC criterion indicates a better fit to the data and therefore a better model. To compute the stepwise fit we used the R function 'step’. For more information on the stepwise fit algorithm as well as the AIC criterion we ask that the readers refer to the 'step’ function reference ([27], Chapter 6). We ran both of these LLM fitting procedures for all 3 experiments separately expecting to find general agreement between coefficients of the corresponding models created.
Multinomial logistic regression to predict spine type from neighbor types
In order to predict spine type we first determined which attributes contributed most to spine type prediction. Given the complexity of the multidimensional LLM and the various interactions and conditional frequencies that would impinge on this issue, we decided to determine these attributes by analyzing 2way contingency tables for spine type vs. SD, BO, DIV, as well as the spine types of the 3 nearest neighbors. This analysis helped us pick attributes that would be useful as the predictors in the multinomial logistic regression (MLR) [28] explained below.
When the response variable of a regression takes binary values "Logistic Regression" is used. This is an approach which uses a linear combination of the predictor variables to predict the logodds of a success (the "logit" of the probability). Since our response variable was spine type and it can take 3 values (mushroom, stubby or thin), we needed to use a "Multinomial Logistic Regression" (MLR) which attempts to model the probability of any of multiple possible outcomes. We did not use the attributes SD or BO as predictors variables since the results of both the LLM analysis and 2way contingency tables mentioned above told us that these quantities were not as relevant for spine type prediction. Therefore our model consisted of spine type as the output variable and the DIV, 1st, 2nd and 3rd nearest neighbor type along the dendrite as the predictor variables. We tried using only 1 or 2 nearest neighbors, however the results proved inconclusive because the prediction probabilities for each of the 3 types were predominantly close to 1/3. If we used more than the 3 nearest neighbors we sometimes ended up spanning a segment of dendrite which we did not consider to be "local", so we decided that 3 nearest neighbors provide the most useful information in the case of this study.
The MLR analysis we performed in this paper does disregard the actual interspine distances, meaning that if the 3 nearest neighbors are very close or very far apart we still treat them the same. We did this partially because adding the distance variables would complicate the model significantly, but also because we believe that over a large population of spines such as the one we have, these differences in distance will average out and we will still get a general picture of the trends between neighboring spine types. To verify that this was true we computed a histogram showing the distribution of 3rd nearest neighbor distances for each spine, shown in Figure 2. Although the maximum distance to any 3rd nearest neighbor is extremely high (248.31 μm) we can see from the histogram as well as the fact that the median 3rd nearest neighbor distance was 5.34 μm that this distance is clearly an outlier case and that the majority of 3rd nearest neighbor distances lie below 25 μm.
Suppose the output variable categories are denoted by 0,1,2 corresponding to mushroom, stubby or thin spines, with 0 being the reference category. If y_{ i } denotes the observed outcome of the output variable (spine type), and X_{ i } is the corresponding vector of the 3 neighbor types and DIV for the ith observation, one regression is run for the logit probability of each category k, with β_{ k } representing the vector of regression coefficients in the kth regression (eqns. 4,5). This is done for all but the reference category, whose probability is then obtained by subtracting all other probabilities from one (eqn. 6). Note that because the predictor variables were spine types, which were nominal as opposed to ordinal variables, the predictor variables X_{ i } must be represented with a "dummy coding". This means each neighbor type was represented by 2 predictor variables, where (1,0) corresponded to mushroom type, (0,1) corresponded to stubby type and (0,0) corresponded to thin type. This does not need to be done for the output variable y. With the addition of the DIV, which does not have to be dummy coded since it is an ordinal variable, this made each X_{ i } vector of length 7.
The regressions are then written as:
and
The parameters are estimated typically by using an iterative procedure such as "iteratively reweighted least squares" (IRLS) or, more commonly by a numerical approach (a quasiNewton method) such as the BroydenFletcherGoldfarbShanno (BFGS) method. In our case we create an MLR using the command multinom in the R package nnet [29] which uses BFGS by calling the R function optim. It can be seen that
so that the beta coefficients represent the change in the log odds of the dependent variable being in a particular category with respect to the reference category i.e. the thin type, for a unit change of the corresponding independent variable. To check if the models created from all three experiments were in agreement, we ran the MLR separately for each experiment.
To satisfy one of the major assumptions of this analysis, namely that the data must be a set of independent observations, we took 200 randomly sampled spines of each type from each experiment (600 spines per experiment total) to use for the parameter estimation. We chose to select equal proportions of each spine type in order to remove any bias in the model towards the less frequent thin spines, and 200 was the largest number we could justify using since there were only 649 thin spines in experiment 3. We verified that these randomly sampled spines did not lie within 10 μm of the image border so that we were fairly certain their nearest neighbors did not fall outside of the image plane. Note that due to the tortuosity of the dendritic structure this did not mean that our sample was necessarily biased towards spines which were proximal to the soma. We did not verify explicitly that the sampled spines were not neighbors of each other, since we assumed that the variation captured by the random sampling was enough to ensure some level of independence. The idea was to aim for an independent set of observations which represented the entire "population" of spines in that experiment. To be clear we used all 30,285 spines for the LLM model and Kfunction analysis, only the MLR model required random sampling since we were using neighbor information which would have been redundant if we considered every spine.
To verify that the prediction of spine type provided by the MLR was better than what we would get purely by their relative abundance i.e. without neighboring spine type information, we computed something similar to a "Bayes Factor" [30]. Bayes factor is a method of choosing between two models on the basis of the observed data. In our case, the first prediction model was simply the prior global probability of finding a given spine type based on its frequency in the particular experiment under consideration. The second model was the MLR prediction model using the neighbor type information. We computed P(Y = iX)/P(Y = i) and reasoned that values considerably larger than one indicated the neighboring spine type information was helpful in the prediction of the central spine type.
Linear network Kfunction as a tool for testing spatial point patterns
Originally proposed by Ripley in 1981 [31], the purpose of the Kfunction is to estimate whether or not there is clustering or repulsion present in a given spatial point process. The common null hypothesis is that the points within the observation window are distributed as a homogeneous Poisson process, which is also termed "completely spatially random" or CSR. This means that the density of points does not vary depending on the spatial parameters i.e. x and y in the 2D Euclidean case, or the location along the dendritic network in our case. In order to determine if this is a valid null hypothesis for our data, we created QQ plots [32] for individual dendrites which compared the quantiles of the SD values of observed spines to the theoretical quantiles for the CSR case. If the two distributions (observed and CSR) being compared were similar, the points in the QQ plot would approximately lie on the line y = x. In order to create the theoretical quantiles it is necessary to know the values of SD at any location on the given network, not just at the spine locations. Once we have this we can partition the network into epsilon small segments and assign each segment a value 1 if it contains a spine and 0 otherwise based on the CSR assumptions. We did this using code provided to us by Adrian Baddeley and Gopal Nair at the Commonwealth Scientific and Industrial Research Organization (CSIRO), Australia.
The Kfunction computes the expected number of points within a distance t of an arbitrary point p, therefore the empirical value in 2D Euclidean space for the CSR case will be proportional to the circular area, λ π t^{2}. The proportionality constant λ represents the density of points in the homogeneous Poisson case, and can be estimated by finding the total number of points N divided by the total area of the observation window A. Ripley’s Kfunction, which is a function of t, is a very useful tool because it describes the 2^{nd} order characteristics of the point process at several scales t. If we ignore the edge effects due to the observation window, the observed $\hat{K}(t)$ can be written as:
where I stands for the indicator function, and d_{ ij } stands for the Euclidean distance between two points p_{ i } and p_{ j }. In the above equation, we see that the expectation is normalized by 1/λ since $\lambda =\frac{N}{A}$, so we infer that theoretically K(t) = π t^{2} implies spatial independence of points, or a CSR point process. Therefore, if K(t) is the theoretical CSR value of the function and $\hat{K}(t)$ is the observed function, then $\hat{K}(t)>K(t)$ implies clustering between points and $\hat{K}(t)<K(t)$ implies repulsion. It is possible to extend this function to multitype point patterns (i.e. to find clustering or repulsion between specific spine types) or to higher dimensional data (i.e. spacetime, or 3D Euclidean space).
Since our particular point process consists of spines which lie along the "linear network" of the dendritic tree we were primarily concerned with interspine distances along the dendrite as opposed to in Euclidean space. Therefore we used a version of the Kfunction developed recently for linear networks by Okabe and Yamada [15]. This modified version of the Kfunction takes into account the structure of the linear network on which the point process resides and imitates the Euclidean space Kfunction described above. The linear network Kfunction is calculated as follows:
where ℓ_{ T } is the length of the total network L_{ T }. The theoretical CSR for this case is described as follows:
where p is a point belonging to the set of all points P = {p_{1},…,p_{ N }}, and ℓ_{ p }(t) is the length of the subset of the network L_{ p }(t) where the distance between p and any other point is ≤ t. Note that here the distance d_{ ij } stands for the linear network distance along the dendrite. Accounting for variability in the length ℓ_{ p }(t) means the formula takes into account the edge effects due to the observation window (in our case the image plane) inherently, but at the cost of added complexity. The computation of the theoretical linear network Kfunction requires us to find ${L}_{{p}_{i}}(t)$, the subset of L_{ T } where the network distance between a specific point p_{ i } and any other point is ≤ t, and ${\ell}_{{p}_{i}}(t)$, the length of that subset, for every point p_{ i }. A visualization of the quantities d_{ ij }, L_{ T }, ℓ_{ T }, ${L}_{{p}_{i}}(t)$, and ${\ell}_{{p}_{i}}(t)$ is shown in Figure 3.
Note that although many biological applications of point processes treat individual observations as replicate patterns coming from the same underlying distribution, we cannot do that using the above definition of the network linear Kfunction due to the change in linear network structure from dendrite to dendrite. The term "dendrite" here refers to the entire dendritic tree resulting from a single root branch of a neuron. Other invivo studies [33, 34] focus on clustering of spines which lie on the same unbranched section of the dendrite, however we focus on the entire dendritic tree under the hypothesis that it follows rulebased distributions of spines due to anatomical constraints and integration of the a signal over the entire dendrite. One can infer from Figure 3 that since the geometry of the linear network changes from dendrite to dendrite, so do the total lengths of the networks ℓ_{ T }, the ranges of possible tvalues and the amount of dendritic length that is present within a given distance of any point. We did not simply normalize the lengths of the networks to a [ 0,1] scale because it is desirable for the taxis to retain its real physical values in order to make conclusions about the scale (in μm) of clustering or repulsion among spines. However, we did desire to compare the linear network Kfunctions of various dendrites in a meaningful way. For this reason we used a corrected version of the network Kfunction that intrinsically compensates for the geometry of the network called Ang’s correction [35]. The observed Kfunction then becomes:
where m(i,d_{ ij }) is the number of points of L lying at the exact distance t away from the point i measured by the shortest path. That is, the contribution to the function from each pair of points (i,j) is weighted by the reciprocal of the number of points that are situated at the same distance from i as j is. As a result, the theoretical CSR case is simply K(t) = t for all 0 ≤ t < T. This enables direct comparison of tvalues across dendrites, as we will see in the results section.
Simulations and qvalues
To test the null hypothesis that the locations of spines on the dendrites were indeed CSR, we created a summary statistic which encompasses the difference between the empirical $\hat{K}(t)$ and the theoretical K(t) under CSR. The summary or "test statistic" we used, is the max absolute difference (MAD) over t, viz.
One method for obtaining a distribution of d proposed by Diggle [36] is to bootstrap the residuals, or differences between the observed and theoretical values. However a more heuristic and intuitive way is to simulate the CSR case for each dendrite, compute the Kfunction for each of these simulations, and find the simulated distribution of our test statistic. We then found the pvalue of the observed difference d from this simulated distribution.
Specifically, we carried out 1000 CSR simulations for each dendrite by placing uniform points on a line [0,ℓ_{ T }], and mapping them to that specific dendrite’s linear network structure. The number of points simulated per dendrite equaled the number of observed spines for that dendrite, thus preserving the overall density λ. This means the same number of spines that existed on each dendrite were randomly placed along the linear network specific to that dendrite. We used these simulations to obtain 1000 values of the summary statistic, say d[i]. Then the pvalue for each dendrite was simply the proportion of simulated values that fell above the observed or experimental value of d, i.e. the rank of this d within the 1000 values of d[i], or nrank/(nsim + 1).
This pvalue approach is similar to the test which rejects the null hypothesis if the graph of the observed Kfunction lies outside the "pointwise simulation envelope" at any value of t. A simulation envelope is essentially a graphical measure of how far a function can deviate from the theoretical value without being considered significant at a given level. As mentioned above in our case the envelope is calculated by first creating the 1000 CSR simulations of a point pattern on a given dendritic network with the same observed network intensity, then calculating the linear Kfunction for each of these 1000 simulations. To perform a twosided significance test at the 10% level, the 5% and 95% percentiles are then calculated based off the 50 lowest and 50 highest linear Kfunction values per tvalue, hence the term "pointwise". Plotting these values as a function of t gives one a visual idea of the spread that is produced by chance mechanisms alone. If the observed Kfunction for a given tvalue does not fall outside these percentiles, it is considered insignificant for that tvalue at the 10% significance level. We make use of the R package 'spatstat’ [18] for obtaining the pointwise simulation envelope.
Because we have a multitude of hypothesis tests and pvalues (one for each dendrite), to reach a conclusion about the general trend for each DIV and experiment, we used the concept of False Discovery Rate (FDR) [37]. The FDR is defined as
Controlling the overall FDR, or expected proportion of incorrectly rejected null hypotheses termed "false discoveries", is a statistical method commonly used in multiple hypothesis testing which increases the statistical power of each test. What is more general and useful however, is a testspecific FDR measure. This essentially allows us to look at all possible significance thresholds at once, as well as provide each test with a measure of significance that can be easily interpreted. This is accomplished by calculating an analogue of the pvalue for each test called a "qvalue" [38]. A pvalue of 0.05 implies that 5% of all tests will result in false positives, whereas a qvalue of 0.05 implies that 5% of significant tests will result in false positives. Since the latter is clearly a far smaller quantity, qvalues generally indicate fewer significant tests than pvalues for a given significance threshold and provide a far more accurate indication of the level of false positives in the case of multiple hypothesis testing. For qvalue estimation we used the qvalue package available from [39].
Results and discussion
Data analyzed
We performed three biological replicate experiments resulting in a total of 75 neurons from the following time points: DIV 7, DIV 14, and DIV 21 (Table 1). This provided a rich and complete data set resulting in 485 dendritic branches and 30,285 spines. Example images from each DIV along with zoomed in dendritic segments where spines and annotations are visible are shown in Figure 1. Scale bars are shown in red in panels AC and the yellow rectangular boxes in panels AC show the region of interest which has been zoomed in on in panels DF respectively. Panels DF are all at the same resolution.
The number of spines per μ m, or λ, for each dendrite in different experiments and time points is shown in Figure 4. We chose to include this in order to help the reader compare these neuronal culture results with other experimental paradigms with which they may be more familiar. It is clear from the histograms that the distribution of spine density for DIV7 is skewed toward lower values as compared to the density for DIV21, as expected. The image data as well as spine and trace annotations are made publicly available through the BISQUE system [40] and the URL is given in the section titled "Availability of supporting data". We chose BISQUE over other databases like NeuroMorpho.Org [41] because it allows us to upload multiple layers of annotations as opposed to only the digital reconstruction files.
We calculated a 2way contingency table over all experiments and spine types and obtained Table 2. From this table we note the high frequency of mushroom and stubby spines as compared to thin spines, and also the fact that the ratio of types does not remain the same per experiment even though they were indeed biological replicates. In fact, a Pearson’s ChiSquared test on Table 2 shows dependence between the spine type counts and experiment number, χ^{2}(d f = 4,N = 30285) = 659.87, p < 0.0001.
We believe that the large experimental variation between spine type proportions and counts in each experiment was a positive result because this meant that statistical agreement across all 3 experiments relating to spine type clustering and density estimation carries heavier weight than if the 3 experiments were more uniform in these quantities, or if we had pooled data from all 3 experiments together. Also, if all 3 experiments were unusually homogeneous there could be a possibility that it is a result of our specific culturing, imaging or spine extraction methods rather than a true representation of the underlying biological process. The various biological systems to which these techniques will be applied will certainly have this type of variability.
Spine type is independent of distance from soma
As described in the Methods section, we calculated a stepwisefit of the loglinear model starting with just a constant term, and at each step choosing to add the main effects (div, type, bo and sd) and possible 2way interactions between main effects onebyone if they decreased the corresponding AIC value. The captions above Tables 3, 4 and 5 show the final models arrived at for each of the 3 experiments as well as their corresponding AIC values. The tables indicate the change in the AIC value that would occur from adding or omitting each of the terms in the first column. This gives us an idea of how important that term was to the model. The rows of the table are ordered by their overall contribution to the model, i.e. the term in the first column of the first row of each table had the lowest AIC value and was therefore the most important to the overall model. If the reader requires further information on the AIC criterion or how to interpret this table we ask them to refer to Chapter 6 of [27].
Despite the fact that they were included in the final stepwise fit model for experiments 1 and 3, the AIC values in Tables 3, 4 and 5 show that in all 3 experiments the interaction between spine type and soma distance ("type ·sd") as well as spine type and branch order ("type ·bo") were the least important in modeling the overall frequency table of occurrences. This implies that the correlation between these quantities was not very high, therefore we reason that it was not necessary to use either SD or BO to predict the spine type in the MLR created in the following section. We also noticed that the term marking the interaction between BO and SD was the most important pairwise term in all stepwise fit models. It is expected that BO and SD are correlated because both necessarily increase as we move away from the cell body. Indeed, running a 2way Chisquare test on the contingency table of the discretized versions of these variables showed us high dependence, χ^{2}(d f = 12, N = 30285) = 11635.19,p<0.0001. We also saw a high level of dependence between DIV and SD (χ^{2}(d f = 6, N = 30285) = 681.76, p < 0.0001) and between DIV and BO (χ^{2}(d f = 8, N = 30285) = 1604.75, p < 0.0001). This was intuitive as well since we expect both BO and SD to generally increase with DIV.
It is possible that the Type vs. SD relationship could have also been estimated using a Sholltype analysis ([42]) where we count the number of each type occurring within concentric circles from the soma and verify that it is constant, however this would not necessarily produce the same results. The crucial difference between our approach and the Sholl approach is that in our approach the "distance from soma measures" the actual distance along the centerline of the dendrite instead of the radial distance from the cell center. This is especially important for dendrites with high tortuosity (which we find prevalent in our data), since the radial distance in those cases will not correspond to the dendritic distance from the cell body. Many studies of cultured neurons use Sholl analysis, however they use it in its original form for counting dendritic intersections and do not comment on the relation to spine density or type. To our knowledge this is the first study to quantify the spine density vs. distance to the soma in dissociated neuronal cultures.
Threeway and 4way interactions are generally known to be weak (not as explanatory as the main effects and 2nd order interactions) and difficult to interpret, however in the interest of exploring all possibilities we computed the maximum likelihood fit using all 4 attributes as well as a stepwise fit model which allows for 3way interactions between attributes. The table presented in Additional file 1 results from the LLM which models all possible interactions of all 4 attributes, i.e. up to the fourth order. The coefficients presented in the table are those which were significant at the 0.1% level, and the corresponding pvalues are shown in the last column. The table contains the interactions which were more important to the model, and shows that of these interactions only one (highlighted in green) between type and either BO or SD, was shown as being significant over all experiments. This verifies once again that neither BO nor SD were highly correlated with the spine type. In addition to this, the stepwise fit models in Additional file 2 show that if we did allow 3rd order interactions, the strongest 3rd order correlation over all experiments was that of DIV, SD and BO, again affirming that all 3 of these quantities should intuitively increase together.
Spines tend to cluster with other spines of the same type
In creating a regression model, we first ascertain that the predictor variables used are not only useful in predicting the output variable, but also that they do not provide redundant information as this can throw off the model fitting process. Using all spines in the dataset, we performed a Chisquare test on the 2way contingency tables of spine type versus binned SD and BO, DIV, and the types of the 3 nearest neighbors (N1, N2, N3) as described in the LogLinear Model section above. Due to the aforementioned dependence between the type and experiment number we performed the test separately for each experiment and the results are shown in Table 6. From the table we can see that the DIV and the 3 nearest neighbors showed clear dependency with spine type in all experiments, whereas SD and BO showed independence at the 5% significance level in experiments 1 and 2 respectively. Since we expected SD and BO to have a similar relationship with type due to the high correlation mentioned above, and we had found this was not a very strong relationship, we chose to use only DIV, N1, N2 and N3 as predictors for spine type in the MLR model.
The resulting beta coefficients for each of the predictor variables are shown in Table 7. Here "N1Var1" refers to the beta coefficent of the first dummy variable for the type of the first nearest neighbor; "N1Var2" refers to the second dummy variable, and so on. The "mushroom" row is omitted because it is the reference category and its probability is obtained as shown in eqn. 6. We computed the prediction probabilities for each spine type given each combination of neighbor types for each experiment separately to determine the agreement between experiments. A selected set of results are shown below in Tables 8, 9 and 10. The highest probability for each row is marked by an asterisk. Note that in these tables all DIVs in all experiments predicted the spine type to be mushroom when its 3 nearest neighbors were mushroom type, and stubby when the 3 nearest neighbors were stubby type. Thin types were the most probable when the three nearest neighbors were thin type in all but experiment 2 DIV14 and DIV21. The probabilities for cases where all 3 of the nearest neighbors were not of the same type have been omitted for brevity and because they did not show any clear trends.
The Bayes factor results in Table 11 show that the proportional gain in information for the spine type in question was always greater than one for the prediction of a particular type when the neighborhood types were all of that same type. Due to the low frequency of thin spines, their corresponding Bayes factors were higher than that of other types, meaning that their prediction probabilities benefit more than other types from neighborhood type information.
Dendritic spine densities are completely spatially random
We created QQ plots as described above based on the quantiles of spine counts vs. distance from the soma and found that upon visual inspection almost all dendrites follow the theoretical uniform distribution closely enough to assume that the density of the spines was homogeneous and therefore the CSR case was a viable null hypothesis. We selected 9 (out of 485) example dendrites and their QQ plots are shown in Figure 5. We randomly selected 1 dendrite from each DIV and each biological replicate (experiment) to ensure the diversity of the set. The y = x line is marked in red, and the observed QQ values are marked as black circles. Note that because this is a graphical method for comparing two probability distributions there was no pvalue or significance level associated.
Of all the 485 dendrites analyzed, only three of them (Exp. 1 DIV 21, Exp. 2 DIV 14, and Exp. 2 DIV 21) were considered nonCSR at the 5% significance level. Figure 6 shows histograms of the pvalues of all 485 dendrites separated into each DIV and experiment number. The 5% significance level is shown by the red vertical line in each case. We then computed the qvalues for each dendrite and found that they are all equal to 1. This is not surprising according to the explanation of the qvalue above. Recall that qvalues equal to 1 imply that 100% of the significant tests resulted in false positives, i.e. there were no significant tests. We therefore conclude that regardless of the maturity of the neuron, or the variation over biological replicate experiments, the locations of spines along all of the dendrites we analyzed were completely spatially random.
As mentioned above, the Kfunction is a function of the interpoint distance, t, that we consider around each observed point. The range of tvalues is determined by the total length of the network ℓ_{ T }, therefore because each dendrite has a different network length it also has a different range of tvalues. Our chosen summary statistic throws away this information by computing the maximum absolute deviation (MAD) over all t in order to determine whether that value deviates significantly from the spatially random case. However it may be of interest to determine whether clustering or repulsion between spines occured at specific interpoint distances t. Ang’s correction normalizes the Kfunction such that the theoretical K(t) = t for all t, so we can easily use this as a reference point. Figure 7 shows the Kfunction for the same 9 example dendrites used for the QQ plots of Figure 5. Each graph shows the observed $\hat{K}(t)$ function (black), the theoretical K(t) function (red) as well as the twosided 5% and 95% pointwise simulation envelopes as a function of the radius t. Following the description of the pointwise simulation envelope above we calculated these lower and upper envelopes at the 5% and 95% percentiles per tvalue in the interest of checking if any tvalue fell outside of this range. Since the black curves do not leave the gray shaded area for any value of t, the deviation from spatially random was insignificant at the 10% level for every tvalue and is in agreement with our previous conclusion using the MAD statistic. This observation holds for almost all of the 485 dendrites we inspected visually, with no specific tvalue evidencing either repulsion or clustering.
Conclusions
The models used in this work allow spatial prediction of spine types, which has not previously been studied. The conclusions presented here relate to qualities of neurons in dissociated culture. We acknowledge that some of these results will most likely not hold for in vivo settings due to neuronal interactions not modeled here, but maintain that the statistical methods used here will be useful and easily applicable. Specifically, we found here that spine type and density are not dependent on the distance from the cell body, and these observations are likely to change for in vitro slices or microinjection of fixed brain tissue.
We also note that we chose not to deconvolve our data because of its high contrast. We acknowledge that this choice may have precluded the image analysis software from detecting some stubby spines among the halo of the bright dendrites, but we do not feel this significantly impacted our results. As a partial compensation for this effect we used NeuronStudio’s inbuilt automatic zsmear compensation, and for more details on this we refer the reader to [22, 23].
Although in this study the spine distributions seemed to be completely spatially random it is possible that we will find studies using different neuronal types and treatments where this is not true. In these cases, where spine density may vary with distance from the cell body, it would be interesting to test for inhomogeneous patterns of points such as the hard core Strauss Process used in [43]. We could also place an exponentially decaying function to model the interaction between spine types within a certain radius or experiment with other pairwise interaction functions such as those used by Diggle, Gates and Stibbard [44] or Diggle and Gratton in [45].
We find it an interesting result that spines were not spatially clustered when type was disregarded, as shown by the linear network Kfunction analysis, however spine types do tend to group together as shown by the MLR analysis. We would like to note that these results are not contradictory because they are in fact measuring different quantities. The MLR results tells us that, regardless of their densities along the dendrite, if we have a spine which is of a given type, its 3 nearest neighbors are likely to be of the same type. The Kfunction, on the other hand, tells us that regardless of type the spines’ locations along the dendritic network are spatially random. These two results provide complementary information and together could aid us in future modeling tasks such as simulation of neuronal growth. For example, we could first place spines uniformly along the dendritic network, and then decide the types of those spines based on the type of information given by the MLR model. As future work we plan to analyze the network cross Kfunction [15] of the dendritic network, which models the spine distribution as a multitype point process and therefore provides information about repulsion and clustering of each spine type with each other spine type, modeling both density and type simultaneously.
Generally previous studies such as [4649] have relied on physiology or biochemical markers to validate their neuronal properties. The quantitative morphological features described here provide an additional phenotypic dimension for these analyses. Likewise these approaches can be applied to phenotypic analyses of neuronal cultures following overexpression or suppression of specific genes to capture their effect on a complex phenotype. As mentioned in the Introduction section, the only other study we are aware of which analyzes clustering of dendritic spines in monkey brains is [14]. The authors of this work study the number of "clustered spines" on each dendritic segment, where a cluster is defined as a group of 3 or more spines. The method used here defines clustering as a statistically significant positive deviation in the linear Kfunction from the theoretical value of the spatially random linear Kfunction. We believe our method to be more principled and our results easier to interpret than those of [14] due to the more formal statistical definition of clustering.
We chose to use dissociated hippocampal cultures because they are widely used and they allow us to perform an indepth and automated analysis with larger spine populations than most previous studies. These approaches will be important in assessing features of neurons derived from human induced pluripotent stem cells which have so far not been characterized by detailed morphological features. Our paper utilized a highly simplified neuronal culture system to develop the statistical and computational tools for more advanced in vivo studies needed to address the aforementioned bigger biological questions. Our overall hypothesis was that we can utilize imaging and statistical analyses to capture features of spine distributions that can be used for testing hypotheses in invivo settings. Indeed, we have been conservative about hypotheses and findings concerning spine type clustering because any conclusions we might reach on the specifics of spine distribution would be limited to the neuronal culture system we studied.
Availability of supporting data
All the image stacks and NeuronStudio annotation files supporting the results of this article are available in the BISQUE repository, http://bisque.ece.ucsb.edu/client_service/view?resource=http://bisque.ece.ucsb.edu/data_service/dataset/2653471.
Abbreviations
 DIV:

Day in vitro
 LLM:

Loglinear model
 SD:

Distance from the soma to the spine along the centerline of the dendrite
 BO:

Branch order of the dendritic segment on which a spine lies
 N1, N2, N3:

The spine types of the 1st, 2nd, and 3rd nearest neighbors along the dendrite, respectively
 MLR:

Multinomial logistic regression
 CSR:

Completely spatially random
 FDR:

False discovery rate
 MAD:

Maximum absolute deviation.
References
 1.
Irwin S, Patel B, Idupulapati M, Harris J, Crisostomo R, Larsen B, Kooy F, Willems P, Cras P, Kozlowski P, et al: Abnormal dendritic spine characteristics in the temporal and visual cortices of patients with fragileX syndrome: A quantitative examination. Am J Med Genet. 2001, 98 (2): 161167. 10.1002/10968628(20010115)98:2<161::AIDAJMG1025>3.0.CO;2B.
 2.
Yuste R: Dendritic spines and distributed circuits. Neuron. 2011, 71 (5): 772781. 10.1016/j.neuron.2011.07.024.
 3.
Harnett M, Makara J, Spruston N, Kath W, Magee J: Synaptic amplification by dendritic spines enhances input cooperativity. Nature. 2012, 491: 599602. 10.1038/nature11554.
 4.
Harris K, Kater S: Dendritic spines: cellular specializations imparting both stability and flexibility to synaptic function. Annu Rev Neurosci. 1994, 17: 341371. 10.1146/annurev.ne.17.030194.002013.
 5.
Kaech S, Banker G: Culturing hippocampal neurons. Nature Protoc. 2007, 1 (5): 24062415.
 6.
Mukai J, Dhilla A, Drew L, Stark K, Cao L, MacDermott A, Karayiorgou M, Gogos J: Palmitoylationdependent neurodevelopmental deficits in a mouse model of 22q11 microdeletion. Nat Neurosci. 2008, 11 (11): 13021310. 10.1038/nn.2204.
 7.
Cheetham C, Hammond M, McFarlane R, Finnerty G: Altered sensory experience induces targeted rewiring of local excitatory connections in mature neocortex. J Neurosci. 2008, 28 (37): 92499260. 10.1523/JNEUROSCI.297408.2008.
 8.
Schratt G, Tuebing F, Nigh E, Kane C, Sabatini M, Kiebler M, Greenberg M: A brainspecific microRNA regulates dendritic spine development. Nature. 2006, 439 (7074): 283289. 10.1038/nature04367.
 9.
Mamaghani MJ, Andersson M, Krieger P: Spatial point pattern analysis of neurons using Ripley’s Kfunction in 3D. Front Neuroinformatics. 2010, 4 (0): 110.
 10.
Bell M, Grunwald G: Mixed models for the analysis of replicated spatial point patterns. Biostatistics. 2004, 5 (4): 633648. 10.1093/biostatistics/kxh014.
 11.
Millet L, Collens M, Perry G, Bashir R: Pattern analysis and spatial distribution of neurons in culture. Integr Biol. 2011, 3 (12): 11671178. 10.1039/c1ib00054c.
 12.
Harkness R, Isham V: A bivariate spatial point pattern of ants’ nests. Appl Stat. 1983, 32 (3): 293303. 10.2307/2347952.
 13.
Mencuccini M, MartinezVilalta J, Piñol J, Loepfe L, Burnat M, Alvarez X, Camacho J, Gil D: A quantitative and statistically robust method for the determination of xylem conduit spatial distribution. Am J Bot. 2010, 97 (8): 12471259. 10.3732/ajb.0900289.
 14.
Yadav A, Gao Y, Rodriguez A, Dickstein D, Wearne S, Luebke J, Hof P, Weaver C: Morphologic evidence for spatially clustered spines in apical dendrites of monkey neocortical pyramidal cells. J Comp Neurol. 2012, 520: 28882902. 10.1002/cne.23070.
 15.
Okabe A, Yamada I: The Kfunction method on a network and its computational implementation. Geogr Anal. 2001, 33 (3): 271290.
 16.
Zhao C, Teng E, Summers R, Ming G, Gage F: Distinct morphological stages of dentate granule neuron maturation in the adult mouse hippocampus. J Neurosci. 2006, 26: 311. 10.1523/JNEUROSCI.364805.2006.
 17.
Banker G, Goslin K: Culturing Nerve Cells. 1998, Cambridge, MA USA: MIT press
 18.
Baddeley A, Turner R: Spatstat: an R package for analyzing spatial point patterns. J Stat Softw. 2005, 12 (6): 142. [http://www.jstatsoft.org, ISSN: 15487660],
 19.
Meijering E, Jacob M, Sarria JCF, Steiner P, Hirling H, Unser M: Design and validation of a tool for neurite tracing and analysis in fluorescence microscopy images. Cytom Part A. 2004, 58 (2): 167176.
 20.
Vallotton P, Lagerstrom R, Sun C, Buckley M, Wang D, Silva MD, Tan SS, Gunnersen J: Automated analysis of neurite branching in cultured cortical neurons using HCAvision. Cytom Part A. 2007, 71 (10): 889895.
 21.
Meijering E: Neuron tracing in perspective. Cytom Part A. 2010, 77 (7): 693704.
 22.
Rodriguez A, Ehlenberger D, Dickstein D, Hof P, Wearne S: Automated threedimensional detection and shape classification of dendritic spines from fluorescence microscopy images. PLoS ONE. 2008, 3 (4): e199710.1371/journal.pone.0001997. doi:10.1371/journal.pone.0001997.
 23.
Wearne S, Rodriguez A, Ehlenberger D, Rocher A, Hendersion S, Hof P: New Techniques for imaging, digitization and analysis of threedimensional neural morphology on multiple scales. Neuroscience. 2005, 136: 661680. 10.1016/j.neuroscience.2005.05.053.
 24.
Dumitriu D, Rodriguez A, Morrison J: Highthroughput, detailed, cellspecific neuroanatomy of dendritic spines using microinjection and confocal microscopy. Nat Protoc. 2011, 6 (9): 13911411. 10.1038/nprot.2011.389.
 25.
The DIADEM Scientific Challenge. http://diademchallenge.org/. [Accessed: 30/09/2012].
 26.
McCullagh P, Nelder J: Generalized Linear Models. 1989, London: Chapman & Hall/CRC
 27.
Chambers J, Hastie T, et al: Statistical Models in S. 1992, London: Chapman & Hall
 28.
Hilbe J: Logistic Regression Models. 2009, London: CRC Press
 29.
Venables WN, Ripley BD: Modern Applied Statistics With S, fourth edition. 2002, New York: Springer, http://www.stats.ox.ac.uk/pub/MASS4. [ISBN 0387954570].
 30.
Kass R, Raftery A: Bayes factors. J Am Stat Assoc. 1995, 90 (430): 773795. 10.1080/01621459.1995.10476572.
 31.
Ripley B: Spatial Statistics, Volume 24. 1981, New York, NY, USA: Wiley Online Library
 32.
Wilk M, Gnanadesikan R: Probability plotting methods for the analysis for the analysis of data. Biometrika. 1968, 55: 117.
 33.
Govindarajan A, Israely I, Huang SY, Tonegawa S: The dendritic branch is the preferred integrative unit for protein synthesisdependent LTP. Neuron. 2011, 69: 132146. 10.1016/j.neuron.2010.12.008.
 34.
Harvey CD, Svoboda K: Locally dynamic synaptic learning rules in pyramidal neuron dendrites. Nature. 2007, 450 (7173): 11951200. 10.1038/nature06416.
 35.
Ang Q, Baddeley A, Nair G: Geometrically corrected second order analysis of events on a linear network, with applications to ecology and criminology. Scand J Stat. 2011, 39: 591617.
 36.
Diggle PJ: Statistical Analysis of Spatial Point Patterns. 2003, New York: Oxford University Press Inc.
 37.
Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodological). 1995, 57 (1): 289300.
 38.
Storey J: A direct approach to false discovery rates. J R Stat Soc: Ser B (Statistical Methodology). 2002, 64 (3): 479498. 10.1111/14679868.00346.
 39.
Dabney A, Storey JD, Warnes GR: qvalue: Qvalue Estimation for False Discovery Rate Control. http://bioconductor.org/packages/2.12/bioc/html/qvalue.html [R package version 1.28.0].
 40.
Kvilekval K, Fedorov D, Obara B, Singh A, Manjunath B: Bisque: a platform for bioimage analysis and management. Bioinformatics. 2010, 26 (4): 544552. 10.1093/bioinformatics/btp699. http://vision.ece.ucsb.edu/publications/kvilekval_Bioinformatics_2010.pdf,
 41.
Ascoli G, Donohue D, Halavi M: NeuroMorpho. Org: a central resource for neuronal morphologies. J Neurosci. 2007, 27 (35): 92479251. 10.1523/JNEUROSCI.205507.2007.
 42.
Sholl D: Dendritic organization in the neurons of the visual and motor cortices of the cat. J Anat. 1953, 87 (Pt 4): 387
 43.
Baddeley A, Turner R: Practical maximum pseudolikelihood for spatial point patterns. Aust N Z J Stat. 2000, 42 (3): 283322. 10.1111/1467842X.00128.
 44.
Diggle P, Gates D, Stibbard A: A nonparametric estimator for pairwiseinteraction point processes. Biometrika. 1987, 74 (4): 763770. 10.1093/biomet/74.4.763.
 45.
Diggle P, Gratton R: Monte Carlo methods of inference for implicit statistical models. J R Stat Soc Ser B (Methodological). 1984, 46 (2): 193227.
 46.
Qiang L, Fujita R, Yamashita T, Angulo S, Rhinn H, Rhee D, Doege C, Chau L, Aubry L, Vanti W, et al: Directed conversion of Alzheimer’s disease patient skin fibroblasts into functional neurons. Cell. 2011, 146 (3): 359371. 10.1016/j.cell.2011.07.007.
 47.
Brennand K, Simone A, Jou J, GelboinBurkhart C, Tran N, Sangar S, Li Y, Mu Y, Chen G, Yu D, et al: Modelling schizophrenia using human induced pluripotent stem cells. Nature. 2011, 473 (7346): 221225. 10.1038/nature09915.
 48.
Egawa N, Kitaoka S, Tsukita K, Naitoh M, Takahashi K, Yamamoto T, Adachi F, Kondo T, Okita K, Asaka I, et al: Drug screening for ALS using patientspecific induced pluripotent stem cells. Sci Transl Med. 2012, 4 (145): 145ra104145ra104. 10.1126/scitranslmed.3004052.
 49.
Marchetto M, Gage F: Modeling brain disease in a dish: really?. Cell Stem Cell. 2012, 10 (6): 642645. 10.1016/j.stem.2012.05.008.
Acknowledgements
We would like to acknowledge Sruti Aiyaswamy for her help with the neuron annotation, and the StatLab at UCSB for checking over the statistical analyses included here. This work was supported by the Larry L. Hillblom Foundation, the Tau Consortium and an NSF award III0808772.
Author information
Affiliations
Corresponding authors
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
AJ designed the study, wrote the software to perform the statistical analyses, interpreted the results and drafted the manuscript. SB cultured and collected the data and drafted the section on Cell Imaging. KK conceived of the study and helped to draft the manuscript. BM coordinated the effort between departments and reviewed the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Table of 4way LLM coefficients.
Additional file 1: This table shows the 4way interaction LLM coefficients which are significant at the 0.1% level. Note that only one interaction between type and either branch order or soma distance (highlighted in green) is significant in the entire table. This further proves the result that these interactions are not very important to the overall model of frequencies. (PDF 50 KB)
12859_2013_6161_MOESM2_ESM.pdf
Additional file 2: AIC Stepwise models for 3way LLM. This table shows the results of the AIC stepwise algorithm using an LLM with up to 3way interactions. The models arrived at by this method are shown in the caption above the table. From this table we can see that if we do allow 3rd order interactions, the strongest 3rd order correlation over all experiments is that of DIV, SD and BO, which makes sense because all three of these quantities should intuitively increase together. (PDF 32 KB)
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Jammalamadaka, A., Banerjee, S., Manjunath, B.S. et al. Statistical analysis of dendritic spine distributions in rat hippocampal cultures. BMC Bioinformatics 14, 287 (2013). https://doi.org/10.1186/1471210514287
Received:
Accepted:
Published:
Keywords
 Dendritic spines
 Rat hippocampal culture
 Linear network Kfunction
 Morphological modeling
 Spatial statistics
 Point processes
 Neuronal growth