MosaicIA: an ImageJ/Fiji plugin for spatial pattern and interaction analysis
 Arun Shivanandan^{2, 3},
 Aleksandra Radenovic^{3} and
 Ivo F Sbalzarini^{1, 2}
https://doi.org/10.1186/1471-2105-14-349
© Shivanandan et al.; licensee BioMed Central Ltd. 2013
Received: 29 May 2013
Accepted: 8 November 2013
Published: 3 December 2013
Abstract
Background
Analyzing spatial distributions of objects in images is a fundamental task in many biological studies. The relative arrangement of one set of objects with respect to another contains information about potential interactions between the two sets. If they do not “feel” each other’s presence, their spatial distributions are expected to be independent of one another. Spatial correlations in their distributions are indicative of interactions and can be modeled by an effective interaction potential acting between the points of the two sets. This can be used to generalize colocalization analysis to spatial interaction analysis. However, no user-friendly software for this type of analysis was available so far.
Results
We present an ImageJ/Fiji plugin that implements the complete workflow of spatial pattern and interaction analysis for spot-like objects. The plugin detects objects in images, infers the interaction potential that is most likely to explain the observed pattern, and provides statistical tests for whether an inferred interaction is significant given the number of objects detected in the images and the size of the space within which they can distribute. We benchmark and demonstrate the present software using examples from confocal and PALM single-molecule microscopy.
Conclusions
The present software greatly simplifies spatial interaction analysis for point patterns, and makes it available to the large user community of ImageJ and Fiji. The presented showcases illustrate how to use the software.
Keywords
Spatial pattern analysis, Microscopy, Colocalization analysis, Interaction analysis, PALM, ImageJ, Fiji, Image analysis
Background
We present a software plugin to analyze and quantify spatial patterns of objects in images using the free open-source image-processing platform ImageJ [1] or its distribution Fiji [2]. The spatial arrangement of objects relative to each other is a rich source of phenotypic information. This ranges from spatial patterns of subcellular structures or proteins, to the spatial patterns formed by cells in tissues, to spatial patterns of organisms in ecosystems. The mathematical framework of spatial statistics allows quantifying and analyzing such patterns, comparing them with each other, and performing statistical tests on them [3–5]. This for example allows testing whether the distribution of a set of objects is significantly different from random, or whether the objects in one set are distributed independently of the objects in another set. Significant deviations from spatial randomness are indicative of interactions (of some sort) between the objects, as formalized in the framework of spatial interaction analysis [6].
Biology has long relied on colocalization analysis in order to quantify the spatial distribution of one set of objects with respect to another one. This includes pixel-based and object-based colocalization analysis methods [7]. Pixel-based methods typically use a correlation measure between the pixel intensities in different images in order to quantify the degree of overlap or colocalization between the object distributions represented in the images. Object-based methods first detect and delineate the objects of interest in the images and then quantify their degree of colocalization using an overlap measure. While pixel-based measures are easy to compute, they are difficult to interpret. They are also sensitive to blurring and noise in the image. Moreover, in some observations, like those from PhotoActivated Localization Microscopy (PALM) and STochastic Optical Reconstruction Microscopy (STORM) [8–10], it is not obvious what constitutes an image [11], as these methods provide locations and localization uncertainties of individual molecules. Pixel-based methods are hence not directly applicable, unless one first renders a synthetic image from the observed point locations. This, however, leads to a loss of information as nearby molecule detections fuse into one blob in the rendered image. Object-based methods are more intuitive to interpret, as they directly work with locations of objects. However, they depend on prior object detection and segmentation. It is also not clear when two objects should be considered “overlapping”. This requires defining a distance threshold, which typically is ≈200 nm for diffraction-limited data [12]. Single-molecule imaging, like PALM and STORM, directly provides point locations, rendering a separate object-detection step unnecessary.
While colocalization analysis captures the amount of overlap between objects, it is not sufficient to describe spatial patterns and interactions [6]. Also, colocalization measures do not account for the fact that accidental overlap occurs even in randomly distributed objects and becomes more frequent as the object density increases. This typically leads to a density bias in the final colocalization score.
Spatial interaction analysis
The notion of “interaction” used here is purely geometric. We say that two sets of objects interact if the spatial distribution of one set is not independent of the distribution of the other set (see Figure 1b). The objects do not interact if the distribution of one set can be explained by a random distribution that does not depend on the other, the reference set (see Figure 1c). These interactions or spatial patterns need not imply causal interactions.
The Interaction analysis model
The plugin models the observed distribution of nearest-neighbor (NN) distances d from each point of X to its nearest neighbor in Y as a Gibbs process:

$p(d) = \frac{1}{Z}\, q(d)\, e^{-\phi(d)},$ (1)

where q(d) is the context, i.e., the NN distance distribution that would be observed if X were distributed independently of Y within the same space, Z is the normalization constant (partition function), and ϕ(d) the unknown distance-dependent interaction potential. The role of the potential is to “deform” the context in order to explain the distribution of X with respect to Y as a result of the points in X interacting with their nearest neighbors in Y through the potential ϕ (see Figure 1b,c).
The potential ϕ(d) can be of any shape, including a step function, for which a context-corrected version of the traditional threshold-based colocalization measure is recovered [6]. Often, we use a parametric potential of the form ϕ(d) = ε f((d − t) / σ), where ε is the interaction strength, σ the length scale of the interaction, and t a threshold closer than which objects are considered overlapping. The function f defines the shape of the potential. If the shape f is unknown, a piecewise linear nonparametric function is used. Plots of the different parametric potential shapes provided by the plugin are shown in Figure 1d.
Implementation
The plugin is written in Java. It uses the open-source Java library Weka [13] for efficiently computing NN distances using kd-trees and for kernel density estimation of probability distributions. The plugin further uses CMA-ES (the Covariance Matrix Adaptation Evolution Strategy) [14] for parameter optimization. This optimizer is less prone to getting stuck in local optima than the simplex method used in the original publication [6]. The plugin has been tested with both ImageJ and Fiji on Windows, Mac OS X, and Linux. It should run on any platform where Java and ImageJ are available. Running a complete analysis using the present plugin takes between a few seconds and a few minutes, depending on the number of objects present in the image.
The plugin currently works with point objects only, even though the interaction analysis framework generalizes to extended objects too [6]. In order to work with non-point objects in the present plugin, they can be segmented using other software, e.g. the Region Competition plugin for ImageJ [17], and their centers of mass can be read into the present plugin. Representing extended objects by their centers of mass does not significantly change the result of the interaction analysis, as shown in the PALM example below.
The type of the potential function can be selected from a drop-down menu. Both parametric and nonparametric functions are provided. The plugin then estimates the potential or its parameters from the data by minimizing the ℓ^{2} difference $\|\widehat{p}(d) - p(d)\|_2$ between the observed NN distance distribution $\widehat{p}(d)$ and the one predicted from the interaction model, p(d). The results, including the residual fitting error, are then shown along with a plot of $\widehat{p}(d)$, p(d), and q(d), as shown in Figure 2b,c.
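The fitting step can be sketched as follows. This is an illustrative Python sketch, not the plugin's Java implementation: a brute-force grid search stands in for the CMA-ES optimizer, the type-1 linear potential shape is chosen arbitrarily, and the distributions are represented as discrete samples on a distance grid.

```python
import numpy as np

def phi(d, eps, sigma):
    """Parametric potential phi(d) = eps * f(d / sigma) using the
    type-1 linear shape f(r) = 1 - r for r <= 1, 0 otherwise (t = 0)."""
    r = d / sigma
    return eps * np.where(r > 1, 0.0, 1.0 - r)

def predicted_p(d, q, eps, sigma):
    """NN distance distribution predicted by the Gibbs interaction model:
    p(d) proportional to q(d) * exp(-phi(d)), normalized over the grid."""
    w = q * np.exp(-phi(d, eps, sigma))
    return w / w.sum()

def fit_potential(d, q, p_obs, eps_grid, sigma_grid):
    """Minimize the l2 difference ||p_obs - p||_2 over a parameter grid
    (the plugin uses CMA-ES instead of this exhaustive search)."""
    best, best_err = None, np.inf
    for eps in eps_grid:
        for sigma in sigma_grid:
            err = np.linalg.norm(predicted_p(d, q, eps, sigma) - p_obs)
            if err < best_err:
                best, best_err = (eps, sigma), err
    return best, best_err
```

With a uniform context and synthetic data generated at known parameters, the search recovers those parameters exactly whenever they lie on the search grid.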
The plugin also provides hypothesis tests to check whether the estimated interaction is statistically significant. The test statistic is computed from K Monte Carlo samples of point distributions corresponding to the null hypothesis of “no interaction”. The test statistic from the actually observed distribution is ranked against these K random samples. If it ranks higher than the ⌈(1 − α)K⌉-th sample, the null hypothesis of “no interaction” is rejected at significance level α [6].
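The ranking logic of this Monte Carlo test can be written out in a few lines. This is a minimal Python sketch; how the test statistic is computed and how the null-model point distributions are sampled are omitted here.

```python
import math

def mc_rank_test(t_obs, null_stats, alpha=0.05):
    """Monte Carlo rank test: rank the observed test statistic t_obs
    among K samples drawn under the null hypothesis of 'no interaction'.
    The null is rejected at level alpha if t_obs ranks higher than the
    ceil((1 - alpha) * K)-th null sample."""
    K = len(null_stats)
    rank = sum(1 for t in null_stats if t < t_obs)  # rank of t_obs among the K samples
    return rank > math.ceil((1 - alpha) * K)
```

For example, with K = 1000 and α = 0.05, an observed statistic exceeding all null samples (rank 1000 > 950) leads to rejection, while a statistic near the null median (e.g. rank 512 of 1000, as in the randomized control below) does not.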
List of parameter inputs to the plugin
The plugin has six parameters that the user can set to control its behavior. These parameters and their typical values are described below.
Parameters for object detection
The following parameters control the imageprocessing part of the plugin. The algorithm used to detect bright objects (blobs) in the images and extract their location is described in Ref. [15]. See this reference for a more indepth discussion of these parameters and for illustrations of how they affect object detection.

Radius: Approximate radius of the blobs in the images in units of pixels. This value should be slightly larger than the visible object radius, but smaller than the smallest NN object separation.

Cutoff: The score cutoff for false-positive rejection. The larger the cutoff, the more conservative the algorithm is in selecting only objects that look alike.

Percentile: Determines how bright a blob has to be in order to be considered an object. All local intensity maxima in the given upper percentile of the image intensity histogram are considered candidate objects.
Parameters for computing distance distributions
Once the objects have been detected in both images, or their coordinates have been read from files, the following parameters can be used to control interaction inference:

Grid spacing: The grid spacing controls how finely the context q(d) is sampled, in units of pixels. It should ideally be less than half of the shortest interaction length that can be detected with the available data. For an image without subpixel object detection, 0.5 (pixels) is hence sufficient. In cases where finer resolution is needed, the user can try successively smaller values until the context q(d) does not change any more. Grid-sampling the context q(d) is the most time-consuming part of the analysis. Adjusting the grid spacing hence significantly influences the computational time.

Kernel wt(q): This is the weight parameter used by the kernel density estimator to estimate the smooth context p.d.f. q(d) from the grid samples. Since the number of grid points is usually large, a small kernel weight of 0.001 should be sufficient to produce smooth results. This parameter usually does not need to be changed.

Kernel wt(p): This is the weight parameter used by the kernel density estimator to estimate the smooth NN distance distribution $\widehat{p}(d)$. The value of this parameter is critical, and a rough estimate for it is computed using Silverman’s rule [16]. The resulting value is shown as a suggestion. This parameter should be carefully tuned so that the resulting distribution contains all relevant information from the histogram, without overfitting it. A larger value for this parameter leads to a more fine-grained, less smooth fit.
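The suggested value mentioned above is based on Silverman’s rule of thumb [16]. As an illustration, the rule for a kernel bandwidth can be sketched in Python as follows; how the plugin maps such a bandwidth to its weight parameter is not shown here.

```python
import statistics

def silverman_bandwidth(samples):
    """Silverman's rule-of-thumb bandwidth for kernel density estimation:
    h = 0.9 * min(sd, IQR / 1.34) * n**(-1/5),
    where sd is the sample standard deviation and IQR the interquartile range."""
    n = len(samples)
    sd = statistics.stdev(samples)
    q1, _, q3 = statistics.quantiles(samples, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-0.2)
```

The rule scales with the data: doubling all distances doubles the suggested bandwidth, which is why the suggestion is only a starting point that should be tuned as described above.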
List of potentials provided
The plugin provides both parametric and nonparametric potentials that can be used to describe an interaction. The nonparametric potential is more flexible and does not require the user to assume anything about the functional shape of the interaction. However, it requires more computer time to be estimated and does not support statistical tests for the significance of an interaction. Parametric potentials offer an intuitive interpretation of an interaction by its strength and length scale. Frequently, one first estimates a nonparametric potential in order to get an idea of the rough shape of the interaction. Then, one selects the parametric potential most similar to it and repeats the estimation.
Parametric potentials
Potentials are parameterized as ϕ(d) = ε f((d − t) / σ) with interaction strength ε, length scale σ, and hard-core distance t. For the step potential, σ = 1. For all other potentials, t = 0. The shapes f(·) of the various potentials are:

Step potential: $f^{\text{st}}(r)=\begin{cases}1 & \text{if } r<0\\ 0 & \text{else}\end{cases}$ (2)

Hernquist potential: $f^{\text{he}}(r)=\begin{cases}(r+1)^{-1} & \text{if } r>0\\ 1-r & \text{else}\end{cases}$ (3)

Linear potential, type 1: $f^{\text{l1}}(r)=\begin{cases}0 & \text{if } r>1\\ 1-r & \text{else}\end{cases}$ (4)

Linear potential, type 2: $f^{\text{l2}}(r)=\begin{cases}0 & \text{if } r>1\\ 1 & \text{if } r<0\\ 1-r & \text{else}\end{cases}$ (5)

Plummer potential: $f^{\text{pl}}(r)=\begin{cases}(r^{2}+1)^{-1/2} & \text{if } r>0\\ 1 & \text{else}\end{cases}$ (6)
Plots of these potentials are shown in Figure 1d. Other potentials can easily be implemented, if needed.
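For reference, the five shapes and the parameterization above can be transcribed directly. This is an illustrative Python transcription of Eqs. (2)–(6); the plugin itself implements these shapes in Java.

```python
def f_step(r):
    return 1.0 if r < 0 else 0.0                      # Eq. (2)

def f_hernquist(r):
    return 1.0 / (r + 1.0) if r > 0 else 1.0 - r      # Eq. (3)

def f_linear1(r):
    return 0.0 if r > 1 else 1.0 - r                  # Eq. (4)

def f_linear2(r):
    if r > 1:
        return 0.0
    return 1.0 if r < 0 else 1.0 - r                  # Eq. (5)

def f_plummer(r):
    return (r * r + 1.0) ** -0.5 if r > 0 else 1.0    # Eq. (6)

def potential(d, f, eps, sigma=1.0, t=0.0):
    """phi(d) = eps * f((d - t) / sigma); sigma = 1 for the step
    potential and t = 0 for all other potentials."""
    return eps * f((d - t) / sigma)
```

Note that all shapes are continuous at r = 0 except the step, and all decay to zero for large r, so ε alone sets the depth of the potential.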
Nonparametric potential
The nonparametric potential does not assume any specific shape and can be used to gain an approximate idea of the shape of an unknown interaction. It is defined as a weighted sum of linear kernel functions centered on P support points (defined in the #support pts field). The more support points are used, the finer the potential is resolved, but the more costly and unstable the estimation becomes. The smoothness of the estimated potential is controlled by the smoothness parameter, which penalizes differences between adjacent support points. Larger smoothness parameters lead to smoother potentials, but may miss or average out interesting interactions. Therefore, this parameter should be used with caution.
Working with coordinates instead of images
It is possible to directly use MosaicIA on localization data. This is useful when working with imaging modalities like PALM and STORM that provide point coordinates rather than images. It is also useful when working with objects that are not blob-like, or for which the object-detection step of the plugin does not work well. These objects can be detected and segmented using any other tool, e.g. the Region Competition [17] or the split-Bregman/Squassh [18] plugins for ImageJ/Fiji, and their coordinates stored in a file. A comma-separated text file of object coordinates can be read into MosaicIA by clicking load coordinates instead of load images. Each line in the file should contain the coordinates of one object in the format x, y, (z). The spatial boundaries of the point patterns (they must be identical for both X and Y) are entered in the fields provided. For objects detected in a 400×400 pixel image, the boundaries are (0, 399) in both directions.
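A reader for this file format fits in a few lines. This is a Python sketch of the described format only; the plugin's own Java parser may differ in details such as whitespace or header handling.

```python
import csv

def read_coordinates(fobj):
    """Parse one object per line in the format 'x, y' or 'x, y, z'
    into a list of float coordinate tuples."""
    return [tuple(float(v) for v in row)
            for row in csv.reader(fobj) if row]
```

Mixed 2D and 3D lines are returned as 2- and 3-tuples respectively; a real loader would additionally check that all lines have the same dimensionality.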
Interpreting the results
The estimated interaction potentials and parameters can be used to quantitatively compare spatial distributions across different samples and conditions (e.g., perturbations). Comparisons based on interaction strengths and length scales, however, should only be done for results obtained with the same potential shape.
The strength of an interaction, ε, is equal to zero for independent, i.e. noninteracting, point patterns. However, due to noise and random overlap in the data, the strength may be slightly greater than zero even in the case of no interaction. A hypothesis test is therefore provided in order to check whether an estimated interaction is statistically significant given the amount and quality of the data used to infer it.
The hard core of an interaction, t, is akin to the distance threshold in classical objectbased colocalization analysis. If two objects are closer to each other than this hard core, they are considered overlapping.
The length scale of an interaction, σ, quantifies the units of length in which the potential is scaled. It hence provides information about the length scale of organization between the two point patterns. The unit of length is pixels if the objects are detected from images. If coordinates are read from a file, the unit of length is as defined in that file.
Results and discussion
We validated and tested the plugin on synthetically generated point distributions in the presence and absence of interactions and confirmed that interactions were correctly detected (results not shown). We show here the application of the plugin to two real-world cases: interactions between viruses and endosomes in HER911 cells, as inferred from fluorescence confocal microscopy images, and clathrin-GPCR interactions, as inferred from PALM super-resolution data in HeLa cells.
Application to virus-endosome interaction from confocal images
We apply the plugin to analyze the interactions between human adenoviruses of serotype 2 (Ad2), stained with ATTO647, and Rab5-EGFP-stained endosomes in HER911 cells (image data: Greber lab, University of Zurich). Similar data has also been used in Ref. [6].
Application to GPCR-clathrin interaction from PALM data
Super-resolution microscopy techniques such as PALM and STORM do not provide images, but produce point clouds by measuring the coordinates of individual fluorophores and their localization uncertainties. Directly working with these position data provides more information than first rendering an artificial image from them and then working with that image [11].
G-protein-coupled receptors (GPCRs) are important signaling proteins that are transported in clathrin-coated vesicles. The sizes of these vesicles are typically below the resolution limit of classical microscopy, rendering them a good system to be studied with super-resolution techniques like PALM and STORM.
We can either analyze the interactions between the individual molecules in one color channel with those in the other channel, or we may exploit the biological knowledge that the actual interaction acts at the organelle level rather than the molecular level. For the latter, we hence analyze the interactions between the centers of the detected clusters across the two color channels. For the sake of simplicity, we do not explicitly model localization uncertainty, registration errors, and limited detection efficiency.
The results corresponding to the parametric potential that provided the best fit (in this case an L1 linear potential) are shown in Figure 5. Figures 5c,d show the results of the analysis based on cluster centers, Figures 5e,f those based on individual molecules. In both cases, the residual fitting error is 5-fold lower than when using a step potential. Figures 5g,h show controls obtained by randomly scrambling the cluster-center locations. We see that: (1) the plugin correctly infers an interaction (estimated ε=1.85) in the data with cluster centers, but detects no interaction in the randomized control (ε=0.6; the null hypothesis of no interaction has rank 512 of 1000 and cannot be rejected). (2) The results when applying the analysis to individual molecules or to cluster centers are similar, with the former estimating ε=2.0, indicating that the analysis is robust against clustering effects and correctly identifies the length scale of the interaction. A randomization control based on individual molecules yields ε=0.3 (data not shown), and the null hypothesis of no interaction cannot be rejected (rank 202 of 1000). This indicates that analyzing interactions between organelles by considering their cluster centers, rather than individual molecules, may be sufficient to distinguish an interacting pattern from a non-interacting one. Analysis with other potential shapes provided similar results.
Conclusions
We presented MosaicIA, an ImageJ/Fiji plugin for spatial point pattern and interaction analysis. The plugin takes as an input two 2D or 3D images showing the spatial distributions of two sets of bright, spot-like objects, or it reads the coordinates of two sets of objects from files. It then uses a nearest-neighbor Gibbs interaction model from spatial statistics in order to infer the pairwise interaction potential that is most likely to create the observed distribution of objects. Compared to classical pixel-based or object-based colocalization analysis, this makes better use of the information present in the image and hence provides superior statistical detection power [6]. The analysis also accounts for the context created by the shape of the space within which the objects are distributed and by the object distribution within the reference set. Estimating interaction models is more robust against imaging noise and image-processing errors than classical colocalization analysis [20]. Statistical tests are provided by the plugin in order to check the significance of an interaction. The estimated interaction parameters provide a quantitative way of comparing spatial patterns across different samples, conditions, and perturbations.
The plugin has been tested on both synthetic and real-world data. We demonstrated its application to virus trafficking data obtained with confocal fluorescence microscopy and to PALM super-resolution data of the interaction between clathrin and β2AR. In the latter case, we compared the results from applying the analysis to individual points with the results obtained from cluster centers. In all tested cases, the best interaction potential explained the data 4- to 5-fold better than a step potential, i.e., than context-corrected object-based colocalization analysis. Without context correction, the fits would be even worse.
Despite the fact that we have here only demonstrated the plugin for fluorescence microscopy and PALM data, it can be applied to distributions of any type of objects (organelles, cells, organisms), as long as their positions can be extracted from images or read from files. For example, in the case of cells in a tissue, it can be applied with the help of segmentation methods that can provide the spatial location that best fits a cell [21]. Comparing the model fits obtained with different parametric potentials can also be used to test and compare hypothetical interaction mechanisms directly on the data.
Future work includes extending the plugin to also handle extended objects that are not blob-like, and to explicitly account for localization uncertainties and registration (aberration) errors. Other useful extensions could be testing for spatial randomness within a single set of objects from the intra-set NN distance histogram, and automatic estimation of algorithm parameters from the data.
Availability and requirements
The present plugin is available as part of the MOSAICsuite for ImageJ and Fiji.
Project name: MosaicIA
Project home page: http://mosaic.mpicbg.de/?q=downloads/imageJ
Fiji update site: http://mosaic.mpicbg.de/Downloads/update/Fiji/MosaicToolsuite
Operating system(s): Linux, MacOS X, Windows
Programming language: Java
Other requirements: ImageJ or Fiji
License: GNU LGPL v3. Please cite this paper as well as Ref. [6] in any publications that use this software.
Any restrictions to use by non-academics: license needed
Declarations
Acknowledgements
We thank all members of the MOSAIC Group and the Radenovic lab for many fruitful discussions. In particular, we thank Grégory Paul and Jo Helmuth for sharing their expertise on spatial point statistics, and Janick Cardinale and Pietro Incardona for help with the software implementation. We thank Urs Greber (University of Zurich) for kindly providing the virus/endosome demo data and Paolo Annibale for kindly providing the PALM data and for discussions. AS was funded by an NCCBI PhD fellowship from the Swiss National Competence Center for Biomedical Imaging, jointly awarded to AR and IFS.
References
1. Abramoff MD, Magelhaes PJ, Ram SJ: Image processing with ImageJ. Biophotonics Int. 2004, 11(7): 36–42.
2. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, Tinevez JY, White DJ, Hartenstein V, Eliceiri K, Tomancak P, Cardona A: Fiji: an open-source platform for biological-image analysis. Nat Methods. 2012, 9(7): 676–682. doi:10.1038/nmeth.2019
3. Mecke KR, Stoyan D: Statistical Physics and Spatial Statistics: The Art of Analyzing and Modeling Spatial Structures and Pattern Formation. 2000, Berlin, Heidelberg: Springer-Verlag
4. Diggle PJ: Statistical Analysis of Spatial Point Patterns. 2003, London: Arnold
5. Moller J, Waagepetersen RP: Statistical Inference and Simulation for Spatial Point Processes. 2003, Boca Raton, FL: Chapman & Hall/CRC
6. Helmuth J, Paul G, Sbalzarini I: Beyond colocalization: inferring spatial interactions between subcellular structures from microscopy images. BMC Bioinformatics. 2010, 11: 372. doi:10.1186/1471-2105-11-372
7. Bolte S, Cordelières FP: A guided tour into subcellular colocalization analysis in light microscopy. J Microsc. 2006, 224(3): 213–232. doi:10.1111/j.1365-2818.2006.01706.x
8. Betzig E, Patterson GH, Sougrat R, Lindwasser OW, Olenych S, Bonifacino JS, Davidson MW, Lippincott-Schwartz J, Hess HF: Imaging intracellular fluorescent proteins at nanometer resolution. Science. 2006, 313(5793): 1642–1645. doi:10.1126/science.1127344
9. Rust MJ, Bates M, Zhuang X: Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat Methods. 2006, 3(10): 793–796. doi:10.1038/nmeth929
10. Hess ST, Girirajan TP, Mason MD: Ultra-high resolution imaging by fluorescence photoactivation localization microscopy. Biophys J. 2006, 91(11): 4258–4272. doi:10.1529/biophysj.106.091116
11. Baddeley D, Cannell MB, Soeller C: Visualization of localization microscopy data. Microsc Microanal. 2010, 16(1): 64–72. doi:10.1017/S143192760999122X
12. Lachmanovich E, Shvartsman DE, Malka Y, Botvin C, Henis YI, Weiss AM: Co-localization analysis of complex formation among membrane proteins by computerized fluorescence microscopy: application to immunofluorescence co-patching studies. J Microsc. 2003, 212(2): 122–131. doi:10.1046/j.1365-2818.2003.01239.x
13. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH: The WEKA data mining software: an update. SIGKDD Explor Newsl. 2009, 11: 10–18. doi:10.1145/1656274.1656278
14. Hansen N, Ostermeier A: Completely derandomized self-adaptation in evolution strategies. Evol Comput. 2001, 9(2): 159–195. doi:10.1162/106365601750190398
15. Sbalzarini IF, Koumoutsakos P: Feature point tracking and trajectory analysis for video imaging in cell biology. J Struct Biol. 2005, 151(2): 182–195. doi:10.1016/j.jsb.2005.06.002
16. Silverman BW, Green PJ: Density Estimation for Statistics and Data Analysis. 1986, London: Chapman and Hall
17. Cardinale J, Paul G, Sbalzarini IF: Discrete region competition for unknown numbers of connected regions. IEEE Trans Image Process. 2012, 21(8): 3531–3545.
18. Paul G, Cardinale J, Sbalzarini IF: Coupling image restoration and segmentation: a generalized linear model/Bregman perspective. Int J Comput Vis. 2013, 104(1): 69–93. doi:10.1007/s11263-013-0615-2
19. Annibale P, Scarselli M, Greco M, Radenovic A: Identification of the factors affecting co-localization precision for quantitative multicolor localization microscopy. Opt Nanoscopy. 2012, 1(9): 1–13.
20. Helmuth JA: Computational methods for analyzing and simulating intracellular transport processes. PhD thesis, Diss. ETH No. 19190, ETH Zürich; 2010.
21. Qu L, Long F, Liu X, Kim S, Myers E, Peng H: Simultaneous recognition and segmentation of cells: application in C. elegans. Bioinformatics. 2011, 27(20): 2895–2902. doi:10.1093/bioinformatics/btr480
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.