SMoLR: visualization and analysis of single-molecule localization microscopy data in R
BMC Bioinformatics volume 20, Article number: 30 (2019)
Single-molecule localization microscopy is a super-resolution microscopy technique that allows for nanoscale determination of the localization and organization of proteins in biological samples. For biological interpretation of the data it is essential to extract quantitative information from the super-resolution data sets. Due to the complexity and size of these data sets, flexible and user-friendly software is required.
We developed SMoLR (Single Molecule Localization in R): a flexible framework that enables exploration and analysis of single-molecule localization data within the R programming environment. SMoLR is a package aimed at extracting, visualizing and analyzing quantitative information from localization data obtained by single-molecule microscopy. SMoLR not only visualizes nanoscale subcellular structures but also provides means to obtain statistical information about the distribution and localization of molecules within them. This can be done for individual images, or SMoLR can be used to analyze a large set of super-resolution images at once. Additionally, we describe a method using SMoLR for image feature-based particle averaging, resulting in identification of common features among nanoscale structures.
Embedded in the extensive R programming environment, SMoLR allows scientists to study the nanoscale organization of biomolecules in cells by extracting and visualizing quantitative information, and hence provides insight into a wide variety of biological processes at the single-molecule level.
The revolutionary advancements in super-resolution microscopy techniques make it possible to study subcellular structures at the nanoscale using fluorescence microscopy. Single-molecule localization microscopy (SMLM) provides the highest spatial resolution that can be achieved with light microscopy today, with a lateral resolution between 10 and 20 nm [1, 2]. SMLM relies on detecting single fluorescent emitters by separating spatially overlapping signals in time. By detecting individual fluorescent molecules in densely labelled biological samples and determining their positions with high precision, images can be reconstructed with a resolution an order of magnitude below the diffraction limit of the light microscope.
In many biological samples a multitude of macromolecular assemblies and protein complexes can be observed within one cell, such as DNA double strand break (DSB) foci [3, 4], nuclear pores, focal adhesions, virus particles or neuronal spines. Super-resolution microscopy is well suited to study these assemblies, since the increased resolution permits investigation, at the single-molecule level, of the internal composition and protein distribution of these nanoscale assemblies, which have typical diameters ranging from 100 nm up to 2 μm.
In contrast to regular microscopy data, which consist of intensity values in a digital image format, SMLM data typically consist of Cartesian coordinates with corresponding localization precision. Therefore, regular image analysis tools do not directly apply to SMLM data. Numerous software packages for detection and localization of single molecules from single-molecule localization data are available (reviewed and benchmarked elsewhere) that allow reliable image reconstruction for SMLM. Additionally, tools have been developed which allow more in-depth (3D) visualization of the localization data (PALMsiever, ViSP, PYME), clustering (SR-Tesseler, 3DClusterViSu) and extraction of quantitative information (SharpViSu, LAMA and Grafeo) (Table 1).
Here, we present a versatile software package named SMoLR (Single Molecule Localization in R) that enables researchers to analyze large sets of single-molecule localization data in a quantitative way. The pointillist nature of the data opens possibilities for alternative types of analysis, for which the resourceful R programming language can be of great value. SMoLR complements existing software with a package for analyzing large sets of localization data at once in the free, open-source R environment.
SMLM data consist of Cartesian coordinates of molecules and their respective precision, along with any extra information that is desired in a specific experiment (e.g. time or frame of detection, channel, estimated number of photons detected). The localization data together with these additional parameters can be imported into SMoLR in the formats produced by different single-molecule localization software: ThunderSTORM, Zeiss ZEN software, SOSplugin or plain text (Fig. 1). SMoLR is versatile and can be used in different ways. One particularly useful approach is to define Regions of Interest (ROIs) in the super-resolution images in order to analyze the organization of proteins in subcellular structures. Subsequently applying a single analysis to each ROI results in quantitative information describing the distribution of proteins in a large number of structures.
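The ROI-based workflow described above can be illustrated with a minimal sketch. It is written in plain Python and is independent of SMoLR: the column names, the rectangular ROI definitions and the helper functions `read_localizations`, `split_by_roi` and `summarize` are assumptions made for this example only, not a SMoLR file format or API.

```python
import csv
import io

# Hypothetical plain-text export: coordinates and precision in nm.
# The column names are assumptions for this sketch, not a SMoLR format.
RAW = """x,y,precision,channel
100.0,120.0,12.5,1
105.0,118.0,10.0,1
900.0,880.0,15.0,2
910.0,905.0,11.0,1
2000.0,50.0,20.0,2
"""

# Hypothetical rectangular ROIs as (xmin, xmax, ymin, ymax) in nm.
ROIS = {
    "roi_1": (0.0, 500.0, 0.0, 500.0),
    "roi_2": (800.0, 1000.0, 800.0, 1000.0),
}


def read_localizations(text):
    """Parse the table into a list of dicts with numeric fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "x": float(rec["x"]),
            "y": float(rec["y"]),
            "precision": float(rec["precision"]),
            "channel": int(rec["channel"]),
        })
    return rows


def split_by_roi(locs, rois):
    """Return {roi_name: [localizations inside that rectangle]}."""
    out = {name: [] for name in rois}
    for loc in locs:
        for name, (xmin, xmax, ymin, ymax) in rois.items():
            if xmin <= loc["x"] <= xmax and ymin <= loc["y"] <= ymax:
                out[name].append(loc)
    return out


def summarize(locs):
    """One analysis that is applied identically to every ROI."""
    n = len(locs)
    mean_prec = sum(l["precision"] for l in locs) / n if n else float("nan")
    return {"n": n, "mean_precision": mean_prec}


locs = read_localizations(RAW)
per_roi = split_by_roi(locs, ROIS)
summary = {name: summarize(sel) for name, sel in per_roi.items()}
```

Storing the localizations per ROI in one container and mapping a single function over it mirrors the idea of analyzing many structures with one command; the localization at (2000, 50) falls outside both rectangles and is simply not assigned to any ROI.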
ROIs can be selected either manually or automatically in image analysis software such as ImageJ, after which the localization data of these ROIs can be imported into SMoLR (Fig. 1). Alternatively, ROIs can be selected automatically using the localization clustering functions in SMoLR. The localization data within each ROI are selected and stored together in a list, with one entry per ROI. These objects can subsequently be analyzed by SMoLR at once, using single commands. To visually inspect the ROI data, we provide an interactive application which shows the ROIs in the full super-resolution image together with several statistical parameters (Additional file 1: Figure S1).
SMLM data can be visualized in many ways. The most frequently used method is to plot a Gaussian distribution for each localization, with a standard deviation corresponding to its localization precision (Fig. 2a). However, with this method intensity values do not depend only on the density of localizations, but also on the localization precision. As an alternative approach we implemented a 2D kernel density estimation (KDE) method, in which the density of detections per area is normalized to the total number of localizations in the image (Fig. 2b). This method is therefore quantitative, which makes it possible to threshold the data at a given density of localizations per pixel. A third visualization method implemented in SMoLR is an adapted scatter plot that depicts the Cartesian coordinates and can display additional data through the size and color of the plotted points (Fig. 2c). This type of visualization can be used to easily assess the quality of the data and detect potential artefacts such as drift during image acquisition or incorrect grouping. Additionally, we provide a function that formats the single-molecule data in such a way that it can be used in the spatial point pattern analysis R package spatstat. This opens up the possibility to also include spatstat’s wide range of visualization and clustering options in the analysis.
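The difference between the two image-based renderings can be made concrete with a small sketch in plain Python, independent of SMoLR. It assumes a Gaussian kernel with one fixed bandwidth (10 nm here) for the KDE; the bandwidth SMoLR actually uses is not stated above, and the function names `gauss2d` and `render` are hypothetical.

```python
import math


def gauss2d(dx, dy, sigma):
    """Value of a unit-mass isotropic 2D Gaussian at offset (dx, dy)."""
    norm = 2.0 * math.pi * sigma * sigma
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) / norm


def render(locs, size, pixel, sigma_of, normalize):
    """Sum one Gaussian per localization on a size x size pixel grid.

    locs      : list of (x, y, precision) tuples in nm
    sigma_of  : function returning the kernel width (nm) for a localization
    normalize : if True, divide by the number of localizations (KDE-style)
    """
    img = [[0.0] * size for _ in range(size)]
    for loc in locs:
        sigma = sigma_of(loc)
        for row in range(size):
            for col in range(size):
                cx = (col + 0.5) * pixel  # pixel centre in nm
                cy = (row + 0.5) * pixel
                img[row][col] += gauss2d(cx - loc[0], cy - loc[1], sigma)
    if normalize and locs:
        n = len(locs)
        img = [[v / n for v in r] for r in img]
    return img


# Two localizations with very different precision (5 nm and 20 nm).
locs = [(25.0, 25.0, 5.0), (75.0, 75.0, 20.0)]

# Gaussian rendering: kernel width equals each localization's precision.
gauss_img = render(locs, size=10, pixel=10.0,
                   sigma_of=lambda loc: loc[2], normalize=False)

# KDE: fixed bandwidth (assumed 10 nm), normalized to the number of points.
kde_img = render(locs, size=10, pixel=10.0,
                 sigma_of=lambda loc: 10.0, normalize=True)

# Integrated density of the KDE image (pixel area is 10 nm x 10 nm).
kde_mass = sum(sum(r) for r in kde_img) * 10.0 * 10.0
```

In the Gaussian rendering the precisely localized molecule produces a much brighter peak than the poorly localized one, although each represents a single detection. In the KDE image both peaks have the same height and the image integrates to roughly one, so a threshold can be expressed as a density of localizations.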
Clustering of SMLM data is comparable to object segmentation in conventional image analysis. Similar to the analysis of objects from segmented images, features can be extracted from the clustered objects to describe the shape and spatial organization within the object. For SMLM data several different approaches for clustering have been proposed in the literature. Some of these algorithms give a global description of the amount of observed clustering, such as Ripley’s K and its derivatives, or the recently introduced nonparametric descriptor J0(r) for clustering density. As previously mentioned, the R package spatstat, which can be used from within SMoLR, offers several of these clustering and correlation methods (Ripley’s K function, linearized L-function and pair-correlation functions). However, in general, identification of individual clusters is preferred because this allows analysis of the size, shape and spatial distribution of the clusters. In SMoLR, multiple clustering algorithms are available. First, a clustering method based on the binary KDE image can be used to quantify the number of clusters in an image or region of interest (Fig. 2d). We incorporated functions from the EBImage package to calculate image features, such as shape and size, from single clusters. These features together with descriptive statistics (number of localizations, mean position, mean precision, etc.) can be used to categorize individual clusters. Second, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is integrated in SMoLR (Fig. 2e) [26, 27]. This frequently used algorithm clusters the data based on the localization coordinates only. From the defined clusters of localizations, statistics can be calculated such as the cluster area, convex hull and elongation. The earlier mentioned interactive application (Additional file 1: Fig. S1) also makes it possible at this point to manually assess the features (obtained with KDE or DBSCAN clustering) within a data set.
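For readers unfamiliar with DBSCAN, the following is a minimal pure-Python sketch of the textbook algorithm applied to 2D coordinates. It is not the implementation used by SMoLR; the function names, the brute-force neighbour search and the parameter values (`eps`, `min_pts`) are assumptions chosen for illustration.

```python
import math

NOISE = 0  # label used for localizations that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0 for noise, 1..k for clusters."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        cluster += 1                     # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core
            if labels[j] is not None:
                continue                 # already assigned
            labels[j] = cluster
            nb_j = region_query(points, j, eps)
            if len(nb_j) >= min_pts:     # j is a core point: keep expanding
                queue.extend(nb_j)
    return labels


# Two compact groups of five localizations and one isolated localization (nm).
points = [(0, 0), (5, 0), (0, 5), (5, 5), (3, 3),
          (200, 200), (205, 200), (200, 205), (205, 205), (203, 203),
          (100, 100)]
labels = dbscan(points, eps=10.0, min_pts=4)
```

With these settings the two groups receive labels 1 and 2 and the isolated localization is labelled as noise. The brute-force neighbour search is quadratic in the number of localizations, so a spatial index would be needed for realistic data sets.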
Additionally, all parameters can be used for exploration of the data set either manually or using multivariate analysis or machine learning algorithms. Although DBSCAN is able to define clusters and deal with noise, alternative clustering algorithms have been proposed in the literature that work better for certain biological samples. Examples are Voronoi tessellation, Bayesian cluster identification and the use of a Gaussian-mixture model [13, 28, 29, 30]. A comparison of our KDE and DBSCAN implementations with clustering algorithms by Voronoi tessellation [13, 17] and Bayesian statistics can be found in Additional file 2: Figure S2.
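The localization-based cluster statistics mentioned above (convex hull, cluster area and elongation) can be sketched in plain Python as follows. This is an illustration rather than SMoLR code: the area is taken as the area of the convex hull, and elongation is defined here, as an assumption, as the ratio of the standard deviations along the major and minor axes of the point cloud. Other definitions are possible.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula)."""
    n = len(vertices)
    s = 0.0
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def elongation(points):
    """Ratio of the spread along the major axis to that along the minor axis."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    major, minor = half_trace + root, half_trace - root
    return math.sqrt(major / minor) if minor > 0 else float("inf")


# A hypothetical elongated cluster of five localizations (nm).
cluster = [(0, 0), (40, 0), (40, 10), (0, 10), (20, 5)]
hull = convex_hull(cluster)
area = polygon_area(hull)     # hull is the 40 nm x 10 nm rectangle
elong = elongation(cluster)
```

Features of this kind, computed per cluster, form the table of parameters that can then be explored manually or passed to multivariate analysis.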
Merging the localizations from a large number of individual SMLM images of single biological structures, such as the nuclear pore complex, synaptonemal complex or viral particles, has proved to be a powerful tool to reconstruct ultrastructure [5, 31, 32, 33]. However, template-free particle averaging is computationally demanding or requires expensive software. Particle averaging also assumes that the individual structures are identical or at least highly similar. For some biological structures, however, the organization of individual structures may vary considerably, even though they have certain features in common. We therefore implemented an alignment algorithm, described below, that is based on features extracted from the individual images and can be very informative for observing common features of the imaged structures.
Alignment of individual structures can be achieved using features that can be extracted with the SMoLR package (using pixel- or localization-based features). For example, the center of mass of clusters can be used to center the structures. In some cases, the clusters may have specific shapes that make it possible to rotate and overlay the individual ROIs. For example, elongated structures can be aligned using the major axis of the structure. The presence of multiple clusters within individual ROIs that can be distinguished from each other (for instance on the basis of shape, size or distance to the center of mass) provides another possibility to align structures, by rotating the similar clusters towards the same point. The alignments can be averaged or overlaid, and subsequently used to visualize and extract common features from the individual images. This can be used to compare biological structures under different biological conditions or at different time points. Additionally, these alignments can reveal the relative location of different proteins within the structure, when the structures are aligned using one protein as a reference.
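A minimal sketch of feature-based alignment using the center of mass and the major axis is given below, in plain Python and independent of the SMoLR implementation. The function names are hypothetical, and the major axis is estimated from the second moments of the reference localizations, which is one of several possible choices.

```python
import math


def center_of_mass(points):
    """Mean x and y position of a set of localizations."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def major_axis_angle(points):
    """Angle (radians) between the major axis of the point cloud and x."""
    cx, cy = center_of_mass(points)
    sxx = sum((x - cx) ** 2 for x, y in points)
    syy = sum((y - cy) ** 2 for x, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def align(reference, others=()):
    """Centre on the reference and rotate its major axis onto the x-axis.

    The same translation and rotation are applied to every point set in
    `others`, e.g. the localizations of a second protein in the same ROI.
    """
    cx, cy = center_of_mass(reference)
    theta = major_axis_angle(reference)
    c, s = math.cos(-theta), math.sin(-theta)

    def transform(points):
        return [((x - cx) * c - (y - cy) * s,
                 (x - cx) * s + (y - cy) * c) for x, y in points]

    return transform(reference), [transform(o) for o in others]


# Hypothetical ROI: an elongated reference cluster lying along a 45 degree
# line and a single localization of a second protein (coordinates in nm).
reference = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
second = [(25.0, 35.0)]

angle = major_axis_angle(reference)
aligned_ref, (aligned_second,) = align(reference, [second])
```

After alignment the reference cluster lies along the x-axis around the origin, and the second protein keeps its position relative to it. Pooling the aligned localizations of many ROIs then gives the overlay or average. Note that a major axis defines the orientation only up to a rotation of 180 degrees, so an additional feature is needed if the structures are not symmetric.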
The functions in SMoLR are developed for 2D localization data. However, 3D data can be visualized in the SMoLR scatter plot by encoding the z-coordinate in the color or size of the plotted points. In principle the DBSCAN algorithm is not limited to 2D data; however, 3D clustering is not implemented directly in SMoLR.
To show the use of SMoLR to analyze single-molecule localization data, we applied the functions of the SMoLR package to a previously published data set with images of proteins involved in DNA double strand break (DSB) repair. Precise determination of the spatiotemporal localization and organization of these proteins at the sites of damage, and of how these relate to specific and general protein functions, can help to elucidate the mechanisms by which repair of the DSBs takes place. In this example we examined two essential DSB repair proteins, the recombinase RAD51 and the tumor suppressor BRCA2. γ-Irradiated cells were immunostained for RAD51 and BRCA2 and imaged using direct stochastic optical reconstruction microscopy (dSTORM). Single foci were segmented and visualized using the three visualization techniques available in SMoLR (Fig. 2a-c). Subsequent clustering using KDE, DBSCAN and Voronoi tessellation (spatstat) (Fig. 2d-f) allowed for quantitative analysis of multiple foci, including the number of clusters per protein and per focus, and cluster size versus number of localizations (Fig. 2g-h). These analyses can be extended using e.g. cluster shape, co-localization or relative distance between clusters.
In order to gain insight into the relative distribution of RAD51 and BRCA2 in DSBs we averaged their signal after alignment (centered and rotated) based on the elongated shape of the RAD51 clusters (Fig. 2i). This revealed a distinct pattern of protein distributions during DNA repair (explained in more detail in Sánchez et al., 2017).
Visualization and quantitative analysis of the localization of multiple proteins, below the diffraction limit, within macromolecular assemblies or small organelles, under different conditions and at multiple time points, provides the possibility to gain insight into the spatiotemporal organization of protein function during biological processes. In many situations, multiple similar structures are present within a cell and thus within the recorded super-resolution image. By combining the presented methods and workflow for extracting relevant features from the localization data with the powerful statistics available in R, it is possible to explore the variation in structures and to determine common features describing the structures, while at the same time comparing different conditions or proteins. Using feature-based alignment and rotational analysis, the observed structural organizations can be verified, visualized and combined with simulations to gain more insight. Altogether, the workflow presented in our SMoLR package allows researchers to delve deeper into their single-molecule localization data, beyond conventional image analysis.
Availability and requirements
Project name: SMoLR
Project home page: https://github.com/ErasmusOIC/SMoLR
Operating system(s): Platform independent
Programming language: R
Other requirements: R 3.4.0 or higher
Any restrictions to use by non-academics: no.
Abbreviations
DBSCAN: Density-based spatial clustering of applications with noise
DSB: Double strand break
dSTORM: Direct stochastic optical reconstruction microscopy
KDE: Kernel density estimation
ROI: Region of interest
SMLM: Single-molecule localization microscopy
SMoLR: Single-molecule localization in R
References
1. Betzig E, Patterson GH, Sougrat R, Lindwasser OW, Olenych S, Bonifacino JS, et al. Imaging intracellular fluorescent proteins at nanometer resolution. Science. 2006;313:1642–5. https://doi.org/10.1126/science.1127344.
2. Rust MJ, Bates M, Zhuang X. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat Methods. 2006;3:793–5. https://doi.org/10.1038/nmeth929.
3. Reid DA, Keegan S, Leo-Macias A, Watanabe G, Strande NT, Chang HH, et al. Organization and dynamics of the nonhomologous end-joining machinery during DNA double-strand break repair. Proc Natl Acad Sci U S A. 2015;112:E2575–84. https://doi.org/10.1073/pnas.1420115112.
4. Sánchez H, Paul MW, Grosbart M, van Rossum-Fikkert SE, Lebbink JHG, Kanaar R, et al. Architectural plasticity of human BRCA2–RAD51 complexes in DNA break repair. Nucleic Acids Res. 2017;45:4507–18. https://doi.org/10.1093/nar/gkx084.
5. Szymborska A, de Marco A, Daigle N, Cordes VC, Briggs JAG, Ellenberg J. Nuclear pore scaffold structure analyzed by super-resolution microscopy and particle averaging. Science. 2013;341:655–8. https://doi.org/10.1126/science.1240672.
6. Rossier O, Octeau V, Sibarita J-B, Leduc C, Tessier B, Nair D, et al. Integrins β1 and β3 exhibit distinct dynamic nanoscale organizations inside focal adhesions. Nat Cell Biol. 2012;14:1057–67. https://doi.org/10.1038/ncb2588.
7. Laine RF, Albecka A, van de Linde S, Rees EJ, Crump CM, Kaminski CF. Structural analysis of herpes simplex virus by optical super-resolution imaging. Nat Commun. 2015;6:5980. https://doi.org/10.1038/ncomms6980.
8. Dani A, Huang B, Bergan J, Dulac C, Zhuang X. Superresolution imaging of chemical synapses in the brain. Neuron. 2010;68:843–56.
9. Sage D, Kirshner H, Pengo T, Stuurman N, Min J, Manley S, et al. Quantitative evaluation of software packages for single-molecule localization microscopy. Nat Methods. 2015;12. https://doi.org/10.1038/nmeth.3442.
10. Pengo T, Holden SJ, Manley S. PALMsiever: a tool to turn raw data into results for single-molecule localization microscopy. Bioinformatics. 2014;31:797–8. https://doi.org/10.1093/bioinformatics/btu720.
11. El Beheiry M, Dahan M. ViSP: representing single-particle localizations in three dimensions. Nat Methods. 2013;10:689–90. https://doi.org/10.1038/nmeth.2566.
12. Crossman DJ, Hou Y, Jayasinghe I, Baddeley D, Soeller C. Combining confocal and single molecule localisation microscopy: a correlative approach to multi-scale tissue imaging. Methods. 2015;88:98–108. https://doi.org/10.1016/j.ymeth.2015.03.011.
13. Levet F, Hosy E, Kechkar A, Butler C, Beghin A, Choquet D, et al. SR-Tesseler: a method to segment and quantify localization-based super-resolution microscopy data. Nat Methods. 2015;12:1065–71. https://doi.org/10.1038/nmeth.3579.
14. Andronov L, Michalon J, Ouararhni K, Orlov I, Hamiche A, Vonesch J-L, et al. 3DClusterViSu: 3D clustering analysis of super-resolution microscopy data by 3D Voronoi tessellations. Bioinformatics. 2018;34:3004–12. https://doi.org/10.1093/bioinformatics/bty200.
15. Andronov L, Lutz Y, Vonesch JL, Klaholz BP. SharpViSu: integrated analysis and segmentation of super-resolution microscopy data. Bioinformatics. 2016;32:2239–41.
16. Malkusch S, Heilemann M. Extracting quantitative information from single-molecule super-resolution imaging data with LAMA – LocAlization microscopy analyzer. Sci Rep. 2016;6:34486. https://doi.org/10.1038/srep34486.
17. Haas KT, Lee M, Esposito A, Venkitaraman AR. Single-molecule localization microscopy reveals molecular transactions during RAD51 filament assembly at cellular DNA damage sites. Nucleic Acids Res. 2018:1–19. https://doi.org/10.1093/nar/gkx1303.
18. R Core Team. R: A Language and Environment for Statistical Computing. 2017. https://www.r-project.org/.
19. Ovesny M, Křižek P, Borkovec J, Svindrych Z, Hagen GM. ThunderSTORM: a comprehensive ImageJ plugin for PALM and STORM data analysis and super-resolution imaging. Bioinformatics. 2014:1–2. https://doi.org/10.1093/bioinformatics/btu202.
20. Reuter M, Zelensky A, Smal I, Meijering E, van Cappellen WA, de Gruiter HM, et al. BRCA2 diffuses as oligomeric clusters with RAD51 and changes mobility after DNA damage in live cells. J Cell Biol. 2014;207:599–613. https://doi.org/10.1083/jcb.201405014.
21. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, Tinevez J-Y, White DJ, Hartenstein V, Eliceiri K, Tomancak P, Cardona A. Fiji: an open-source platform for biological-image analysis. Nat Methods. 2012;9(7):676–682.
22. Nieuwenhuizen RPJ, Lidke KA, Bates M, Puig DL, Grünwald D, Stallinga S, et al. Measuring image resolution in optical nanoscopy. Nat Methods. 2013;10:557–62. https://doi.org/10.1038/nmeth.2448.
23. Baddeley A, Turner R. spatstat: An R Package for Analyzing Spatial Point Patterns. J Stat Softw. 2005;12. https://doi.org/10.18637/jss.v012.i06.
24. Jiang S, Park S, Challapalli SD, Fei J, Wang Y. Robust nonparametric quantification of clustering density of molecules in single-molecule localization microscopy. PLoS One. 2017;12:1–15.
25. Pau G, Fuchs F, Sklyar O, Boutros M, Huber W. EBImage--an R package for image processing with applications to cellular phenotypes. Bioinformatics. 2010;26:979–81. https://doi.org/10.1093/bioinformatics/btq046.
26. Ester M, Kriegel HP, Sander J, Xu X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Second Int Conf Knowl Discov Data Min. 1996:226–31.
27. Hahsler M. dbscan: Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms. 2015. https://cran.r-project.org/web/packages/dbscan/index.html.
28. Andronov L, Orlov I, Lutz Y, Vonesch J-L, Klaholz BP. ClusterViSu, a method for clustering of protein complexes by Voronoi tessellation in super-resolution microscopy. Sci Rep. 2016;6:24084. https://doi.org/10.1038/srep24084.
29. Rubin-Delanchy P, Burn GL, Griffié J, Williamson DJ, Heard NA, Cope AP, et al. Bayesian cluster identification in single-molecule localization microscopy data. Nat Methods. 2015;12:1072–6.
30. Deschout H, Platzman I, Sage D, Feletti L, Spatz JP, Radenovic A. Investigating focal adhesion substructures by localization microscopy. Biophys J. 2017;113:2508–18. https://doi.org/10.1016/j.bpj.2017.09.032.
31. Van Engelenburg SB, Shtengel G, Sengupta P, Waki K, Jarnik M, Ablan SD, et al. Distribution of ESCRT machinery at HIV assembly sites reveals virus scaffolding of ESCRT subunits. Science. 2014;343:653–6. https://doi.org/10.1126/science.1247786.
32. Schücker K, Holm T, Franke C, Sauer M, Benavente R. Elucidation of synaptonemal complex organization by super-resolution imaging with isotropic resolution. Proc Natl Acad Sci. 2015;112:2029–33. https://doi.org/10.1073/pnas.1414814112.
33. Salas D, Le Gall A, Fiche J-B, Valeri A, Ke Y, Bron P, et al. Angular reconstitution-based 3D reconstructions of nanomolecular structures from superresolution light-microscopy images. Proc Natl Acad Sci. 2017:201704908. https://doi.org/10.1073/pnas.1704908114.
We would like to thank prof. dr. Claire Wyman and dr. Ihor Smal for helpful discussions.
This work has been supported by NWO-CW ECHO 104126 and STW Nanoscopy program.
Availability of data and materials
Software is available online at https://github.com/ErasmusOIC/SMoLR and additional example data at https://github.com/ErasmusOIC/SMoLR_data.
The data sets analyzed, which are described in Sánchez et al., are available from the corresponding author on request.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1:
Figure S1. Interactive application for inspection of SMLM data. (A) Shiny application loaded with the indicated data, run within the R environment on a local server and displayed in a web browser. (B) Feature parameters can be shown in a scatter plot or (C) binned in a histogram. (D) Data points inside the scatter plot or bins in the histogram can be manually selected; the corresponding clusters are then indicated in the image (green is selected), and structures of interest can be enlarged and inspected. (PDF 944 kb)
Additional file 2:
Figure S2. Comparison of cluster algorithms. Four cluster algorithms were compared: KDE and DBSCAN from the SMoLR package, and Voronoi and Bayesian clustering from external packages. (A) A test data set containing 6 circular clusters of 50 localizations (1–6), one cluster of 100 localizations consisting of two overlapping clusters (7) (red dots) and 300 uniformly distributed (incorrect) localizations due to noise. (B-C) KDE, DBSCAN and Bayesian clustering of the test data set using default settings. For Voronoi clustering, the approach described in Haas et al. was used, with an implementation in R (a threshold of two times the median tile area of the Voronoi tessellation was used to select clustered localizations). Non-clustered localizations are depicted in red, while clustered localizations are shown in a separate color per cluster (orange to green) and numbered from 1 to 7. The indicated performance parameters are: 1) the number of individual positive clusters detected (fused clusters are counted as one), 2) the number of false clusters identified (arrow), 3) the percentage of noise localizations that have been assigned to a cluster and 4) the percentage of signal localizations that are assigned to a cluster. (PDF 3804 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Paul, M.W., de Gruiter, H.M., Lin, Z. et al. SMoLR: visualization and analysis of single-molecule localization microscopy data in R. BMC Bioinformatics 20, 30 (2019). https://doi.org/10.1186/s12859-018-2578-3
- Single-molecule localization
- Image quantification
- Image analysis