Dual channel rank-based intensity weighting for quantitative co-localization of microscopy images
© Singan et al; licensee BioMed Central Ltd. 2011
Received: 20 May 2011
Accepted: 21 October 2011
Published: 21 October 2011
Accurate quantitative co-localization is a key parameter in the context of understanding the spatial co-ordination of molecules and therefore their function in cells. Existing co-localization algorithms consider either the presence of co-occurring pixels or correlations of intensity in regions of interest. Depending on the image source, and the algorithm selected, the co-localization coefficients determined can be highly variable, and often inaccurate. Furthermore, this choice of whether co-occurrence or correlation is the best approach for quantifying co-localization remains controversial.
We have developed a novel algorithm to quantify co-localization that improves on and addresses the major shortcomings of existing co-localization measures. This algorithm uses a non-parametric ranking of pixel intensities in each channel, and the difference in ranks of co-localizing pixel positions in the two channels is used to weight the coefficient. This weighting is applied to co-occurring pixels, thereby efficiently combining both co-occurrence and correlation. Tests with synthetic data sets show that the algorithm is sensitive to both co-occurrence and correlation at varying levels of intensity. Analysis of biological data sets demonstrates that this new algorithm offers high sensitivity, and that it is capable of detecting subtle changes in co-localization, exemplified by studies on a well characterized cargo protein that moves through the secretory pathway of cells.
This algorithm provides a novel way to efficiently combine co-occurrence and correlation components in biological images, thereby generating an accurate measure of co-localization. This approach of rank weighting of intensities also eliminates the need for manual thresholding of the image, which is often a cause of error in co-localization quantification. We envisage that this tool will facilitate the quantitative analysis of a wide range of biological data sets, including high resolution confocal images, live cell time-lapse recordings, and high-throughput screening data sets.
Keywords: Quantitative co-localization image analysis non-parametric rank correlation intensity weighting
The presence of a wide array of organelles enables eukaryotic cells to perform multiple and even competing biological processes in parallel. The cellular distribution of these organelles, and in particular the proteins and other molecules associated with them, remains of intense interest to the scientific community. Indeed, the identification and understanding of the localization of all the proteins encoded by the genome can be considered as a first critical step towards assigning function. Following the completion of various genome-sequencing projects, many labs have been highly active in developing methodologies and tools to systematically assess the localization of the proteins encoded. Imaging-based technologies are particularly relevant to this task as they not only provide spatial information in a cellular context, but when applied in living cells can also provide information about protein dynamics over time. The first genome-wide assessment of protein localization was carried out in the yeast Saccharomyces cerevisiae, using a green fluorescent protein (GFP)-tagging approach. Similar systematic approaches to reveal the localization of the human proteome have also been described [3, 4]; however, this task remains to be completed. Apart from the sheer complexity of mammalian genomes and their extensive transcriptional products, systematic analysis of protein localization in higher eukaryotes is also hampered by a lack of software tools to aid in automated determination of localization. Some tools to automatically annotate localization have been reported; however, due to the highly diverse morphology of the same organelles between different cells, this approach remains challenging. One potential solution to this problem is to combine different fluorescently-labeled proteins (or fluorescently-labeled antibodies) in the same cells.
Quantification of the abundance of a molecule, and its relative distribution compared to known organelle markers (co-localization), would provide a more accurate description of localization, and potentially could be applied on a genome-wide scale.
Co-localization, representing the co-compartmentalization of specific molecules, can be defined as the existence of spatial overlap between two molecules. The existence of overlap can be most simply determined by visual inspection of merged channels (although this is subjective and dependent on the expertise of the researcher). A second possibility is the use of a scatter plot - a 2D histogram representing the pixel intensities across two colour channels from a merged image. The component along the diagonal of this plot represents co-localized regions. This plot however assumes that the intensities across the two images are similar, which is not often the case, and therefore can produce aberrant results. A linear least-square fit of intensities of the two channels can be used to normalize for the differences, but this is not easily applicable to large numbers of diverse images.
Pearson's correlation coefficient (PC) quantifies the correlation of pixel intensities between the two channels:

$$PC = \frac{\sum_i (A_i - A_{avg})(B_i - B_{avg})}{\sqrt{\sum_i (A_i - A_{avg})^2 \, \sum_i (B_i - B_{avg})^2}}$$

In this equation $A_i$ and $B_i$ are the intensities of pixel i in channels A and B, and $A_{avg}$ and $B_{avg}$ are the average intensities of channels A and B respectively. Although PC is used widely by the microscopy community to assess co-localization, the PC value generated is highly sensitive to the intensity in each channel. In microscopy images, the intensity values acquired from two channels can be highly different as a result of many factors, including the nature of the organelle or protein under investigation, the brightness of the fluorophores, and the manner in which the images were generated. The high sensitivity of PC to channel intensities can therefore cause skewed results, and so awareness of this is vital.
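As a minimal sketch of how PC is evaluated from the quantities named above (pixel intensities and channel averages), the following Python function works on two flattened channels. The function name `pearson` and the toy pixel values are illustrative assumptions and are not taken from the original implementation.

```python
import math


def pearson(a, b):
    """Pearson's correlation coefficient for two equal-length pixel lists."""
    if len(a) != len(b) or not a:
        raise ValueError("channels must be non-empty and of equal length")
    a_avg = sum(a) / len(a)
    b_avg = sum(b) / len(b)
    # Numerator: summed products of the deviations from each channel average.
    cov = sum((x - a_avg) * (y - b_avg) for x, y in zip(a, b))
    # Denominator terms: summed squared deviations in each channel.
    var_a = sum((x - a_avg) ** 2 for x in a)
    var_b = sum((y - b_avg) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("PC is undefined for a constant channel")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical 4-pixel channels, flattened to lists.
channel_a = [10, 20, 30, 40]
channel_b = [40, 30, 20, 10]  # intensities run in the opposite order

pc_same = pearson(channel_a, channel_a)
pc_anti = pearson(channel_a, channel_b)
```

Note that every pixel in the region of interest enters the sums, whether or not it overlaps with signal in the other channel.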
One of the main drawbacks is the overestimation of co-localization quantified in these methods. In each of these cases, the contribution of a pixel to quantification of co-localization is binary, by classifying them as either 100% co-localized or 0% co-localized. In calculating the percentage of co-localization of the first channel with the second channel, while weighting is given to the intensity values in that first channel (Channel A for M1), the intensity value of the corresponding pixel position in the second channel (Channel B) is completely ignored. However, the intensity of the corresponding pixel position in the other channel indicates the relative abundance of the molecule of interest in that channel and therefore this should be accounted for. Differences in pixel intensities across the two channels render it difficult to combine these values and hence the classification is binary.
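The binary classification described above can be illustrated with a short sketch of the Manders' coefficients in their usual form, in which a pixel of channel A counts in full whenever the corresponding pixel of channel B exceeds a threshold. The function name `manders`, the default thresholds of zero, and the pixel values are assumptions made for illustration only.

```python
def manders(a, b, a_thr=0, b_thr=0):
    """Return (M1, M2) for two equal-length lists of pixel intensities."""
    sum_a = sum(a)
    sum_b = sum(b)
    if sum_a == 0 or sum_b == 0:
        raise ValueError("each channel needs some signal")
    # A pixel of A is counted in full when B is above its threshold,
    # however bright or dim B actually is at that position.
    a_coloc = sum(x for x, y in zip(a, b) if y > b_thr)
    b_coloc = sum(y for x, y in zip(a, b) if x > a_thr)
    return a_coloc / sum_a, b_coloc / sum_b


# Hypothetical channels: B is either bright or very dim where A has signal.
channel_a = [100, 80, 60, 20]
bright_b = [90, 70, 50, 0]
dim_b = [3, 2, 1, 0]

m1_bright, _ = manders(channel_a, bright_b)
m1_dim, _ = manders(channel_a, dim_b)
```

Both calls return the same M1 (240/260), because the intensity of channel B plays no part once it passes the threshold.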
While PC estimates the overall correlation of intensities within a particular region of interest, it does not discriminate overlapping pixels from non-overlapping pixels. If the relative number of non-overlapping pixels is larger than the number of overlapping pixels, the correlation can be skewed by the intensity variation in the non-overlapping pixels. The Manders' coefficient is insensitive to intensity correlation and therefore the co-localization is quantified as a ratio of overlapping pixels to the total number of pixels. In order to address these deficiencies in the currently available co-localization algorithms, we have devised a new algorithm that not only takes account of correlating pixels between two channels, but also considers their relative intensities. We propose that this 'dual channel rank-based intensity weighting coefficient' (RWC) provides the most accurate measurement to date of co-localization between two image channels.
Results and Discussion
Rank-based intensity weighting
The RWC coefficients are calculated as

$$RWC1 = \frac{\sum_i A_{i,coloc} \cdot W_i}{\sum_i A_i}, \qquad RWC2 = \frac{\sum_i B_{i,coloc} \cdot W_i}{\sum_i B_i}$$

In this formula the weight $W_i = (Rn - D_i)/Rn$, $A_{i,coloc} = A_i$ if $B_i > B_{Thr}$, 0 otherwise, and $B_{i,coloc} = B_i$ if $A_i > A_{Thr}$, 0 otherwise. Rn is the number of ranks in channel A or channel B, whichever is the larger, and $D_i$ is the absolute difference between the ranks of channel A and channel B for each pixel position i, given by $D_i = |Rank(A_i) - Rank(B_i)|$. Parameters $A_{Thr}$ and $B_{Thr}$ are threshold values for channels A and B, respectively. The provision of defined threshold values is not necessary in this formula, as the ranking of pixels already discriminates low intensity pixels from high intensity pixels. However, by including these threshold parameters in the formula, users can still control the minimum pixel intensity above which co-localization quantification should be performed. Furthermore, it also allows easy comparison with other co-localization methods that require threshold information to be manually entered. If no manual intervention is required, the threshold values $A_{Thr}$ and $B_{Thr}$ are set to zero.
The ranking of pixels is made in each channel by giving the pixel(s) with the highest intensity a rank of 1 and assigning the next highest intensity pixel(s) a rank of 2, and so on. Pixels having the same intensities are assigned the same rank. The number of ranks in each channel depends on the number of grey levels in that channel. This method of ordinal ranking of pixels normalizes for the intensity values without altering the image. Each pixel in each channel gets a rank based on its intensity relative to the highest intensity in its channel. For an n-bit image, the number of ranks in each channel can range from 0 (for an image with no signal) to $2^n$ (for an image having all the possible grey levels). The number of ranks in each of the channels is determined, and 'Rn' is assigned to the largest of these values.
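The ranking scheme described above (rank 1 for the brightest grey level, shared ranks for equal intensities) can be sketched as follows. The helper name `dense_ranks` and the pixel values are hypothetical, and the sketch assumes that every pixel, including zero-intensity pixels, receives a rank.

```python
def dense_ranks(channel):
    """Rank 1 = highest intensity; equal intensities share a rank."""
    # One rank per distinct grey level, ordered from brightest to dimmest.
    levels = sorted(set(channel), reverse=True)
    rank_of = {level: rank for rank, level in enumerate(levels, start=1)}
    return [rank_of[v] for v in channel]


# Hypothetical flattened channels with very different intensity scales.
channel_a = [200, 50, 200, 10, 50]
channel_b = [7, 9, 3, 3, 1]

ranks_a = dense_ranks(channel_a)  # three grey levels -> ranks 1..3
ranks_b = dense_ranks(channel_b)  # four grey levels  -> ranks 1..4

# Rn is the larger of the two rank counts.
rn = max(max(ranks_a), max(ranks_b))
```

Here channel A contains three grey levels and channel B four, so Rn is 4 even though channel A is far brighter in absolute terms.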
The weighting for each pixel position is derived from the following expression, $W_i = (Rn - D_i)/Rn$. If a pixel position has the same rank in each of the channels, the expression equals 1, so that the weight has minimal effect on the co-localization coefficient. The further apart the ranks of the pixel position, the more the weight will tend to 0, and for extreme rank differences, for which the maximum $D_i$ can be Rn-1, the weight will be $1/Rn$. The weight can therefore range from $1/Rn$ to 1, corresponding to the maximum rank difference and the same rank, respectively. For an n-bit image, the maximum possible range occurs when all the grey levels are present in one of the two channels, and the weight will then range from $1/2^n$ to 1. The greater the number of grey levels present, the higher the sensitivity and resolution of the weighting. The sensitivity of the weight depends on Rn, and the sensitivity can be reduced by modifying the weight to $(Rn' - D_i)/Rn'$, where Rn' is a linear function of Rn such that Rn' = Rn + k*Rn, and k can take values from 0 to 1, corresponding to weights ranging from $1/Rn$ to 1 for k = 0 and from approximately 0.5 to 1 for k = 1. The absolute difference between ranks ensures that the same weighting can be used for co-localizing pixel positions in both the channels and the weighting depends only on the difference of ranks. We envisage that this ranking approach could also be used for segmentation, for example to identify particular objects within an image based on a reference channel.
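The limits quoted above can be checked numerically. The sketch below assumes the weight has the form (Rn' - D_i)/Rn' with Rn' = Rn + k*Rn, which is consistent with the stated limits (1 for equal ranks, and roughly 0.5 to 1 when k = 1); the function name `rank_weight` and the example ranks are illustrative assumptions.

```python
def rank_weight(rank_a, rank_b, rn, k=0.0):
    """Weight for one pixel position from the ranks in the two channels."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie between 0 and 1")
    d = abs(rank_a - rank_b)  # absolute rank difference D_i
    rn_adj = rn + k * rn      # Rn' = Rn + k*Rn (k = 0 leaves Rn unchanged)
    return (rn_adj - d) / rn_adj


rn = 256  # e.g. an 8-bit channel in which every grey level is present

w_same = rank_weight(10, 10, rn)               # identical ranks
w_extreme = rank_weight(1, 256, rn)            # maximum difference, k = 0
w_extreme_k1 = rank_weight(1, 256, rn, k=1.0)  # maximum difference, k = 1
```

With k = 1 the weight for the largest possible rank difference is 257/512, just above 0.5, so pixels with discordant ranks are penalized less severely than with k = 0.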
The weight represents the relative amount of co-localization and this can then be used for each pixel position to determine the degree of co-localization. Rank-based weighting addresses the critical issues of difference in channel illumination, dual channel directional illumination, and uniform noise and gradient correlation, as the ranks are preserved even though the actual intensities might suffer degradation in all of these cases. This method demonstrates a statistically efficient meta-analysis approach of combining both pixel co-occurrence and intensity correlation to improve co-localization analysis.
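To illustrate why rank-based weighting tolerates a difference in channel illumination, the sketch below computes both coefficients before and after one channel is uniformly dimmed. It assumes that each coefficient is the weighted sum of co-occurring intensities divided by the total intensity of that channel (by analogy with the Manders' coefficients), with a weight of (Rn - D_i)/Rn; the function names and pixel values are hypothetical and this is not the ImageJ implementation.

```python
def dense_ranks(channel):
    """Rank 1 = highest intensity; equal intensities share a rank."""
    levels = sorted(set(channel), reverse=True)
    rank_of = {level: rank for rank, level in enumerate(levels, start=1)}
    return [rank_of[v] for v in channel]


def rwc(a, b, a_thr=0, b_thr=0):
    """Return (RWC1, RWC2) under the assumptions stated in the lead-in."""
    ranks_a, ranks_b = dense_ranks(a), dense_ranks(b)
    rn = max(max(ranks_a), max(ranks_b))
    num_a = num_b = 0.0
    for x, y, ra, rb in zip(a, b, ranks_a, ranks_b):
        w = (rn - abs(ra - rb)) / rn  # assumed weight (Rn - D_i) / Rn
        if y > b_thr:                 # pixel of A co-occurs with B
            num_a += x * w
        if x > a_thr:                 # pixel of B co-occurs with A
            num_b += y * w
    return num_a / sum(a), num_b / sum(b)


# Hypothetical channels; the two brightest pixels swap order in B.
channel_a = [40, 30, 20, 10]
channel_b = [25, 35, 15, 5]
dimmed_b = [v * 0.5 for v in channel_b]  # uniform loss of illumination

original = rwc(channel_a, channel_b)
after_dimming = rwc(channel_a, dimmed_b)
```

Halving every intensity in channel B leaves its ranks, and therefore the weights, unchanged, so both coefficients are the same before and after dimming.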
Synthetic data sets
Applying the Costes' mask to the data in Figure 4 allowed us to determine that the proportion of pixels above the threshold was 87.5%, 37.5%, 0% and 12.5% respectively in each of the co-localization experiments (Figures 4A, B, C and 4D). We first analyzed the co-localization experiment in which the two channels were identical (Figure 4A). As expected, applying Costes' automated threshold resulted in 12.5% of the pixels being discarded from the mask, however because the images were perfectly correlating, the co-localization coefficients (M1C and M2C) were correctly calculated at 1.0 (Figure 4A). By contrast, in co-localization experiments in which the intensities did not correlate (Figures 4B, C and 4D), the majority of the pixels were discarded from the mask leading to incorrect co-localization coefficients. This was particularly striking in cases where there were pixels co-occurring in both channels, but anti-correlation of the intensities resulted in failure of the automated thresholding, in turn producing co-localization coefficients (M1C and M2C) of zero (Figure 4C). This scenario is especially relevant in biological samples where two molecules could have anti-correlating intensities, despite co-occurring. Applying the RWC algorithm to the synthetic data set in Figure 4 produced more realistic co-localization coefficients. This was because the algorithm considers both co-occurrence (specified by the threshold) and correlation. Even in the example of anti-correlating pixels, the RWC approach produces co-localization coefficients (RWC1 and RWC2) of 0.48, which are more representative of the intensity distribution (Figure 4C). Next we examined this synthetic data set without applying a threshold. Strikingly, in all the four cases, the Manders' algorithm always reported a co-localization coefficient of 1.0, whereas RWC produced similar values to those observed when the images had been thresholded. 
This highlights that the application of the Manders' algorithm always requires the application of careful thresholding, but that RWC is not sensitive to this requirement as it produces meaningful co-localization coefficients in the absence of thresholding. Use of the RWC methodology therefore eliminates the need for thresholding, which can be a source of significant bias when analyzing image data.
Effect of random noise on Manders and RWC co-localization coefficients

Case  Coefficient        Noise SD 5                      Noise SD 10                     Noise SD 15
                         Co-loc A on B   Co-loc B on A   Co-loc A on B   Co-loc B on A   Co-loc A on B   Co-loc B on A
1     Manders (M1, M2)   1.0             1.0             0.99            0.99            0.99            0.99
1     RWC (RWC1, RWC2)   0.98            0.98            0.95            0.95            0.92            0.92
2     Manders (M1, M2)   0.90            0.83            0.88            0.84            0.89            0.85
2     RWC (RWC1, RWC2)   0.62            0.61            0.62            0.61            0.61            0.61
3     Manders (M1, M2)   0.76            0.76            0.76            0.76            0.77            0.77
3     RWC (RWC1, RWC2)   0.5             0.5             0.49            0.49            0.49            0.49
4     Manders (M1, M2)   0.83            0.90            0.84            0.88            0.85            0.89
4     RWC (RWC1, RWC2)   0.61            0.62            0.61            0.62            0.61            0.61
Biological data sets
We next performed a co-localization experiment with two different primary antibodies that recognize different membranes within the cell, specifically the chaperone HSP60 representing the mitochondria, and the putative cargo receptor TGN46 that has a steady-state localization at the trans-Golgi network (TGN). Confocal microscopy analysis revealed that both the Manders' and RWC algorithms could accurately discriminate these different membranes and produce low co-localization coefficients (Figure 5B). By contrast, the Costes' algorithm performed very poorly, most likely as a result of the way it determines thresholds based on intensity correlations.
Rather than considering individual pixel intensities within an image, an alternative method of probing co-localization is to use object-based analysis. This approach relies on the ability to discriminate and segment defined objects (of similar pixel intensity). Typically the centroid of each object is used as a reference point for comparison between the channels, and the number of co-localizing objects, as a fraction of the total number of objects detected, defines the degree of co-localization. We applied such a method to our HSP60-TGN46 biological data (as used in Figure 5B) using the JACOP plugin within ImageJ. Applying the same thresholds as used previously, this analysis revealed vastly differing co-localization values for the same set of images, depending on the minimum pixel size used to determine objects. For example, at a minimum value of 5 pixels, 85 TGN46 objects were identified, of which only 3 co-localized with HSP60 (Figure 5C). However, increasing the minimum pixel size to 500 pixels resulted in the detection of only 3 discrete objects, of which only 1 co-localized with HSP60. This large discrepancy in the numbers of objects identified resulted in an overall change in the object-based co-localization coefficient from 0.04 to 0.23 for the test set of images analyzed. This indicates that while object-based co-localization methods can produce coefficients similar to co-occurrence methods, they are wholly dependent on the object segmentation and identification parameters given by the user. Moreover, the pixel intensity information is only used in the segmentation process rather than for quantification of co-localization, meaning that this valuable information is effectively discarded.
In this work we present a novel tool to precisely quantify co-localization between structures within biological images. Although a number of co-localization algorithms have been described previously, this is the first example of such a tool that takes account of both co-occurrence and correlation of pixels, combining them efficiently to produce a meaningful coefficient value. We demonstrate in this work, using both synthetic and biological data sets, that this algorithm is a robust tool that works effectively across a very wide range of situations, and that it eliminates the need for manual thresholding of images, which is a well established cause of error in co-localization analyses. We envisage that this tool will facilitate the quantitative analysis of a wide range of biological data sets, including high resolution confocal images, live cell time-lapse recordings, and high-throughput screening data sets.
In this formula, $A_i$ and $B_i$ are the intensities of pixels at position i in channels A and B respectively in the initial pair of images. $A_{new}$ is the new intensity of channel A at position i for the new image. Varying $C_f$ from 0 to 1, with a step size of 0.1, generates correlations ranging from 0% to 100% (in 10% increments) between new image A and the initial image B. To introduce noise we used the 'Add Specific Noise' routine within ImageJ, setting the standard deviation of the noise to 5, 10 and 15 for the various test cases.
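One way to picture the generation of such test images is a per-pixel linear blend controlled by C_f, followed by the addition of Gaussian noise. The blend used below, A_new = C_f * B_i + (1 - C_f) * A_i, is an assumption chosen because it returns the initial image A at C_f = 0 and the initial image B at C_f = 1; it is not necessarily the exact formula used for the published data sets. The helper names, the seed, and the 8-bit clipping are likewise illustrative.

```python
import random


def blend(a, b, cf):
    """Assumed blend: A_new = C_f * B_i + (1 - C_f) * A_i per pixel."""
    if not 0.0 <= cf <= 1.0:
        raise ValueError("cf must lie between 0 and 1")
    return [cf * y + (1.0 - cf) * x for x, y in zip(a, b)]


def add_gaussian_noise(channel, sd, seed=0, max_value=255):
    """Add zero-mean Gaussian noise and clip to the 8-bit intensity range."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [min(max_value, max(0, round(v + rng.gauss(0.0, sd))))
            for v in channel]


# Hypothetical initial pair of flattened images.
image_a = [10, 200, 30, 120]
image_b = [50, 60, 70, 80]

unchanged = blend(image_a, image_b, 0.0)  # C_f = 0 returns image A
identical = blend(image_a, image_b, 1.0)  # C_f = 1 returns image B
halfway = blend(image_a, image_b, 0.5)    # midpoint between A and B
noisy = add_gaussian_noise(image_b, sd=5)
```

Under this assumed blend, a C_f of 0 corresponds to 0% correlation only if the initial images A and B are themselves uncorrelated.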
Cell Culture, Transfection and Immunostaining
HeLa cells were routinely cultured in Dulbecco's Modified Eagle's Medium (DMEM) supplemented with 10% foetal bovine serum (FBS) and 1% L-Glutamine at 37°C in a 5% CO2 incubator. Experiments were carried out on cells growing on glass coverslips maintained in 12-well plates. All transient transfections were performed using Fugene6 according to the manufacturer's instructions. The CFP-ts045G construct has been described previously. Cells were fixed with either methanol or 3% PFA and quenched with 30 mM glycine. Primary mouse anti-HSP60 antibodies (BD Biosciences), sheep anti-TGN46 antibodies (Biozol), and mouse anti-p230 antibodies (BD Biosciences) were used. Secondary antibodies anti-mouse Alexa488 and Alexa568, and anti-sheep Alexa568 (Molecular Probes) were used for visualization. Coverslips were mounted on glass slides with Mowiol.
Image Acquisition and Analysis
Confocal images were acquired with an Olympus FV1000 system using a 60x/NA1.35 oil immersion objective. Images were acquired at a resolution of 1024 × 1024 pixels, a pixel dwell time of 12.5 μs, and a 2.5-fold zoom. Sequential acquisition mode was used in all cases. Individual cells from each field of view were manually segmented, but not subjected to background correction or any further manipulation. A minimum of 10 cells were used for quantification of each time point and immunostaining. The Rank Weight Co-localization Coefficient (RWC) was implemented in ImageJ.
HeLa cells cultured on coverslips were transfected with plasmids encoding CFP-ts045G and incubated at 39.5°C for 12 h to accumulate the ts045G in the ER. Following this incubation cycloheximide (100 μg/ml) was added to prevent further protein synthesis. The temperature was lowered to 32°C to allow folding and release of ts045G from the ER. Coverslips were removed from this incubation at various time points and fixed in 3% PFA. Prior to immunostaining the cells were permeabilized with 0.1% Triton X-100 and then incubated with anti-GM130 or anti-p230 antibodies followed by incubation with secondary antibodies as described above.
We gratefully acknowledge the support of Anne Carpenter's lab (Broad Institute Imaging Platform), and in particular Carolina Wählby. We also acknowledge helpful discussions from members of the JCS lab. This work has been partly funded by an Irish Research Council for Science, Engineering and Technology (IRCSET) graduate PhD scholarship in Bioinformatics and Systems Biology to VRS. The JCS lab is supported by a Principal Investigator (PI) award (09/IN.1/B2604) from Science Foundation Ireland (SFI).
1. Simpson JC, Pepperkok R: Localizing the proteome. Genome Biol 2003, 4(12):240. doi:10.1186/gb-2003-4-12-240
2. Huh WK, Falvo JV, Gerke LC, Carroll AS, Howson RW, Weissman JS, O'Shea EK: Global analysis of protein localization in budding yeast. Nature 2003, 425(6959):686–691. doi:10.1038/nature02026
3. Simpson JC, Wellenreuther R, Poustka A, Pepperkok R, Wiemann S: Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing. EMBO Rep 2000, 1(3):287–292. doi:10.1093/embo-reports/kvd058
4. Mehrle A, Rosenfelder H, Schupp I, del Val C, Arlt D, Hahne F, Bechtel S, Simpson J, Hofmann O, Hide W, Glatting KH, Huber W, Pepperkok R, Poustka A, Wiemann S: The LIFEdb database in 2006. Nucleic Acids Res 2006, (34 Database):D415–418.
5. Conrad C, Erfle H, Warnat P, Daigle N, Lörch T, Ellenberg J, Pepperkok R, Eils R: Automatic identification of subcellular phenotypes on human cell arrays. Genome Res 2004, 14(6):1130–1136. doi:10.1101/gr.2383804
6. Manders EMM, Verbeek FJ, Aten A: Measurement of co-localization of objects in dual-colour confocal images. J Microsc 1993, 169: 375–382. doi:10.1111/j.1365-2818.1993.tb03313.x
7. Costes SV, Daelemans D, Cho EH, Dobbin Z, Pavlakis G, Lockett S: Automatic and quantitative measurement of protein-protein colocalization in live cells. Biophys J 2004, 86: 3993–4003. doi:10.1529/biophysj.103.038422
8. Bolte S, Cordelières FP: A guided tour into subcellular colocalization analysis in light microscopy. J Microsc 2006, 224: 213–232. doi:10.1111/j.1365-2818.2006.01706.x
9. Prescott AR, Lucocq JM, James J, Lister JM, Ponnambalam S: Distinct compartmentalization of TGN46 and beta 1,4-galactosyltransferase in HeLa cells. Eur J Cell Biol 1997, 72(3):238–246.
10. Zilberstein A, Snider MD, Porter M, Lodish HF: Mutants of vesicular stomatitis virus blocked at different stages in maturation of the viral glycoprotein. Cell 1980, 21(2):417–427. doi:10.1016/0092-8674(80)90478-X
11. Schwaninger R, Beckers CJ, Balch WE: Sequential transport of protein between the endoplasmic reticulum and successive Golgi compartments in semi-intact cells. J Biol Chem 1991, 266(20):13055–13063.
12. Presley JF, Cole NB, Schroer TA, Hirschberg K, Zaal KJ, Lippincott-Schwartz J: ER-to-Golgi transport visualized in living cells. Nature 1997, 389(6646):81–85. doi:10.1038/38001
13. Scales SJ, Pepperkok R, Kreis TE: Visualization of ER-to-Golgi transport in living cells reveals a sequential mode of action for COPII and COPI. Cell 1997, 90: 1137–1148. doi:10.1016/S0092-8674(00)80379-7
14. Adler J, Parmryd I: Quantifying colocalization by correlation: the Pearson correlation coefficient is superior to the Mander's overlap coefficient. Cytometry A 2010, 77(8):733–742.
15. Keller P, Toomre D, Díaz E, White J, Simons K: Multicolour imaging of post-Golgi sorting and trafficking in live cells. Nat Cell Biol 2001, 3(2):140–149. doi:10.1038/35055042
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.