Volume 14 Supplement 10
Multi-scale Gaussian representation and outline-learning based cell image segmentation
© Farhan et al; licensee BioMed Central Ltd. 2013
Published: 12 August 2013
High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to assist in unraveling the full potential of such studies. Image segmentation is typically at the forefront of such analysis as the performance of the subsequent steps, for example, cell classification, cell tracking etc., often relies on the results of segmentation.
We present a cell cytoplasm segmentation framework which first separates cell cytoplasm from the image background using a novel approach based on image enhancement and the coefficient of variation of a multi-scale Gaussian scale-space representation. A novel outline-learning based classification method is developed using regularized logistic regression with embedded feature selection, which classifies image pixels as outline/non-outline to give cytoplasm outlines. Refinement of the detected outlines to separate cells from each other is performed in a post-processing step where the nuclei segmentation is used as contextual information.
Results and conclusions
We evaluate the proposed segmentation methodology using two challenging test cases, presenting images with completely different characteristics, with cells of varying size, shape, texture and degrees of overlap. The feature selection and classification framework for outline detection produces very simple sparse models which use only a small subset of the large, generic feature set, that is, only 7 and 5 features for the two cases. Quantitative comparison of the results for the two test cases against state-of-the-art methods shows that our methodology outperforms them with an increase of 4-9% in segmentation accuracy and a maximum accuracy of 93%. Finally, the results obtained for diverse datasets demonstrate that our framework not only produces accurate segmentation but also generalizes well to different segmentation tasks.
High-throughput screening used in drug design involves the identification of genes which modulate a particular biomolecular pathway. RNA interference (RNAi), by decreasing the expression of particular genes in a cell culture, helps in identifying and analyzing the target gene functions by observing cell behavior after gene knockdown [1–3]. Image analysis is at the center stage of such studies, where cell cultures are imaged with automated fluorescent microscopy to study cell behavior in knockdown as well as in normal conditions. Genome-wide high-content siRNA screening involves studying the dynamics of gene expression in cellular functions for the whole genome and therefore yields hundreds of thousands of images, making their manual analysis impractical. Quantitative image analysis is needed for the identification, classification and quantification of the phenotypes, which is also not possible through manual analysis [3, 4]. Consequently, sufficiently fast automated image analysis methods are needed to fulfill the potential of high-throughput systems.
Segmentation of cells is typically at the core of the image analysis pipelines dealing with high-content genome-wide screening experiments [4, 5]. This is generally the step which performs cell detection, and further analysis, such as cell tracking, lineage reconstruction and cell classification, is based on the results of cell detection. However, in such experiments, segmentation is challenging due to the presence of a large number of phenotypes. Different cell phenotypes have different characteristics and appearances and, for some complex and heterogeneous cell cultures, it is difficult to build an analysis pipeline capable of detecting all phenotypes, potentially leading to the loss of some of them. Accurate cell segmentation and detection is therefore essential for quantification of phenotypes.
One of the main challenges in cell segmentation is that cells touch and cluster together, forming clumps. Not only do cytoplasms form clumps, but clustering of nuclei is also quite common. The latter problem has been tackled in our recent article. The problem with cytoplasmic regions in general, and with their clumps specifically, is that they often do not have visible boundaries. For this reason, and also due to their irregular shapes, the methods typically used for clump splitting often fail. Another challenge often faced in cytoplasm segmentation is the uneven and varying actin signal. Imaging aberrations cause the actin signal to be saturated at some locations and too low at others to be regarded as part of the cell, which causes methods based on global image segmentation to fail. A related challenge is that the inside of the cells is inhomogeneous, so the intensity variations are large. Sometimes part of the cell cytoplasm resembles the background, and methods based solely on image intensity often struggle in such situations. However, if other features local to those regions are examined along with image intensity, the difference between background and cytoplasm can be highlighted. In addition, uneven illumination and out-of-focus regions of the image also hinder accurate segmentation.
Methods for cell cytoplasm segmentation available in the literature can be divided into two main approaches: classic segmentation methods and deformable model-based methods. The former includes the watershed transform, region growing and mathematical morphology methods, see for example [8, 9], whereas the latter comprises active contour, level set [11, 12] and graph cut based methods. In one method, a watershed algorithm with double thresholds is followed by splitting and merging of cellular regions based on a quality metric obtained from correctly classified cells; classification of cells is performed using a set of features with a priori information about the cells. In another, high intensity variations in the actin channel are enhanced by variance filtering. The enhanced image is then smoothed and thresholded using the Otsu thresholding method. Subsequently, a seeded watershed transform is applied, restricted to the binary image of the cytoplasm. In yet another method, a region growing algorithm and modified Otsu thresholding are used to extract the cytoplasm. Long and thin protrusions on spiky cells are extracted by a scale-adaptive steerable filter. Finally, a constraint factor graph cut-based active contour method and morphological algorithms are combined to separate tightly clustered cells.
In another described method, the interaction between cells is modeled using a combination of gradient and region information. An energy function is formulated based on an interaction model for segmenting tightly clustered cells and is then minimized using a multiphase level set method. A Markov random field (MRF) based graphical segmentation model, yielding an energy minimization problem, has also been applied to cell cytoplasm segmentation, where a graph cut method is used to obtain an exact MAP solution. Similarly, an extended Potts model, in which functions of higher-order cliques of pixels are included in the traditional Potts model and combined with learning methods for defining potential functions that account for local texture information, has been used to segment live cell images.
The problem with these methods is that they tend to produce over- and/or under-segmentation; this is particularly true of the classic segmentation methods. They are also sometimes computationally intensive and slow, or they depend on schemes that require parameter initialization, and finding a good set of initial parameters for a large heterogeneous dataset often requires user intervention, which hinders the development of automated analysis pipelines. Moreover, when the cells are non-convex, as in our case, neither the methods available for segmentation of convex objects nor those based on shape priors work.
When cells clump together, the cytoplasm outlines become invisible; however, the intensity and other features along that part of the image remain quite similar to the features of other, visible cell outlines. Therefore, a segmentation methodology can be developed in which the outlines of the cell cytoplasm are learned by a supervised machine learning algorithm. There are methods in the literature [17–20] which use the technique of learning edges for segmentation and object detection. However, all of them detect and model outlines which are distinct, where the outlines are basically used to detect objects or regions in the image utilizing shape information wherever available. In contrast, we need an outline detection technique which not only detects distinct outlines but is also capable of revealing outlines that separate objects of unknown shapes from each other.
In this paper we propose a supervised learning and classification-based cell cytoplasm segmentation methodology in which the outlines of the cell cytoplasm are learned and detected. A multi-scale approach is used to get the cytoplasm/background segmentation, and the detected outlines are overlaid on it to get the complete segmentation. The results from the classification framework are fed to a post-processing phase, where the methodology uses the nuclei segmentation as contextual information to refine the segmentation results.
The rest of the paper is organized as follows: the Methods section describes the proposed cell cytoplasm segmentation methodology; the obtained results are presented and discussed in the Results and discussion section; the last section concludes the paper.
Cell cytoplasm segmentation
The first step in our segmentation methodology is robust cytoplasm/background segmentation. As we mentioned earlier, there are many aberrations linked with high-throughput fluorescent microscopy imaging systems. Briefly described, the images typically have low contrast, with blurred regions around the image corners, varying signal strengths, inhomogeneous cell interiors and they also sometimes have uneven illumination. Generally, cytoplasm images appear to be most affected by these problems as far as their accurate segmentation is concerned.
Apart from these imaging related challenges, the other challenge that we face is posed by our dataset which includes cells with high phenotypic variability. Examples of challenging phenotypes are ruffles and spikes in cell boundary and other kinds of outline variations. A segmentation method robust enough to detect such fine details from the noisy and low contrast images is needed for distinguishing different phenotypes. Our approach is to first apply enhancement and correction to the images before applying any segmentation method. Here, we use a cascade of three image and contrast enhancement filters for image pre-processing and a multi-scale approach for getting the desired initial cytoplasm/background segmentation. Block (A) in Figure 1 shows the steps performed in getting initial cytoplasm segmentation.
Multi-scale coefficient of variation based cytoplasm segmentation
where b is the number of bits used to represent the image. This enhancement increases the difference between the darkest cytoplasm pixel and the brightest background pixel, and a simple intensity threshold-based method such as Otsu segmentation is able to give the desired cytoplasm/background segmentation. Figure 2(b) shows a gray-scale pre-processed cytoplasm image, 2(c) the coefficient of variation image and 2(d) the resulting image with cytoplasm/background segmentation. From the figure, it is quite evident that our method is able to detect the cytoplasmic regions correctly despite the presence of intensity inhomogeneities.
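The enhancement and thresholding steps above can be sketched as follows. This is a minimal NumPy/SciPy illustration, not the authors' implementation: the scale values are those listed for Test Case II later in the text, while the epsilon guard, the histogram bin count and the omission of the pre-processing cascade are our simplifications.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def cv_enhance(image, scales=(0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)):
    """Coefficient of variation across a Gaussian scale-space stack.

    Scale t corresponds to Gaussian smoothing with sigma = sqrt(t).
    """
    img = image.astype(float)
    stack = np.stack([img if t == 0 else gaussian_filter(img, sigma=np.sqrt(t))
                      for t in scales])
    mean = stack.mean(axis=0)
    # Small epsilon (our choice) guards against division by zero in dark regions.
    return stack.std(axis=0) / (mean + 1e-8)

def otsu_threshold(values, bins=256):
    """Plain-NumPy Otsu threshold (maximizes between-class variance)."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    p = hist / hist.sum()
    omega = np.cumsum(p)          # cumulative class-0 probability
    mu = np.cumsum(p * centers)   # cumulative class-0 mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0
    return centers[np.argmax(sigma_b)]

def initial_segmentation(image):
    cv = cv_enhance(image)
    return cv > otsu_threshold(cv)
```

Textured, inhomogeneous cytoplasm changes noticeably under increasing amounts of smoothing, so its coefficient of variation across the stack is high, while smooth background stays close to zero; thresholding the CV image then separates the two.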
Classification-based cell cytoplasm outline detection
The cytoplasm segmentation obtained in the previous step still has cytoplasms of different cells touching each other. This is the step in which we detect the cytoplasm outlines and apply them to the result of the previous step for getting the whole cell segmentation. As we mentioned earlier, even if the cytoplasm outlines are invisible, especially in the regions where cytoplasms clump, the intensity and other features of the pixels with underlying outlines still closely match the features of outline pixels that are clearly visible. This leads us to an approach in which a classifier is trained to classify a pixel as either outline or non-outline based on the set of local features extracted from the image pixels.
Table 1: Filtering operations and the filter parameters for computing pixel-level features from training images.

Gaussian low pass: kernel width σ
Integrated pixel intensity
Laplacian of Gaussian: kernel width σ
Difference of Gaussian: kernel width σ
Local binary pattern: (Min., Med., Max.)

Total number of features: 290
Extraction of features
The complexity and accuracy of a classifier depend on the number and discriminative power of the features used in its design. Selecting the most informative features from a list of candidates reduces model complexity, yet it must be done such that the model retains high classification accuracy. A sparse model using only a subset of the available features allows us to keep the initial feature set large, with as many general and redundant features as desired. Moreover, the benefit of a large and general rather than a small and problem-specific feature set is that the framework generalizes to other similar classification problems. Hence, we employ an exhaustive set of generic linear and non-linear features, knowing that our feature selection technique has been successfully used for building sparse classification models in similar use cases.
In our study, pixel-level features are extracted from 2D cytoplasm images by applying a large set of filters, both in the spatial and transform domains, with varying parameters. Previous work has used a large generic set of intensity-based features along with textural features such as the local binary pattern (LBP) for image segmentation. Our cytoplasm images possess interesting texture characteristics which might be useful in the classification of image pixels. Therefore, in addition to the local binary patterns and other intensity features, we also incorporate texture features such as the ones obtained from Gabor filters and Haralick features in our classifier design. The feature set comprises general intensity, edge and texture (scale and orientation) based local features which are computed in the pixel neighborhoods using filters with varying kernel sizes. Table 1 lists all the features that are computed for the training images.
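A few of the filter families from Table 1 can be sketched as a per-pixel feature bank. The sigma values below are illustrative only; the paper varies kernel widths over a larger grid and also includes LBP, Gabor and Haralick features, which are omitted here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

def feature_bank(image, sigmas=(1.0, 2.0, 4.0)):
    """Stack per-pixel features from a subset of the Table 1 filter families."""
    img = image.astype(float)
    feats = [img]                                   # raw intensity
    smoothed = [gaussian_filter(img, s) for s in sigmas]
    feats.extend(smoothed)                          # Gaussian low pass
    feats.extend(gaussian_laplace(img, s) for s in sigmas)  # Laplacian of Gaussian
    # Difference of Gaussians between consecutive scales
    feats.extend(a - b for a, b in zip(smoothed[:-1], smoothed[1:]))
    return np.stack(feats, axis=-1)                 # shape: (H, W, n_features)
```

Each pixel then yields a feature vector, and the manually labeled outline/non-outline pixels form the rows of the training matrix.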
Design of classifier incorporating feature selection
High dimensionality of the observations leads to the risk of over-fitting at the cost of generalization, so a reduction of the feature space is desired. However, selecting the most informative features from a feature set for modeling data characteristics has always been problematic. In multiple linear regression modeling, regularization adds a penalty term to the least squares prediction error to shrink the magnitudes of the model coefficients towards zero. A sparse solution with only a few non-zero coefficients is thus obtained, and feature selection is performed automatically. The least absolute shrinkage and selection operator (LASSO) penalizes the error function with the l1-norm of the coefficient vector, weighted by a regularization parameter λ > 0 which controls the sparsity of the solutions. A further characteristic of this framework is that it provides a set of solutions whose sparsity usually increases with increasing λ. The advantage is that one can choose a solution with as many features as desired with little or no major change in the classification result, that is, a solution with a small trade-off between accuracy and model sparsity/complexity.
The model coefficients are estimated by minimizing the l1-penalized negative log-likelihood of the logistic regression model, whose quadratic approximation gives rise to an equivalent penalized iteratively re-weighted least squares problem that can be solved by a coordinate descent algorithm.
Training and classification
To perform training and classification, manually created benchmark images with cytoplasm outlines are used. We have a set of training samples, around 550 cells (5 images) and 1250 cells (16 images) for Test Case I and Test Case II, respectively, segmented manually by expert biologists; see the details regarding image acquisition in a later section. It is worth mentioning that the criterion for choosing the benchmark images was to pick images in which most of the area is covered with cells and which represent the most challenging cases for accurate segmentation. Since all the images are 1040× 1392 (Test Case I) and 400× 400 (Test Case II) in size, the pixels of even a single image are sufficient to train the classifier, especially a classifier of our type which is capable of dealing with P ≫ N cases. Therefore, one of the images is used solely for training of the classifier while the rest of the images are used for evaluating it. This way we made sure not to use the same data for both training and testing.
For training, 500 positive (outline) and 500 negative (non-outline) samples are picked at random from the 1447680 (Test Case I) or 160000 (Test Case II) pixels of the benchmarked image of cytoplasm outlines. For these 1000 samples, all 290 features listed in Table 1 are extracted from the corresponding cytoplasm image. This training data, a 1000× 290 feature matrix along with a 1000× 1 vector of target labels, is input to the regularized logistic regression classifier. For testing, only the selected features are computed for every pixel in the test images and used with the selected model for outline classification.
To estimate the optimal classifier model coefficients, 10-fold cross-validation is performed on the training data to estimate the prediction error of all the solutions obtained for different values of the regularization parameter λ. The solution which gives the minimum prediction error is generally chosen; however, the designer may pick an even sparser solution with little or no impact on the final classification results. In our case, we observed that models within one standard error of the mean cross-validation error do not change the classifier output significantly. Finally, the selected model gives the posterior probability values for the pixels in the test image, which are used directly to find the class label (outline/non-outline) for every pixel.
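The training and model selection procedure can be sketched with scikit-learn, used here as a stand-in for the coordinate descent solver referenced above. The data below are synthetic, merely mimicking the 1000 x 290 shape of the real training matrix, and the grid of regularization strengths is our choice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(1)
# Synthetic stand-in for the 1000 x 290 training matrix described in the text:
# only features 0 and 3 actually carry outline information here.
X = rng.normal(size=(1000, 290))
y = (X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=1000) > 0).astype(int)

# l1-penalized logistic regression; 10-fold cross-validation over a grid of
# regularization strengths (Cs = 1/lambda) mirrors the model selection above.
clf = LogisticRegressionCV(penalty="l1", solver="liblinear",
                           Cs=10, cv=10, max_iter=1000)
clf.fit(X, y)

selected = np.flatnonzero(clf.coef_[0])   # indices of retained features
```

At test time, only the `selected` feature indices need to be computed per pixel, and `clf.predict_proba` gives the posterior probability that is thresholded at 0.5.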
Post-processing of classifier outputs is generally a complementary part of any classification framework. One post-processing technique exploits contextual information obtained either from the targeted patterns, in our case the cytoplasm images, or from some other source related to them. The classifier we obtained for labeling image pixels as outline/non-outline gives accurate yet coarse results. The coarseness mainly comes from the fact that pixels interior to the cytoplasm are sometimes given outline labels due to the similarity of their features with outline pixels, caused by the varying and inhomogeneous actin signal. Moreover, due to the binary output, that is, a threshold probability value of 0.5, the classifier tends to give thick outlines, because many pixels close to the actual outline have similar features with little variation among them. Also, again due to varying signal strength or noise, the detected outlines are quite often not connected, whereas the desired solution is closed contour outlines for the cytoplasms. Therefore, we need to refine the classifier output and transform it so that we get single-pixel-wide closed outline contours.
In eukaryotic cells, the nucleus is the main indicator of a cell. We have the DNA-channel nuclei images, which provide a solid basis for finding individual cells, or for detecting individual cell cytoplasm outlines in the actin-channel cytoplasm images. In cell images, the nucleus is generally located in the central portion of the cell. Most importantly, we can safely assume that pixels occupied by the nucleus can never be occupied by cell outlines. Therefore, the nuclei images provide contextual information for post-processing of the classifier output. Mainly, they are used to filter out misclassified outline pixels lying inside the cell. In the same context, they are also used to refine the result of the initial segmentation by filling small holes caused by intensity inhomogeneities. This image is then inverted and unified with the filtered outline image to further strengthen the outlines.
Once the outlines are filtered, they are thinned by morphological skeletonization to get single-pixel-wide outline contours. Skeletonization is preferred over morphological thinning since it gives not only an accurate contour in terms of location, but also non-connected branches wherever available. These branches occur either due to discontinuous outlines or due to noisy structures in the original cytoplasm images, and they help in obtaining closed contour outlines. The decision on whether to join these non-connected branches is taken on the basis of object correspondence at the nuclei and cytoplasm level. To find the correspondence, the thinned outlines are applied to the initially segmented images to get the first-stage cytoplasm segmentation. Due to false positives and false negatives in the outline classification, we get over- and under-segmentation. To deal with this, the nuclei images are used to perform an additional step of splitting and merging.
In the splitting and merging step, the nuclei image is first used to morphologically reconstruct the first-stage cytoplasm segmentation image. This separates cytoplasmic regions with a corresponding nucleus from those without one. The latter are saved to be merged in a later part of this step. Among the former, we have two types of correspondence: one-to-one between cytoplasm and nucleus and one-to-many between cytoplasm and nuclei. In the one-to-one case there is one nucleus for every cytoplasm, which is usually the situation in our images as there are very few multinuclear cell phenotypes. Morphological closing is applied to such objects to smooth the inside of the cytoplasm and to remove any non-connected branches occurring due to noise or intensity inhomogeneities.
In the case of one-to-many correspondence, the respective non-connected branches in the outline are extracted and dilated to close the gaps. Skeletonization and morphological reconstruction are applied again to split the regions into nucleus-bearing and non-nucleus-bearing regions. It is worth mentioning that no extra splitting approach is used to force one cytoplasmic region per nucleic region. The reason is that the nuclei used for finding correspondence are themselves affected by over-splitting, and an attempt to forcefully split a cytoplasmic region in the absence of an outline would translate nuclei over-segmentation into cytoplasm over-segmentation. Moreover, our approach also helps in retaining the morphology of the multinuclear cell phenotypes.
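The nucleus-guided separation of cytoplasmic regions can be illustrated with a minimal binary morphological reconstruction, implemented here as iterated conditional dilation; the paper's full pipeline (skeletonization, closing, branch handling) is not reproduced.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def binary_reconstruct(marker, mask):
    """Binary morphological reconstruction: grow `marker` inside `mask`
    by iterated conditional dilation until the result stabilizes."""
    prev = np.zeros_like(mask)
    cur = marker & mask
    while (cur != prev).any():
        prev = cur
        cur = binary_dilation(cur) & mask
    return cur

def keep_nucleated_regions(cytoplasm_mask, nuclei_mask):
    """Split the first-stage cytoplasm segmentation into regions that do
    and do not contain a nucleus (the latter are merged later)."""
    with_nucleus = binary_reconstruct(nuclei_mask, cytoplasm_mask)
    without_nucleus = cytoplasm_mask & ~with_nucleus
    return with_nucleus, without_nucleus
```

Using the nuclei mask as the marker selects exactly the connected cytoplasmic components that overlap at least one nucleus, which is the separation described above.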
Results and discussion
To study and analyze the performance of our segmentation methodology, we test it against two challenging test cases. Both consist of image sets of different cell types with cells of varying size, shape, texture and degree of overlap. The first case is challenging in the sense that it contains images with high cell density and large variation in both the shape and size of the cells. The second test case is more of a validation case: not only does it contain images from a publicly available dataset with ground truth benchmarking, but it also presents an altogether different set of images from the first test case. This enables testing the generalization of our framework. The challenging aspect of the second case, as in the first, is that the cells are so tightly clustered, with virtually no discernible boundaries, that even accurate manual segmentation is sometimes impossible. Moreover, in both cases, the extensive variation in signal strength, intensity inhomogeneity and low contrast make the segmentation task even more challenging.
The details about the experimental settings to perform image acquisition for compiling the dataset for Test Case I and Test Case II are given below.
Test Case I
Experiments were conducted in a 384-well plate format imaging HeLa CCL-2 ATCC cells using Molecular Devices ImageXpress microscopes (10× objective; 9 sites per well, Channels DAPI: DNA, GFP: pathogen, RFP: actin) with robotic plate handling. The objective was 10X S Fluor. Image binning was not used. Gain was set to low (Gain1). Laser-based focusing was enabled and image-based focusing was disabled. The dynamic range was set to 12 Bit Range. The Z-offset for focus was selected manually and AutoExpose was used to get a good exposure time. Manual correction of the exposure time was applied when necessary to ensure a good dynamic range with low overexposure. The size of each image is 1040× 1392 pixels. Manual benchmarks were created by biologists, who drew the cell cytoplasm outlines. Due to the presence of multinuclear phenotypes, there are a few cases of multiple nuclei per cytoplasm. Five images containing around 550 cells were taken which were representative of most of the problematic cases not solved well by a widely used method.
Test Case II
In this test case we use images of Drosophila melanogaster Kc167 cells which were stained for DNA (nuclei) and actin (cytoplasm). "Images were acquired using a motorized Zeiss Axioplan 2 and a Axiocam MRm camera, and are provided courtesy of the laboratory of David Sabatini at the Whitehead Institute for Biomedical Research. First, nuclei were outlined by hand. The nuclear outlines were overlaid on the cell images, and one cell per nucleus was outlined" . There are 16 images in the dataset with size 400× 400, 450× 450 and 512× 512 pixels, containing cells of around 25 pixels in diameter with an average of 80 cells per image. The motivation for using this image set primarily comes from its public availability and benchmarking. Also, these images provide challenging segmentation tasks which have also been worked upon previously, such as in [16, 34, 35]. This helps in examining the proposed method in comparison to the results obtained from these state-of-the-art methods.
Segmentation quality metrics
Segmentation accuracy is measured with precision P = TP/(TP + FP), recall R = TP/(TP + FN), and their harmonic mean, the F-measure FM = 2PR/(P + R), where TP, FP and FN are the numbers of true positives, false positives and false negatives, respectively, with respect to the benchmarked images. The higher the rate of true values and the lower the rate of false values, the higher the segmentation accuracy.
Pixel-level measures give an insight into how accurate the obtained segmentation is in terms of correspondence between cells in the segmented and benchmarked images. For each cell in the benchmarked image, a corresponding cell was found in the segmented image based on maximum overlap. TP, FP and FN values were obtained at pixel level and the FM value was computed. For a correspondence to be accepted, a threshold of FM th = 0.6 was used, following earlier work. Once an object correspondence was found, the object was removed from the segmented image and not considered for any other object in the benchmarked image. In this way, only one-to-one (TP), one-to-none (FN) or none-to-one (FP) correspondences were obtained between the benchmarked and segmented images. This also provided the object-level measure for cytoplasms, that is, every one-to-one correspondence incremented the cell count. Object-level measures for the nuclei were obtained in a similar way to get the nuclei count.
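The per-object acceptance test above reduces to a small helper; the FM threshold of 0.6 is the value used in the text.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (the FM used in the text)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def is_match(tp, fp, fn, fm_th=0.6):
    """Accept a benchmark/segmentation object pair as a true correspondence
    when its pixel-level F-measure reaches the threshold."""
    return f_measure(tp, fp, fn) >= fm_th
```

For example, an object pair with 6 overlapping pixels, 2 spurious pixels and 2 missed pixels has precision 0.75, recall 0.75 and FM 0.75, so it counts as a one-to-one correspondence.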
It is worth mentioning that the nuclei image was not used at all when finding correspondence for cytoplasms. The reason is that over-splitting at the nuclei level may not always cause over-splitting at the cytoplasm level, due to a true absence of an outline. Therefore, using nuclei for finding correspondence may result in wrong quantitative measures.
In both cases, nuclei segmentation was obtained using our previously presented framework. In that framework we used a graph cut segmentation method, which can be replaced with the initial segmentation method proposed here for cytoplasm segmentation. From the results, we observed that although the nuclei segmentation framework with our proposed initial segmentation gives a less smooth result than the framework with graph cut segmentation, quantitatively it reduced twice as many false negatives as it added false positives. The reason is that our initial segmentation method was better at detecting objects in low contrast with varying signal strength than the graph cut method, even though the applied pre-processing was the same. Although the final F-measure values were almost the same in either case, the decrease in false negatives meant an increase in cytoplasm detection, whereas a false positive is not as costly, since the nuclei image does not affect the splitting of cytoplasm regions as long as no underlying outline is detected. For Test Case II, we replaced the graph cut-based initial segmentation of the framework with the initial segmentation method proposed here. As the magnification of these images differs from that of our images, that is, they have fewer pixels per nucleus, the set of scale values must be decreased to avoid objects becoming connected due to larger kernel widths. Therefore, Gaussian filtering was performed with smaller kernel widths, and the scale-space representation was composed of 7 images obtained at scales t = [0, 0.5, 1, 1.5, 2, 2.5, 3] from the original image to get the initial segmentation, as described in the cell cytoplasm segmentation subsection.
Quantitative values obtained from nuclei and cytoplasm segmentation for Test Case I (See text for abbreviations).
Quantitative values obtained from our segmentation method for Test Case II (See text for abbreviations).
For Test Case I, we have nuclei and cytoplasm segmentation results obtained from the CellProfiler 1.0 (CP) implementation. Table 2 also lists the values obtained from it. As we discussed for nuclei segmentation, CP gives a low value for FN, but at the expense of a high value for FP. This high FP value at the nuclei level translated into an even higher value at the cytoplasm level, because the cytoplasm segmentation was purely based on the nuclei segmentation and, effectively, one cytoplasmic region was found for every nucleic region. The difference in values between nuclei and cytoplasm segmentation is largely due to the FM th value of 0.6 for cytoplasm detection: every over-splitting at the nuclei level leads to over-splitting of the cytoplasm, which, most of the time, disqualifies all the cytoplasmic regions corresponding to an over-split nucleus. It is also evident from Table 2 that the FP for cytoplasm became almost twice the FP for nuclei, and those extra FP also directly affect the FN. Finally, the FM value for CP cytoplasm segmentation came out to be 0.84.
As we mentioned earlier, our proposed cytoplasm segmentation mainly needs a low FN for nuclei segmentation because, due to the cytoplasm-nuclei correspondence-based segmentation, cytoplasms for which nuclei are not detected are merged with other cytoplasms. Although the FM values for the CP implementation and our nuclei segmentation do not differ much, the detection error FP + FN for our method was 21, less than half of the 49 for the CP implementation.
Finally, it is evident from the obtained qualitative as well as quantitative results for both test cases that the proposed method was able to produce accurate results; see Tables 2 and 3 and Figures 5 and 6. Moreover, considering that the two test cases provide completely different sets of images with different challenges, the obtained results also demonstrate the generic nature of our framework. In the end, it is worth mentioning that even though the method uses manually outlined images for training the classifier, it does not depend on user-defined parameters for segmentation.
In this article we present a novel approach for cell segmentation. The proposed method uses a new combination of pre-processing methods for enhancing the contrast of cell cytoplasm, and especially of cytoplasm boundaries, by applying the coefficient of variation to a multi-scale Gaussian representation of the input image. The enhanced image is used as the basis of the feature extraction process, where filtering, texture operations and other generic descriptors are applied to build a large set of features for training a classifier model for cell outline detection. By applying regularized logistic regression, which produces sparse models in which only a subset of the initial features is used, a rather simple model with a small set of features is obtained, making the classification process computationally feasible. Finally, in the post-processing phase, cell nuclei segmentation is used to aid the construction of the final cell outlines from the classification output.
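The embedded feature selection step can be illustrated with an L1-penalized (Lasso-type) logistic regression, which drives most coefficients to exactly zero. This is a stand-in sketch using scikit-learn rather than the implementation used in the paper, on synthetic per-pixel features and outline labels of our own construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))                   # 40 generic per-pixel features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # outline label driven by 2 features

# Strong L1 penalty (small C) zeroes out uninformative coefficients,
# performing feature selection as part of model fitting.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.05).fit(X, y)
selected = np.flatnonzero(clf.coef_[0])          # indices of retained features
```

The resulting model uses only the retained features at prediction time, which is what keeps pixel-wise classification of large screening images computationally feasible.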
In order to validate the segmentation method, we used two image sets with different characteristics. The quantitative results confirm that the method performs consistently on the two datasets, and comparison against a widely used method and against values presented in the literature shows that our results are very promising, either improving on or matching the results of earlier methods.
In conclusion, we expect learning-based methods to be useful in challenging segmentation tasks, such as in high-content screening, where low-contrast cells must be accurately segmented in order to maintain high accuracy across challenging phenotypes. Labeled training samples, in this context manually outlined cells in a set of images, are a fundamental requirement for using a supervised segmentation method. In high-content screening the amount of image data is huge, and since validation is in most cases also done against manually segmented images, we feel that the gain in performance justifies the task of creating the training data.
We acknowledge the financial support from TISE graduate school and Nokia Foundation (MF), Academy of Finland project #140052 (Pekka R), and grant 51RT-0-126008 (InfectX) in the frame of SystemsX.ch, the Swiss Initiative for Systems Biology (to Christoph D). We are also very grateful to the biologists at Biozentrum, Dr. Simone Eicher and Dr. Houchaima Ben Tekaya, for providing benchmark images of cell cytoplasm outlines.
The funding for publication of the article comes from the aforementioned projects.
This article has been published as part of BMC Bioinformatics Volume 14 Supplement 10, 2013: Selected articles from the 10th International Workshop on Computational Systems Biology (WCSB) 2013: Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S10
- Mocellin S, Provenzano M: RNA interference: learning gene knock-down from cell physiology. Journal of Translational Medicine. 2004, 2:
- Agaisse H, Burrack L, Philips J, Rubin E, Perrimon N, Higgins D: Genome-wide RNAi screen for host factors required for intracellular bacterial infection. Science. 2005, 309 (5738): 1248-1251. 10.1126/science.1116008.
- Wollman R, Stuurman N: High throughput microscopy: from raw images to discoveries. Journal of Cell Science. 2007, 120 (21): 3715-3722. 10.1242/jcs.013623.
- Yan P, Zhou X, Shah M, Wong S: Automatic segmentation of high-throughput RNAi fluorescent cellular images. IEEE Transactions on Information Technology in Biomedicine. 2008, 12: 109-117.
- Chen C, Li H, Zhou X, Wong S: Constraint factor graph cut based active contour method for automated cellular image segmentation in RNAi screening. Journal of Microscopy. 2008, 230 (2): 177-191. 10.1111/j.1365-2818.2008.01974.x.
- Farhan M, Ruusuvuori P, Emmenlauer M, Rämö P, Dehio C, Yli-Harja O: Graph cut and image intensity-based splitting improves nuclei segmentation in high-content screening. Proc SPIE 8655, Image Processing: Algorithms and Systems XI. 2013
- Wählby C, Lindblad J, Vondrus M, Bengtsson E, Björkesten L: Algorithms for cytoplasm segmentation of fluorescence labelled cells. Analytical Cellular Pathology. 2002, 24 (2-3): 101-111.View ArticlePubMed
- Held C, Palmisano R, Häberley L, Hensel M, Wittenberg T: Comparison of parameter-adapted segmentation methods for fluorescence micrographs. Cytometry, Part A. 2011, 79 (11): 933-945.
- Lindblad J, Wählby C, Bengtsson E, Zaltsman A: Image analysis for automatic segmentation of cytoplasms and classification of Rac1 activation. Cytometry, Part A. 2004, 57: 22-33.
- Garrido A, de la Blanca NP: Applying deformable templates for cell image segmentation. Pattern Recognition. 2000, 33 (5): 821-832. 10.1016/S0031-3203(99)00091-6.
- Brox T, Weickert J: Level set based image segmentation with multiple regions. Pattern Recognition, Springer LNCS. 2004, 3175: 415-423. 10.1007/978-3-540-28649-3_51.
- Vese L, Chan T: A multiphase level set framework for image segmentation using the Mumford and Shah model. International Journal of Computer Vision. 2002, 50 (3): 271-293. 10.1023/A:1020874308076.
- Allalou A, van de Rijke F, Tafrechi R, Raap A, Wählby C: Image based measurements of single cell mtDNA mutation load. Springer LNCS. 2007, 4522: 631-640.
- Leskó M, Kato Z, Nagi A, Gombos I, Török Z, Vigh L, Vigh L: Live cell segmentation in fluorescence microscopy via graph cut. Proc IEEE International Conference on Pattern Recognition. 2010, 1485-1488.
- Russel C, Metaxas D, Restif C, Torr P: Using the Pn potts model with learning methods to segment live cell images. Proc IEEE International Conference on Computer Vision. 2007, 1-8.
- Quelhas P, Marcuzzo M, Mendonça AM, Campilho A: Cell nuclei and cytoplasm joint segmentation using the sliding band filter. IEEE Transactions on Medical Imaging. 2010, 29 (8): 1463-1473.
- Brejl M, Sonka M: Edge based image segmentation: machine learning from examples. Proc IEEE International conference on Neural Networks. IEEE World congress on computational intelligence. 1998, 814-819.
- Prasad M, Zisserman A, Fitzgibbon A, Kumar MP, Torr PHS: Learning class-specific edges for object detection and segmentation. Proc Indian conference on Computer Vision, Graphics and Image Processing. 2006, 94-105.
- Keuper M, Bensch R, Voigt K, Dovzhenko A, Palme K, Burkhardt H, Ronneberger O: Semi-supervised learning of edge filters for volumetric image segmentation. DAGM-Symposium. 2010, 462-471.
- Brejl M, Sonka M: Object localization and border detection criteria design in edge-based image segmentation: automated learning from examples. IEEE Transactions on Medical Imaging. 2000, 19 (10): 973-985. 10.1109/42.887613.
- Zuiderveld K: Contrast limited adaptive histogram equalization. Graphics Gems IV. 1994, San Diego: Academic Press Professional
- Russ JC: The image processing handbook. 1999, CRC Press, 3
- Selinummi J, Ruusuvuori P, Podolsky I, Ozinsky A, Gold E, Yli-Harja O, Aderem A, Shmulevich I: Bright field microscopy as an alternative to whole cell fluorescence in automated analysis of macrophage images. PLoS One. 2009, 4 (10): e7497-10.1371/journal.pone.0007497.
- Lindeberg T: Scale-space theory: a basic tool for analysing structures at different scales. Journal of Applied Statistics, Supplement Advances in Applied Statistics: Statistics and Images: 2. 1994, 21 (2): 225-270.
- Otsu N: A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics. 1979, 9: 62-66.
- Ruusuvuori P, Manninen T, Huttunen H: Image segmentation using sparse logistic regression with spatial prior. Proc IEEE European Signal Processing Conference. 2012, 2253-2257.
- Ojala T, Pietikäinen M, Mäenpää T: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002, 24 (7): 971-987. 10.1109/TPAMI.2002.1017623.
- Jain AK, Farrokhnia F: Unsupervised texture segmentation using Gabor filters. Proc IEEE International conference on Systems, Man and Cybernetics. 1990, 14-19.
- Haralick RM, Shanmugam K, Dinstein I: Textural features for image classification. IEEE Transactions on Systems, Man and Cybernetics. 1973, 3 (6): 610-621.
- Tibshirani R: Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B. 1994, 58: 267-288.
- Friedman JH, Hastie T, Tibshirani R: Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software. 2010, 33: 1-22.
- Carpenter AE, Jones TR, Lamprecht MR, Clarke C, Kang IH, Friman O, Guertin DA, Chang JH, Lindquist RA, Moffat J, Golland P, Sabatini DM: CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biology. 2006, 7 (10): R100-10.1186/gb-2006-7-10-r100.
- Broad Bioimage Benchmark Collection Website. 2013, [http://www.broadinstitute.org/bbbc/BBBC007/]
- Xiong G, Zhou X, Ji L, Bradley P, Perrimon N, Wong S: Segmentation of drosophila RNAi fluorescence images using level sets. Proc IEEE International Conference on Image Processing. 2006, 73-76.
- Jones TR, Carpenter A, Golland P: Voronoi-based segmentation of cells on image manifolds. ICCV Workshop on Computer Vision for Biomedical Image Applications. 2005, 535-543.
- Farhan M, Yli-Harja O, Niemistö A: A novel method for splitting clumps of convex objects incorporating image intensity and using rectangular window-based concavity point-pair search. Pattern Recognition. 2013, 46: 741-751. 10.1016/j.patcog.2012.09.008.
- Danek O, Matula P, de Solorzano CO, Munoz-Barrutia A, Maska M, Kozubek M: Segmentation of touching cell nuclei using a two-stage graph cut model. Springer LNCS. 2009, 5575: 410-419.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.