
Efficient automatic 3D segmentation of cell nuclei for high-content screening

Abstract

Background

High-content screening (HCS) is a pre-clinical approach for the assessment of drug efficacy. On modern platforms, it involves fluorescent image capture using three-dimensional (3D) scanning microscopy. Segmentation of cell nuclei in 3D images is an essential prerequisite to quantify captured fluorescence in cells for screening. However, this segmentation is challenging due to variabilities in cell confluency, drug-induced alterations in cell morphology, and gradual degradation of fluorescence with the depth of scanning. Despite advances in algorithms for segmenting nuclei for HCS, robust 3D methods that are insensitive to these conditions are still lacking.

Results

We have developed an algorithm which first generates a 3D nuclear mask in the original images. Next, an iterative 3D marker-controlled watershed segmentation is applied to downsized images to segment adjacent nuclei under the mask. In the last step, borders of segmented nuclei are adjusted in the original images based on local nucleus and background intensities. The method was developed using a set of 10 3D images. Extensive tests on a separate set of 27 3D images containing 2,367 nuclei demonstrated that our method, in comparison with 6 reference methods, achieved the highest precision (PR = 0.97), recall (RE = 0.88) and F1-score (F1 = 0.93) of nuclei detection. The Jaccard index (JI = 0.83), which reflects the accuracy of nuclei delineation, was similar to that yielded by all reference approaches. Our method was on average more than twice as fast as the reference method that produced the best results. Additional tests carried out on three stacked 3D images comprising heterogeneous nuclei yielded average PR = 0.96, RE = 0.84, F1 = 0.89, and JI = 0.80.

Conclusions

The high-performance metrics yielded by the proposed approach suggest that it can be used to reliably delineate nuclei in 3D images of monolayered and stacked cells exposed to cytotoxic drugs.


Background

High-content screening (HCS) platforms are currently state-of-the-art approaches for drug discovery and drug efficacy assessment. Examining the drug response on HCS platforms necessitates the quantification of fluorescent signals captured by automated microscopy systems in single cells. To enable quantification of the fluorescence in whole cells without losing spatial information, modern screening platforms employ high-resolution, three-dimensional (3D) image capture. Segmentation of cell nuclei in 3D images is an essential prerequisite when quantifying the captured fluorescence in whole cells or sub-cellular compartments. However, this segmentation is challenging due to variabilities in cell confluency and morphological characteristics induced by chemotherapeutics, which are often highly cytotoxic. Another challenge that is commonly seen in acquired images is the gradual loss of fluorescence intensity and contrast with the depth of scanning. Despite advances in algorithms for segmenting nuclei in HCS, robust 3D methods that are insensitive to changes in size, shape, texture and staining intensity of nuclei induced by harsh experimental conditions are still lacking. Most of the existing methods can cope with some but not all of these issues. Others are slow and computationally expensive and therefore have limited applicability to HCS tasks.

3D HCS is a powerful tool for measuring drug response and assessing cytotoxicity and cell viability [1,2,3,4,5]. Modern HCS platforms deliver 3D image data in a multiplex format with stacks of optical sections (z-stacks) organized into channels, one for each fluorophore. Based on the nuclear masks segmented from a 3D image, HCS can count cells and perform their morphological and functional profiling. A relatively large number of software platforms are available to generate measurements for HCS studies [6,7,8,9,10]. However, the problem is that different methods yield different count values, and the quality of the delineated 3D masks of nuclei varies when tested on images supplemented with ground truth.

To achieve reliable segmentation of nuclei, numerous semi-automated segmentation techniques have been developed [7, 8, 11,12,13]. However, many of them are tailored to a specific screening task or require prior setting of multiple parameters [3, 14]. Many fully automated methods are restricted by the morphological characteristics of specimens and are thus only able to analyse cells with less complex patterns. Unlike in high-magnification images, where fine chromatin details can be quantified, nuclei in low-magnification images appear round and often uniformly intense, without any visible chromatin texture. As image magnification increases, so does the level of visual detail in cells, which makes nuclei segmentation more challenging and prone to errors.

Existing 3D nuclei segmentation techniques can be divided into three main families: watershed-based, deformable model-based, and level-set-based segmentations [15,16,17,18]. Watershed-based approaches involve markers (seeds) and are computationally efficient. However, pipelines which use watershed-based segmentations frequently require pre- and post-processing procedures to deal with possible over- and under-segmentation [19,20,21]. The performance of watershed-based methods is often determined by the performance of the seed-generation technique that is run prior to the actual segmentation. For instance, one such seed-generation method was proposed in [11]: shape priors derived directly from a non-segmented image were funnelled as features ahead of the actual nuclear segmentation routines. In [11], the output of a 3D radial symmetry transform was used to approximate the location and shape of nuclei. Watershed-based pipelines can be improved by reducing the number of processing steps and hyperparameters that govern the segmentation [7, 8]. In [16], the authors presented an algorithm for multi-cell segmentation and tracking based on the coupled active surfaces framework. Connected objects determined in the first image frame are segmented with a level-set function, with one level-set assigned to each object. Each level-set function is iteratively evolved until convergence criteria are satisfied. In the next step, watersheds are used to perform rough splitting of level-set functions in connected components. The algorithm determines whether existing level-set functions need to be terminated or new functions should be introduced. In the final step, the Radon transform is used to separate the level-set functions of closely positioned nuclei.

Approaches that are based on deformable models require less pre- and/or post-processing but are more computationally intense, especially for 3D applications. Level-set-based segmentations utilize geometric active contours [22,23,24]. To track shapes, they connect pixels that have similar intensity, and they yield a 3D surface or a 2D contour that delineates individual nuclei. Except for a coupling constraint that inhibits overlapping of neighbouring contours, level-set methods do not require explicit parameterization and can yield masks for objects with complex shapes. However, they are computationally expensive because each object (nucleus) has to be represented by a unique level-set function. These requirements increase the computational burden and thus often prohibit the use of level-set methods in HCS applications for specimens with high cell confluency. In [25], the authors proposed a software system named FARSIGHT, which was used for segmentation of various types of three-dimensional (3D) bioimages. It utilized the graph cuts method to separate objects from the background, and a clustering algorithm was used to generate a mask of seeds employed in the seeded watershed segmentation.

Another method [26] has been used to separate clustered nuclei from fluorescence microscopy cellular images. The authors proposed shape markers and a marking function in a watershed-like algorithm. Geometric active contours were used to initially segment 2D images; then, an adaptive H-minima-based algorithm was used to find shape markers that served as seeds for watershed-based splitting of closely spaced nuclei. This method was adapted to process 3D images.

Deep-learning (DL) based methods for 3D segmentation of cell nuclei are the most recent. They utilize artificial neural network models with convolutional and feature extraction layers that perform semantic segmentation of the volumetric image data at the network input and yield a 3D mask of nuclei at its output. Their main advantage is the automatic extraction of learnable image features that the model distils from the training data in order to distinguish nuclei from the background and separate one nucleus from another in aggregates [27,28,29,30]. These approaches utilize variants of the 3D-UNet [31] or Fully Convolutional Neural Network (FCNN) [32] architectures in the model's backbone. Although the DL approaches for 3D nuclei segmentation perform well on images of highly confluent cells, they achieve higher segmentation accuracy when the nuclear mask or the probability scores of the detected nuclei at the model output are converted to a set of markers (seeds) for a watershed segmentation that completes the process. Examples of this two-step approach can be found in [27, 28, 30]. To learn robust features, the 3D DL models need a significant amount of training data, that is, images with every nucleus delineated in 3D. However, preparing such data is time consuming and costly because delineations are often generated manually.

We have developed a new algorithm to robustly delineate and separate nuclei in 3D images of cellular specimens. The algorithm was tested on image data from DNA-demethylating drug screening experiments on human cells. It is based on voxel intensity thresholding to generate a preliminary nuclear mask; this is followed by iterative 3D marker-controlled watershed segmentation to separate the nuclei of adjacent cells. In the last step, the borders of segmented nuclei are adjusted based on local nucleus and background intensities, which makes the approach robust against local changes in image contrast and intensity. The method is dedicated to segmentation of cell nuclei in experiments with stacked and closely adjacent cells grown in well plates, stained, and subsequently imaged in 3D. In comparison to the existing methods, the proposed approach requires only a small set of shape priors. Our approach was developed and tested on high-resolution, 3D, confocal images of human-derived cells that had been exposed to anticancer drugs. The nuclei delineation performance of the proposed method was compared to state-of-the-art methods, including one previously developed by the authors [11]. Our method was also compared with two DL techniques, QCANet [27] and 3DCellTracker [28], chosen because their software frameworks are publicly available. To apply these methods, our images were downsized (to achieve voxel sizes similar to those of the image data used in training) and normalized before segmentation. The segmentation was carried out with the default set of parameters for each reference method.

Results

The aim of the research was to develop a fast and effective nuclei segmentation method for 3D cell specimens. The proposed segmentation method is divided into three main stages. First, the algorithm analyses the entire 3D image matrix, \(Im_{3D}\), in order to determine the statistical parameters of the image and eliminate the background. This stage is performed on a downsampled image. The second stage involves segmentation, which leads to accurate separation of individual nuclei. Finally, the algorithm reconstructs the cells and then upsamples the obtained 3D masks to the original image resolution. Figure 1 shows a detailed block diagram of the developed segmentation algorithm. Initially, the analysed 3D image, \(Im_{3D}\), is reduced by 50% in the XY axes; the Z axis remains unchanged. As a result, the memory requirements and the time needed for image analysis are reduced by up to 4 times. For the reduced 3D image matrix (after scaling, \(Im_{3Dsc}\)), the maximum intensity of all image voxels, \(I_{Max}\), is determined. The 3D image matrix, \(Im_{3Dsc}\), is then converted to a set of sub-matrices that have a size of \(3{\times }3{\times }3\) voxels. The average intensity is determined for each of these sub-matrices. Next, based on an automatically determined initial global threshold, each sub-matrix is classified as background or nucleus. This threshold is calculated by combining the results of two selected global thresholding methods (see the "Methods" section for details). The thus-classified sub-matrices containing nuclear voxels are recorded in the resulting matrix, \(M_{Result}\), and are analysed for single-nucleus or group membership. Each group of voxels is analysed for shape and size and is classified as either a single nucleus (\(NC_{S}\)) or a group of nuclei (\(NC_{G}\)). The groups of nuclei are then divided into single nuclei and the precise shapes of the segmented nuclei are determined. All \(NC_{G}\) voxels are re-segmented with the same algorithm but with a higher threshold until further splitting of the cell group is impossible and the nuclei are classified as \(NC_{S}\).
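The first stage can be summarized with a short sketch. The snippet below is a minimal Python illustration (not the published implementation) of the initial downsizing and the determination of \(I_{Max}\), assuming the stack is held as a numpy array ordered Z, Y, X; the array contents here are a stand-in.

```python
import numpy as np
from scipy.ndimage import zoom

def downsample_xy(im3d):
    """Reduce X and Y by 50% with nearest-neighbour interpolation; Z stays unchanged."""
    return zoom(im3d, zoom=(1.0, 0.5, 0.5), order=0)  # order=0 = nearest neighbour

# Stand-in stack (Z, Y, X); real stacks are 35-50 layers of 1576 x 1576 px
im3d = np.random.randint(0, 4096, size=(40, 64, 64)).astype(np.uint16)

im3d_sc = downsample_xy(im3d)   # Im_3Dsc
i_max = int(im3d_sc.max())      # I_Max, the global maximum intensity
print(im3d_sc.shape, i_max)     # (40, 32, 32) ...
```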

Fig. 1 Block diagram of the cell nuclei segmentation algorithm and its individual stages

The end result of the algorithm's operation is a 3D mask of nuclei, \(NC_{S}\), in the examined 3D image. The separated nuclei are marked with indexes, each containing information about the complete voxel list and nuclear size. The last stage involves image upsampling and adjustment of the nuclear surface mask. The segmented nuclei can be used for further analysis and quantification of specimens. The individual steps of the process are described in detail in the "Methods" section.

All key parameters used in the segmentation (nuclei sphericity, solidity, volume) and cell nuclei splitting (nuclei size after the distance transform) were established empirically based on the training set (10 images with 520 nuclei). The segmentation accuracy in the test set (JI and F1-score) as a function of these parameter values is shown in Tables 1, 2, 3 and 4.

Table 1 Estimation of the final value of \(NC_{SpherThr}\)
Table 2 Estimation of the final value of \(NC_{SolidThr}\)
Table 3 Estimation of the final value of \(NC_{VolMin}\)
Table 4 Estimation of the final factor value of \(dist_{MinTh}\)

Examples of cell segmentation results

Figure 2 below visualizes the results of segmentation of selected specimens using the above-described method. The cases in Fig. 2a-d are the specimens from Fig. 6a-d, respectively. The other cases are the most interesting images from the examined set.

Fig. 2 Segmentation process, comprehensive sample results. The cases in a-d are the specimens from Fig. 6a-d, respectively

When analysing the images, it can be observed that despite the different visibility of cells in the individual layers of the 3D image (cases from Fig. 6), the different confluency of specimens, and the different sizes of single nuclei, the algorithm eliminates the background completely and correctly segments individual cells.

Segmentation of 3D stacks

As the tested 3D images represented monolayer specimens, additional tests were carried out to check the effectiveness of the proposed method on specimens with a greater number of cells and optical layers. For this purpose, we montaged three 3D stacks: \(S_1\), comprising two 3D images (114 layers; #1, #2); \(S_2\), comprising two other 3D images (73 layers; #11, #12); and \(S_3\), comprising the \(S_2\) stack and two additional 3D images (153 layers in total; #10-#13). The stacks were merged along the Z-axis, as shown in the sketch below. Figure 3 shows the results of 3D segmentation of such combined 3D stacks. It can be seen that despite the high cell confluency, the differentiated arrangement of cells in the layers, and the different parameters of the 3D images, the segmentation proceeds in a similar way to single monolayer images. Background elimination is as effective as in the case of monolayer specimens, and the clusters of nuclei of specimens with different parameters do not interfere with the segmentation of the entire 3D stack (Figs. 3, 4).
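Building such montages amounts to concatenating stacks along Z. A minimal sketch, assuming equal X-Y dimensions and numpy arrays ordered Z, Y, X (the array names and shapes are hypothetical):

```python
import numpy as np

# Hypothetical component stacks; the Y and X dimensions must match
stack_1 = np.zeros((57, 788, 788), dtype=np.uint16)
stack_2 = np.zeros((57, 788, 788), dtype=np.uint16)

# S1: two 3D images merged along the Z-axis (114 layers in total)
s1 = np.concatenate([stack_1, stack_2], axis=0)
print(s1.shape)  # (114, 788, 788)
```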

Fig. 3 Segmentation results for stacks S1 (a), S2 (b) and S3 (c-e). Each stack is a result of concatenation of selected stacks from the test set

Fig. 4 Accuracy in 3D stacks

Table 5 below compares the results of analysis of individual (3D image) specimens with the results obtained after combining them along the Z-axis (3D stack). The accuracy metrics for the analogous layers of a 3D image and a 3D stack are similar. The slight differences in these metrics that we observed may be due to differences in the global image characteristics of the 3D images vs. the 3D stacks comprised of these 3D images. These differences result from the different capture conditions of individual images, the different cell types in the component 3D images, or the confocal microscope settings during image acquisition. When different 3D images were combined in the examined artificial 3D stack, very bright (from one component image) and darker (from the other component image) cells appeared; this is especially noticeable in the greater decrease in effectiveness for image id = 13, which was combined into a 3D stack from 4 real 3D images (153 layers; \(S_3\)) (Table 5, example 3). As the algorithm tries to automatically determine the operating parameters based on the statistical data of the entire 3D image, in the case of artificially "mismatched" images the obtained results may be slightly worse, although the values in the real images are better than for the reference methods.

Table 5 Results of segmentation of the artificial 3D stack specimens from Fig. 3

Comparison with other methods

Table 6 summarises the results for all compared methods. These are the average results obtained from the entire set of 27 3D images. It is worth noting that the proposed method achieves better results for almost all of the measured parameters. The LSetCelTrk method [16] achieves a lower JI, but it works many times faster. Farsight [25] is slightly slower, but its results are much better compared to method [11], which yielded the best results among the reference methods.

To perform nuclei segmentation with QCANet, voxel dimensions in our images were adjusted to those in the images the QCANet was trained with. Since voxels in the QCANet training images measured either 0.8 \(\upmu\)m \(\times\) 0.8 \(\upmu\)m \(\times\) 1.75 \(\upmu\)m or 0.8 \(\upmu\)m \(\times\) 0.8 \(\upmu\)m \(\times\) 2.00 \(\upmu\)m, an arbitrarily chosen downsizing factor of 0.15x was applied to the X-Y planes. After voxel adjustment, nuclei in our images had approximately the same X-Y-Z sizes as those in the QCANet training images. The downsized stacks were segmented by the QCANet, and the returned 3D nuclear masks were subsequently upscaled to match the original voxel dimensions and compared with the ground truth. To evaluate 3DCellTracker, the X-Y planes in our original stacks were downscaled to 315 \(\times\) 315 pixels in order to match the pixel size of images used to train the 3DCellTracker. The specification of 3DCellTracker also requires that the input stack comprise 21 layers. Hence, we further interpolated the input stack along the Z axis to generate the 21 layers for sequential processing by the 3DCellTracker. The 3D nuclear mask reconstructed from the outputted 2D masks was then upscaled back to the original resolution to measure nuclei segmentation accuracy against the ground truth.
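The resizing used for both DL comparisons can be sketched as follows; this is our own illustration with scipy, not code from either tool. Using linear interpolation for intensities and nearest-neighbour for label masks (so labels are not blended) is an assumption on our part.

```python
import numpy as np
from scipy.ndimage import zoom

def resize_for_model(stack, xy_factor, out_layers=None):
    """Downscale the X-Y planes; optionally interpolate Z to a fixed layer count
    (e.g. the 21 layers expected by 3DCellTracker)."""
    z_factor = 1.0 if out_layers is None else out_layers / stack.shape[0]
    return zoom(stack, (z_factor, xy_factor, xy_factor), order=1)

def upscale_mask(mask, original_shape):
    """Return a label mask to the original resolution; nearest neighbour keeps labels intact."""
    factors = [o / m for o, m in zip(original_shape, mask.shape)]
    return zoom(mask, factors, order=0)

# e.g. qcanet_input = resize_for_model(stack, xy_factor=0.15)
```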

When tested on our data, the QCANet DL model [27] properly separated all nuclei. The DL model from the 3DCellTracker [28] performed similarly to QCANet, although its average accuracy was lower in high-confluency preparations when compared to preparations with low or moderate cell confluency. To segment nuclei, QCANet required a GPU, whereas the model from the 3DCellTracker utilized the CPU.

The proposed algorithm obtains better results and is also faster than most of the reference methods. The algorithm does not require any user intervention to select or tune the operating parameters, as they are determined automatically on the basis of statistical information obtained from the tested 3D images. Table 7 additionally presents the comparison of segmentation results, taking into account the specimen confluency. Segmentation results of the tested DL methods are presented in Fig. 15. The results of other methods are shown in the article [11].

Table 6 Comparison with other methods (mean values in the test set)
Table 7 Comparison with other methods (with respect to specimen confluency)

Discussion

In recent years, the number of 3D cell image segmentation methods has been growing rapidly. However, the efficiency and precision of delineating cells and their shapes continues to be a big challenge, especially in direct 3D analysis. High segmentation speed with sufficiently high efficiency is difficult to achieve and requires further development of new methods. Another obstacle is the large diversity of tested specimens. The proposed method, owing to the initial image scaling and the final reconstruction of the segmented cells to the real resolution, makes it possible to maintain high efficiency whilst also reducing the analysis time. Compared to other methods, it combines the global approach with the local one. At the initial stage of the analysis, it relies on the global characteristics of the image to eliminate the background; then, after obtaining the approximate shapes of the cells, it uses local parameters for the separation of cell groups and final shape adjustment. The most important element of the first stage is the background elimination and initial nuclei segmentation method proposed by the authors, which combines variable global and local thresholds (for the determination of various clusters of nuclei) with the 3D shape parameters of nuclei within these clusters. The algorithm's operating parameters are determined automatically on the basis of statistical information obtained from the examined 3D image. In the second stage, the most important element among the proposed solutions is the method of generating seeds of single cell nuclei and splitting groups of cells into single cells. Combining the results obtained from the modified 3D distance transform and the 3D seeded watershed allows additional verification of the number of cells that result from splitting. The size range of normal nuclei is determined automatically based on the size of the examined 3D image and the size of the analysed clusters of nuclei subjected to segmentation. For faster operation, background elimination and cell group splitting are performed on images with reduced sizes in the X-axis and the Y-axis. Finally, the cell nuclei are reconstructed after segmentation. The final shape of each reconstructed cell is corrected at 1:1 scale, taking into account the information about the intensity level inside each detected cell. As a result, surface inaccuracies resulting from image scaling are significantly reduced. The developed method works very well for specimens of low, medium, and high confluency (Table 7) and is able to divide large clusters of nuclei consisting of several dozen adjacent cells (examples in Figs. 6d and 12b, c).

The average F1-score and JI achieved by the 3DCellTracker were much lower than those achieved by the QCANet because the 3DCellTracker performed poorly in specimens with high confluency. Both DL models had overall worse performance than the proposed method (Tables 6 and 7). We reason that the proposed method worked better because the DL models are tuned to the spherical object shapes they were trained on, whereas our data include more arbitrary nuclear shapes (spherical, ellipsoid and half-moon) arising from the experimental conditions.

Further improvement in performance may come from accelerating the final shape adjustment algorithm, which, with a large number of cells, has a visible impact on the total analysis time. At the present stage, this algorithm does not use parallel computations, which considerably affects its duration. For better verification of its effectiveness, it seems reasonable to examine the segmentation effectiveness using ground-truth data prepared in the form of 3D masks.

Conclusions

The proposed method allows fully automatic segmentation of 3D cell specimens obtained from a confocal microscope. All processing steps are carried out in 3D, which makes it possible to eliminate the typical problems that occur during 2D segmentation (e.g. not taking into account information from adjacent layers). The obtained results, compared to other methods, demonstrate higher precision, specificity and F1-score, and a comparable JI (Table 6). In comparison with the DL methods, the proposed method works faster (images were downsized to achieve similar voxel sizes) and achieves better results in JI, F1-score, precision and other parameters. The duration of the proposed method is comparable to or shorter than that of the reference methods (and when the final shape adjustment block is excluded, it is additionally shortened at the expense of a slight reduction of the JI). The advantages of the method are its greater speed and efficiency compared to other methods and the use of global and local features at various stages of the algorithm's operation. The proposed algorithm can reliably segment nuclei in real and simulated large stacks of 3D images of cells with various degrees of cell confluency. Owing to the local capture of image characteristics and the multiscale nature of the 3D image processing, our algorithm can satisfactorily cope with a variety of cell types, changes in nuclear morphology induced by cytotoxic drugs, and decreased fluorescence along the 3D stack.

Methods

Dataset

For this project, we reused a set of previously collected high-resolution image data. The method was developed using an independent set of 10 3D images (containing 521 nuclei). For testing, we used a separate set of 27 3D images (#1 to #27) with 2,367 nuclei. These images were acquired to investigate the expression of the 5-methylcytosine marker in cancer cell lines treated with chemotherapeutics that inhibit DNA methylation. The set includes image z-stacks of treated and untreated DU145 human prostate carcinoma cells, and treated and untreated HuH-7 liver carcinoma cells, fixed on glass slides. Besides the 5-methylcytosine marker, the cells were counterstained with 4',6-diamidino-2-phenylindole (DAPI), a common blue-fluorescent dye that binds to DNA. Staining was followed by imaging with a confocal laser-scanning microscope (TCS SP5 X Supercontinuum, Leica Microsystems Inc.). The imaging yielded z-stacks with 35-50 high-resolution 1576 \(\times\) 1576 px serial optical sections with a voxel size of 120 nm \(\times\) 120 nm \(\times\) 250 nm (X-, Y-, and Z-axis) and 12 bit/px fluorescence intensity for each stain. DAPI and 5-methylcytosine signals were recorded in separate channels. For the numerical experiments described in this paper, we repurposed a set of 27 DAPI z-stacks that represented DU145 cells and HuH-7 cells of low (up to 40 nuclei), moderate (41-65 nuclei) and high confluency (73-190 nuclei), and with a high variability of nuclear staining intensity and texture [11]. The mid-optical section in each stack is supplemented with a ground-truth delineation to assess the segmentation performance. All 2,367 nuclei in the whole set of 3D images were outlined manually by an expert.

The specimens were very diverse: some of them showed an inhomogeneous background and disturbances in the outermost layers, making it difficult to separate cells from the background (Fig. 5b-d). Figure 5a additionally shows the obscuration that appears in some specimens, which makes it difficult to delineate the precise shape of cells. The direct application of a local or global threshold value to the whole specimen and all cell nuclei does not enable effective separation of nuclei clusters in all tested specimens. Two types of artifacts can be distinguished in the tested specimens: local noise in individual image layers resulting from the imaging method and imaged cell types (Fig. 5a); and directional noise with a specific trend along the Z-axis (Fig. 5b) and in layers that are not near the microscope objective (Fig. 5c). In such cases, even manual setting of the background intensity threshold to simultaneously separate all cells from the background without greatly affecting the nuclear shape is difficult or even impossible. The proposed method allows these problems to be eliminated, as shown in Fig. 5. This is possible because the threshold values are initially selected in two stages: globally for the entire specimen and then locally for the groups of cells, analysed separately. As the noise level in different areas of the specimen varies, the visibility of the cells differs. The second stage consists of performing proper 3D segmentation of the groups of cell nuclei that are formed after the background elimination. Since not every segmented cluster is a single cell nucleus, and groups of adjacent cell nuclei appear in the image, it is necessary to divide such nuclei groups. The 3D watershed method, the analysis of the results from the modified 3D distance transform, and the 3D morphometric parameters of the nuclei clusters were all used in the cell splitting procedure.

Fig. 5 Examples of images with a manually selected brightness threshold, and the common problems that occur in the specimens. The cell confluency varies from low, through medium, to high (a-d)

The results of background elimination and nuclei cluster segmentation for the examples from Fig. 5 are presented in Fig. 6. The selected examples show that cell confluency varies from low, through medium, to high (Fig. 5a-d). It can be seen that the background was eliminated correctly in each case. In some cases, after the background elimination, the actual single nuclei as well as groups of adjacent cells are immediately separated. In Fig. 6d, after removing the background, a very large cluster of nuclei containing several dozen cells is formed and requires separation. Small areas of residual background (e.g. Fig. 6a) will be removed at a later stage. Voxel clouds surrounding some cells (Fig. 5a) are also removed in the background elimination step. Figure 6a shows some selected groups of adjacent cells, the segmentation of which is discussed below and presented in Fig. 11 in the Segmentation and separation of nucleus groups section.

Fig. 6 Examples of background elimination and nuclei cluster separation for specimens with different confluency (the cases from Fig. 5a-d, respectively)

Figure 7 presents the above-discussed problems (related to several types of image disruption) using the profiles of the examined images. For the examples in Fig. 5, oblique profiles were made, which show how the cells in successive layers stand out against the noise. In order to better present the level and type of noise in the images, the histogram was equalized. Figure 7a shows noise over the entire image area. In Fig. 7b, noise appears in the vicinity of cells and is characterized by a trend similar to the profile direction. In the case of Fig. 7c, the noise level increases again in the upper right part of the image. In Fig. 7d, the noise is relatively low, and the cells are clearly visible. At the same time, it can be observed that the noise level in most of the presented images is not equal across the entire profile (Fig. 7b-d).

Fig. 7 Examples of layers with noise and oblique profiles for all layers in the specimens. The noise level for specimens from Fig. 5a-d, respectively

Background elimination and nucleus segmentation

The first step is to prepare the images for segmentation. Due to the large size (1576 \(\times\) 1576 px) and the several dozen layers along the Z-axis, the examined images were reduced by 50% compared to the originals (in the X and Y axes, to 788 \(\times\) 788 px) using the nearest neighbour method. The image size in the Z-axis remained unchanged. Initial scaling reduces the analysis time and concentrates the nucleus voxels in the outermost layers, where nuclei are significantly rarefied and may be poorly visible. Due to the large variation in the brightness and dynamics of the images, scaling also improves the visibility of cells by averaging the intensity values of voxels inside them. The thus-prepared images are then converted into \(M_{3\times {3}\times {3}}\) sub-matrices sized 3\(\times\)3\(\times\)3 voxels. This procedure enables analysis of each voxel together with its surroundings. Subsequently, each \(M_{3\times {3}\times {3}}\) is classified as background or cellular structure (cell nucleus or a cluster of nuclei). This classification is based on a quick thresholding operation with an automatically determined global threshold. However, it proved impossible to establish a universal global threshold value that would effectively separate all cells from the background in all images. During the trials, it was observed that an experimentally selected threshold value of \(Th_{Int1}\) = 5% of the maximum intensity \(I_{Max}\) (from the 3D matrix of the \(Im_{3D}\) image) provided good background elimination and cell separation results in the examined images (in the case of 6 images, the nuclei were correctly segmented in the first stage). For the other 21 images, this value was not optimal (too many groups of adjacent cells were formed) and further splitting was necessary. As a result of comparing the effectiveness of various threshold values (\(Th_{Int}\) of 5%, 10%, 15% and 20%), it was also established that \(Th_{Int2}\) = 10% \(I_{Max}\) is the maximum value that allows effective separation of nuclei without destroying their structure, which would prevent further segmentation. Therefore, it was assumed that the threshold value used for background elimination should not exceed 10% of the maximum intensity in the image. In order to determine the initial value of the global threshold, \(Th_{Int1}\), for the classification of cell and background voxels (represented by \(M_{3\times {3}\times {3}}\) sub-matrices), several selected methods were tested, including our own, for which the algorithm yielded the best results. Figure 8 shows the results of background elimination and binary mask delineation for a number of selected automatic threshold determination methods based on the image histogram. Complex methods that combine several basic methods were also tested in the noise elimination process, including anisotropic diffusion and rolling-ball background subtraction [33]. However, their computational effort was significantly greater, and they had no influence on the obtained results. Therefore, we proposed our own method of determining the initial value of the global threshold, \(Th_{Int1}\), to eliminate noise and background. This method is based on the assumption that the noise in the examined images falls within a certain range of values (Eq. 1). On this basis, the image brightness range in which the number of noise voxels is the highest is determined.
This method yielded very good results in most images; it effectively responds to the noise trends in the image by increasing the value of the threshold, \(Th_{IntM1}\). In the case of some images, it was impossible to further increase the threshold without damaging the nucleus structure. Further tests confirmed that the use of the triangle thresholding method [34, 35] made it possible in these cases to determine a greater threshold value, \(Th_{IntM2}\), without destroying the nucleus structure, as shown in Fig. 8.

$$\begin{aligned} V_{Th(i)}(x,y,z)&= \left\{ \begin{array}{ll} 1 &{} \quad \text{if } Th_{(i)}-N_{Level}< Im_{3D}(x,y,z) < Th_{(i)}\\ 0 &{}\quad \text{otherwise} \end{array} \right. \\ VoxCount_{(i)}&= \sum _{x,y,z} V_{Th(i)}(x,y,z) \\ Th_{IntM1}&= \mathop {\mathrm {arg\,max}}\limits _{Th_{(i)}} \left( VoxCount_{(i)}\right) \end{aligned}$$
(1)
Fig. 8 Method for determining the background threshold based on the image intensity histogram, and comparison of the results of different methods for the selected layer. The results for two selected specimens (a, b)

where:

  • \(N_{Level}\) (0.05) - the signal-to-noise level in the examined set of images [3, 36]

  • X, Y, Z - dimensions of the 3D image along the X, Y, and Z axes

  • x, y, z - voxel coordinates in 3D, in the ranges \(x= 1,\ldots ,X\), \(y= 1,\ldots ,Y\), \(z= 1,\ldots ,Z\)

  • \(Im_{3D}\) - the examined 3D image

  • \(Th_{(i)} \in \{0.05, \ldots, 0.2\}\) - the successive values of the threshold for which background voxels are counted

As a result, both methods were combined and a solution was adopted in which \(Th_{Int1}\) is determined as the maximum of the results of the two automatic threshold determination methods (Eq. 2), i.e. the method proposed by the authors (Eq. 1) and triangle thresholding [34, 35].

$$\begin{aligned} Th_{Int1} = \max (Th_{IntM1}, Th_{IntM2}) \end{aligned}$$
(2)

where:

  • \(Th_{IntM1}\) - the threshold determined by the method proposed by the authors (Eq. 1)

  • \(Th_{IntM2}\) - the threshold determined by triangle thresholding (not exceeding 10% of \(I_{Max}\))
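A compact sketch of this combination is shown below; it is our Python illustration, not the published code. The 0.01 step over the candidate thresholds and applying the 10% cap to the triangle threshold are assumptions based on the description above; `threshold_triangle` comes from scikit-image.

```python
import numpy as np
from skimage.filters import threshold_triangle

def initial_global_threshold(im3d, n_level=0.05):
    """Combine the noise-peak rule (Eq. 1) with triangle thresholding (Eq. 2);
    returns Th_Int1 as a fraction of the maximum intensity I_Max."""
    i_max = im3d.max()
    norm = im3d / i_max  # normalize intensities to 0..1

    # Eq. 1: pick the candidate threshold whose band (Th - N_Level, Th)
    # contains the most voxels, i.e. the band sitting on the noise peak
    candidates = np.arange(0.05, 0.2001, 0.01)  # step size is our assumption
    counts = [np.count_nonzero((norm > th - n_level) & (norm < th)) for th in candidates]
    th_m1 = candidates[int(np.argmax(counts))]

    # Triangle threshold expressed as a fraction of I_Max,
    # capped at the empirical 10% ceiling described in the text
    th_m2 = min(threshold_triangle(im3d) / i_max, 0.10)

    return max(th_m1, th_m2)  # Eq. 2
```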

Automatically computed \(Th_{Int1}\) values (i.e. about 5%, 6%, 8%, or 10%) depend on the image content. For some images, the computed \(Th_{Int1}\) allows the image to be fully segmented, and the algorithm does not reach the second stage, which requires computing \(Th_{Int2}\). After determining the initial \(Th_{Int1}\) value, the algorithm starts image segmentation with the \(Th_{Int1}\) * \(I_{Max}\) value; it then eliminates the background and detects clusters of nuclei (Eq. 3). At the same time, the resulting matrices \(M_{InitResult}(x,y,z)\) that represent the intensity values are created for each \(M_{Result}(x_R,y_R,z_R)\) element which is a cell nucleus or cluster of nuclei.

$$\begin{aligned} M_{3\times 3\times 3} (x_R,y_R,z_R )&= \mathrm{mean}(Im_{3DSc} (x,y,z)) \\ M_{Result} (x_R,y_R,z_R )&= \left\{ \begin{array}{ll} 1 &{} \quad \text{if } M_{3\times 3\times 3} (x_R,y_R,z_R ) > I_{Max} * Th_{Init}\\ 0 &{} \quad \text{otherwise} \end{array} \right. \end{aligned}$$
(3)

where:

  • \(Th_{Init}\) can take values in the range 0-1

  • \(I_{Max}\) - maximum brightness in the 3D image
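The sub-matrix averaging and classification of Eq. 3 map directly onto a block-reduce operation; below is a minimal sketch (our illustration, using scikit-image's `block_reduce`):

```python
import numpy as np
from skimage.measure import block_reduce

def classify_blocks(im3d_sc, th_init):
    """Average each 3x3x3 neighbourhood (M_3x3x3) and classify it as
    nucleus (1) or background (0) against I_Max * Th_Init, as in Eq. 3."""
    i_max = im3d_sc.max()
    # Mean intensity per 3x3x3 sub-matrix; edges are zero-padded if needed
    m_3x3x3 = block_reduce(im3d_sc, block_size=(3, 3, 3), func=np.mean)
    m_result = (m_3x3x3 > i_max * th_init).astype(np.uint8)
    return m_3x3x3, m_result
```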

Next, each segmented cluster of nuclei is checked for size and shape as well as the possibility of further division; it is then classified as \(NC_S\) or \(NC_G\). The assessment criteria for these clusters are described in Eq. (4):

$$\begin{aligned} NC_{Solid}&= \frac{NC_{Vol}}{NC_{VolConv}} \ge NC_{SolidThr} \\ NC_{Spher}&= \frac{ \sqrt[3]{ 36 \pi {NC_{Vol}}^2}}{NC_{Surf}} \ge NC_{SpherThr} \\ NC_{VolMin}&\le NC_{Vol} < NC_{VolMax} \end{aligned}$$
(4)

where:

  • \(NC_{Vol}\) - 3D nuclear volume

  • \(NC_{VolConv}\) - convex hull volume of the 3D nucleus

  • \(NC_{Surf}\) - 3D nucleus surface

  • \(NC_{Spher}\) - 3D nucleus sphericity

  • \(NC_{SolidThr}\) - threshold value of 3D solidity = 0.95

  • \(NC_{SpherThr}\) - threshold value of 3D sphericity = 0.85

  • \(NC_{VolMin}\) and \(NC_{VolMax}\) - min and max values of the 3D nuclear volumes in the training set

We characterized the nuclear size and shape by sphericity, solidity and volume, which are properties of ellipsoid-like objects [37, 38]. To adjust the \(NC_{SolidThr}\), \(NC_{SpherThr}\) and \(NC_{VolMin}\) values, we measured JI and F1-score after substituting values from a wide range. The final values for \(NC_{SolidThr}\), \(NC_{SpherThr}\) and \(NC_{VolMin}\) were picked where JI and F1-score were the highest (Tables 1, 2, and 3).
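The check of Eq. 4 for one connected 3D object can be sketched as below (our Python illustration; computing the surface area from a marching-cubes mesh is our choice, using scikit-image's `marching_cubes` and `mesh_surface_area`):

```python
import numpy as np
from skimage.measure import (label, regionprops, marching_cubes,
                             mesh_surface_area)

def is_single_nucleus(binary_obj, vol_min, vol_max,
                      solid_thr=0.95, spher_thr=0.85):
    """Evaluate the NC_S criteria of Eq. 4 for one 3D object (boolean array)."""
    props = regionprops(label(binary_obj))[0]
    vol = props.area            # voxel count = NC_Vol
    solidity = props.solidity   # NC_Vol / convex hull volume = NC_Solid

    # NC_Surf from a marching-cubes mesh; pad so the surface closes at borders
    padded = np.pad(binary_obj, 1).astype(float)
    verts, faces, _, _ = marching_cubes(padded, level=0.5)
    surf = mesh_surface_area(verts, faces)

    sphericity = (36 * np.pi * vol ** 2) ** (1 / 3) / surf  # NC_Spher
    return (solidity >= solid_thr and sphericity >= spher_thr
            and vol_min <= vol < vol_max)
```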

Any \(NC_G\) segmented structure that does not meet the criteria for a single cell nucleus is reanalysed. This time, the algorithm uses a larger threshold: \(Th_{Int2}\) * \(I_{Max}\). As a result of increasing the local threshold, groups of cells (undivided in the first stage) are slightly shrunk, which improves their separation from one another. The cells are again subjected to splitting. The proposed method uses a variable local threshold value and a distance transform applied to the image region which contains the clusters of cells in order to split them. This improves the efficiency of cell splitting and segmentation in specimens where the visibility of cells varies greatly or there is noise in a specific direction.

In the case of specimens in which the visibility of all cells was similar, the cells were easy to separate, and there was no interference (e.g. in the form of a directional increase in noise level), \(NC_S\) were efficiently isolated by the algorithm in the first step. When large groups of cells were in contact with each other after thresholding with the global \(Th_{Int1} * I_{Max}\), only the remaining \(NC_G\) clusters of nuclei were reanalysed (with an increased threshold of \(Th_{Int2} * I_{Max}\)). The use of the variable local threshold made it possible to avoid excessive reduction of the area of all cells in the specimen and to shrink only densely clustered cell nuclei.

Figure 9 below shows the case of a high-content specimen for which the algorithm, in the first step (\(Th_{Int1} * I_{Max}\) threshold), segments the subset of single cells correctly; the remaining groups that were not successfully segmented (marked in black) are transferred to the second stage (\(Th_{Int2} * I_{Max}\) threshold; Fig. 9b). It can be seen that in the case of high cell confluency, not all clusters of nuclei may be segmented correctly in the first step, and some may contain adjacent sub-nuclei, as shown in Fig. 9b.

Fig. 9 Tested specimen (a); single cell nuclei (\(NC_S\)) correctly separated, and groups of nuclei (\(NC_G\), in black) transferred for further splitting (b)

Only the remaining \(NC_G\) structures (clusters of nuclei) that are too large and those that could not be separated in the first step (the cases from Fig. 9b, shown in Fig. 10a) are re-segmented (with a higher threshold).

Fig. 10 Second step of analysis of the high cell confluency specimen (a); division of the remaining \(NC_G\) clusters of nuclei from Fig. 9b, marked in black (b)

Figure 10 shows the second step of division of the remaining nuclei clusters in the specimen with the threshold \(Th_{Int2}\). The application of a higher local threshold value shrank the cell shapes, which enabled their separation. The clusters of nuclei that remained adjacent after the first step (Fig. 10a) are now segmented correctly (Fig. 10b, marked with different colours).

Segmentation and separation of nucleus groups

The segmentation in the \(M_{Result}\) matrix is performed locally based on the shape and size criteria and on verification of the possibility of division, by combining the results of the 3D distance transform [39, 40] and the 3D watershed according to the assumptions regarding the parameters of a single nucleus, \(NC_S\), in Eq. (4). The following parameters, which determine the correct shapes of nuclei in 3D, were determined experimentally on the basis of an additional training set that comprised 10 images with over 520 nuclei (the final values were estimated based on Tables 1, 2, 3, and 4):

  • nucleus solidity parameter \(NC_{Solid}\), as in Eq. (4)

  • nucleus sphericity parameter \(NC_{Spher}\), as in Eq. (4)

  • nuclear volume parameter \(NC_{Vol}\) ranging from \(NC_{VolMin}=5*size(Im_{3DSc})*10^{-6}\) to \(NC_{VolMax}=2*size(Im_{3DSc})*10^{-2}\), related to the 3D image resolution, as in Eq. (4)

  • parameter of the size of the separated structure (\(im_{Dist23}\)) on the basis of the modified 3D distance map, according to the relationship \(im_{Dist23}=im_{Dist2}+2*im_{Dist3}\)

The nucleus group division algorithm is based on a modified 3D distance transform (the \(im_{Dist23}\) matrix, which includes only the currently analysed nucleus group) that takes into account the disproportion between the image dimensions along the X, Y and Z axes. The values of the standard 3D distance transform are significantly reduced by the values along the Z axis, which makes it impossible to use that method directly to assess the size of the examined nuclei in 3D.
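Our reading of the \(im_{Dist23}\) definition is sketched below; the assumption that \(im_{Dist2}\) is a slice-wise 2D distance transform and \(im_{Dist3}\) a full 3D one is ours, inferred from the stated goal of compensating for the Z-axis disproportion.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def modified_distance_map(mask):
    """im_Dist23 = im_Dist2 + 2 * im_Dist3 (see the parameter list above)."""
    dist2 = np.stack([distance_transform_edt(layer) for layer in mask])  # per-layer 2D EDT
    dist3 = distance_transform_edt(mask)                                 # full 3D EDT
    return dist2 + 2 * dist3
```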

After separating the cell nucleus structures from the background, the algorithm classifies each structure as a single nucleus or a group of nuclei. Two types of nuclear structures may appear at this stage. The first is a nucleus that meets the shape and size criteria; it is classified as a single, properly separated nucleus, \(NC_S\). The second type is a structure which does not meet the shape and size criteria, \(NC_G\). The algorithm attempts to divide it by analysing the results of the 3D distance transform (based on \(im_{Dist23}\)) and the 3D seeded watershed. When the structure is not divided (the number of generated seeds equals 1) but its volume \(NC_{Vol}\) exceeds the threshold value (Eq. 4), this means that it is a large structure of adjacent nuclei. The entire \(NC_G\) cluster of nuclei is then subjected to splitting with a higher local threshold, \(Th_{Int2}\). The process of cell nucleus division works the same as for \(Th_{Int1}\); only the input data differ: these are the nuclear structures separated from the background after applying the local threshold \(Th_{Int2}\).

The analysed structure (cluster of nuclei or single nucleus) can be classified as \(NC_S\) in four cases:

  (a) in the first stage, when \(NC_{Solid} \ge 0.95\), \(NC_{Spher} \ge 0.85\) and \(NC_{Vol}\) is in the range from \(NC_{VolMin}\) to \(NC_{VolMax}\);

  (b) when it does not meet point (a) but is not too large and cannot be divided;

  (c) when it is divided into sub-nuclei, all sub-nuclei are added (as \(NC_{Single}\));

  (d) when it was not divided in the first stage (at \(Th_{Int1}\)), or it was too large and was not divided in the second stage (at \(Th_{Int2}\)), and there is no further possibility of splitting it or increasing the threshold.

Figure 11 shows the steps of segmentation of an \(NC_G\) to be divided. After determining the values of the modified 3D distance transform (\(im_{Dist23}\)) in the section of the 3D matrix (\(M_{Result}(NC_{GID})\)) containing the tested \(NC_G\) cluster, the acceptable size of nuclei for splitting is calculated with Eq. (6). The performed tests resulted in adopting limit values that define the minimum and maximum cell sizes in relation to the value of the 3D distance transform (Eq. 5) for the examined cluster, \(NC_G\). These values allow the separation of the generated seeds of the created cells that result from the analysis of voxel sets in different cell groups:

$$\begin{aligned} dist_{MaxTh}&= \max (distanceTransform(M_{Result\_NCG})) \\ dist_{MinTh}&= 0.5*dist_{MaxTh} \end{aligned}$$
(5)

The analysis of each \(NC_G\) cluster begins with a threshold for the modified 3D distance map equal to \(dist_{MaxTh}\), which decreases to \(dist_{MinTh}\) (the values are determined automatically for each analysed nuclei group; the best factor value is estimated based on Table 4). The iteration step was \(\Delta _{Dist}=-0.2\) (the fastest analysis rate and nucleus division efficiency). A sketch of this iterative splitting follows below.
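This is a simplified Python illustration, not the published code; interpreting \(\Delta _{Dist}\) as a fraction of \(dist_{MaxTh}\) is our assumption, and, as described further below, the seeds from the last iteration initialize the watershed.

```python
import numpy as np
from scipy.ndimage import label as nd_label
from skimage.segmentation import watershed

def split_group(ncg_mask, im_dist23, delta=0.2):
    """Lower the distance threshold from dist_MaxTh towards dist_MinTh (Eq. 5),
    then split the NC_G group by a seeded watershed on the inverted map."""
    dist_max = im_dist23.max()
    dist_min = 0.5 * dist_max          # Eq. 5
    th = dist_max
    seeds = None
    while th >= dist_min:
        seeds, _ = nd_label(im_dist23 >= th)  # candidate seeds at this threshold
        th -= delta * dist_max                # our reading of Delta_Dist = -0.2
    return watershed(-im_dist23, markers=seeds, mask=ncg_mask.astype(bool))
```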

Figure 11 presents the results of splitting two nucleus groups with different degrees of connection. Figure 11a, b represents the \(NC_G\) clusters from the example in Fig. 6a (marked with a red frame). Both clusters are correctly separated from the background, so the algorithm tries to divide them at a later stage. In the next stages (Fig. 11c, d), it can be seen that as the threshold \(dist_{Th}\) decreases, new seeds are generated that represent the cell centres within the group. The number of iterations necessary to generate seeds differs between clusters: Fig. 11c (2 iterations), Fig. 11d (10 iterations).

Fig. 11 Steps of splitting some selected cell groups from Fig. 6a (a, b). The selected iterations of the first specimen (a-c, e). The selected iterations of the second specimen (b, d-h). The results of splitting the two selected cell groups (i, j)

As the \(dist_{Th}\) value decreases in subsequent iterations, the size of each cell seed in the group increases and new seeds may appear. Therefore, at this stage each seed must be classified as newly created or previously generated (each existing seed is enlarged in the next iteration). All seeds analysed in each iteration are classified as existing or new. The criteria used were the distance between the seed centroids (from the previous and current iterations) and the sum of the values of the 3D distance map in the centroids of these seeds (Eq. 6). The list of generated seeds is then passed to the 3D seeded watershed method, whose task is to separate the voxels of the nucleus groups, \(NC_G\). Owing to such initialization of the 3D seeded watershed function, the efficiency of splitting is improved, which is particularly evident in the case of two cells that are very close together (Fig. 11a and the result in Fig. 11e). Figure 11c, d shows the seeds generated in subsequent iterations. The results of the last iteration are used to initialize segmentation by the 3D seeded watershed. The division of the \(NC_G\) clusters generates the resulting set of \(NC_S\) (Fig. 11e-f). The cluster from Fig. 11a is split into two nuclei and the cluster from Fig. 11b is split into 10 nuclei.

$$\begin{aligned} seed_{New} = \left\{ \begin{array}{ll} false &{} \quad \text{if } distCentr_{Act-Prev} < im_{Dist(Act)}+ im_{Dist(Prev)}\\ true &{} \quad \text{if } distCentr_{Act-Prev} \ge im_{Dist(Act)}+ im_{Dist(Prev)} \end{array} \right. \end{aligned}$$
(6)

where:

  • \(distCentr_{Act-Prev}\) - Euclidean distance between the seed centroids from the current and previous iterations

  • \(im_{Dist(Prev)}\) - value of the 3D distance map at the seed centroid in the previous iteration

  • \(im_{Dist(Act)}\) - value of the 3D distance map at the seed centroid in the current iteration
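Eq. 6 reduces to a single comparison per seed pair; a minimal sketch (our illustration, with centroids given in voxel coordinates):

```python
import numpy as np

def is_new_seed(centroid_act, centroid_prev, im_dist23):
    """Eq. 6: a seed is new when its centroid lies farther from the previous
    centroid than the sum of the distance-map values at both centroids."""
    d = np.linalg.norm(np.asarray(centroid_act) - np.asarray(centroid_prev))
    reach = (im_dist23[tuple(np.round(centroid_act).astype(int))]
             + im_dist23[tuple(np.round(centroid_prev).astype(int))])
    return d >= reach
```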

Figure 12 presents another case: segmentation of a very large cluster of nuclei, \(NC_{G1}\) (containing several dozen sub-nuclei and several dozen \(NC_S\); Fig. 12b). This example shows how effectively the proposed method segments such large clusters. Based on the results of the 3D distance transform in the entire \(NC_{G1}\), the algorithm automatically determines the size range of the nuclei contained in this cluster. It then attempts to divide the high-content \(NC_{G1}\) cluster. Some nuclei in the specimen are already separated in the first step (for \(Th_{Int1}\)); the undivided ones (black; Fig. 12c) are again subjected to splitting (after increasing the local threshold to \(Th_{Int2}\)). Dividing such a high-content cluster directly using one method, e.g. the 3D watershed, does not provide results as good as those of the proposed procedure, which combines the modified 3D distance transform, the 3D watershed and adaptive cell nuclei segmentation. In the end, it can be observed that all individual nuclei in the specimen in Fig. 12a are separated, regardless of the different degrees of connection between them (Fig. 12d).

Fig. 12 Segmentation of a large high-content complex cluster of nuclei. a The selected specimen; b first stage of segmentation, with the global threshold; c second stage, segmentation of the undivided clusters with the local threshold; d final result

Adjustment of nuclear borders

The last stage of 3D image analysis comprises nucleus mask upsampling from 1:3 to 1:1 scale and nucleus surface refinement at 1:1 scale. Voxels inside cells are upsampled using nearest neighbour interpolation. Initially, the surface voxels are upsampled in the same manner. However, since such upsampling yields a surface contour that is \({3}\times {3}\times {3}\) voxels thick, the contour thickness is reduced and the nucleus shape is refined by surface voxel intensity thresholding. Voxels with intensity higher than \(I_{Th} * NC_{MeanInt}\) (where \(NC_{MeanInt}\) is the mean cell intensity excluding the surface, and \(I_{Th}\) is the threshold) are merged with the cell body. Voxels that do not meet this criterion are assigned to the background. \(I_{Th}\) was set experimentally using the training set: \(I_{Th}\) values higher than 0.3 significantly reduced the nucleus volume, whereas lower values had no effect on the final nucleus shape. Therefore, we set \(I_{Th}\) to 0.3. By applying this approach, the mean JI in the test set increased on average by 6%, finally reaching 82% (Table 6). Figure 13 shows how this approach performed. As a result, the final shape and volume of the reconstructed cell more closely match the expert's mask (Fig. 13c-f), and the JI reaches higher values.
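The border refinement can be sketched as follows (our Python illustration for a single nucleus; treating the surface as a one-voxel shell obtained by erosion is our simplification):

```python
import numpy as np
from scipy.ndimage import zoom, binary_erosion

def refine_borders(mask_small, im3d, i_th=0.3):
    """Upsample one nucleus mask to 1:1 scale, then keep only surface voxels
    brighter than I_Th * NC_MeanInt; the rest go to the background."""
    factors = [o / m for o, m in zip(im3d.shape, mask_small.shape)]
    mask = zoom(mask_small.astype(np.uint8), factors, order=0).astype(bool)

    interior = binary_erosion(mask)        # cell body without the surface shell
    surface = mask & ~interior
    mean_int = im3d[interior].mean()       # NC_MeanInt, surface excluded

    keep = surface & (im3d > i_th * mean_int)  # bright surface voxels stay
    return interior | keep
```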

Fig. 13 Example of final nucleus shape adjustment and surface voxel classification: a, d voxels of the cell surface after classification at 1:1 scale (cell voxels in white; background voxels in red); b, e comparison with the expert's mask without shape adjustment; c, f comparison with the expert's mask after shape adjustment (white: expert's mask)

Evaluation of results

In order to evaluate the segmentation accuracy, two assessment criteria previously used in the literature were applied [11, 41, 42]. The first criterion estimates the number of correct and incorrect cell detections [41]; the second estimates the accuracy of tracing [42]. Accuracy was assessed on the basis of 27 ground-truth images containing nuclei outlined by the expert (2,367 in total). The expert's masks generated in 2D layers were compared with the 3D binary masks of the segmented nuclei generated by the algorithm. The accuracy of the algorithms was assessed by comparing the masks in the layers selected by the expert.

The subsequent rows in Fig. 14 show the examined images; the binary masks manually delineated by the expert; the binary masks generated by the algorithm; and the masks delineated by the algorithm and the expert combined in one image for each case.

Fig. 14 Example results of the analysis: examples from Fig. 2e, f, g, h. TP cases are marked in green; not properly separated nuclei are in yellow; FN cases are red; cases where the algorithm designated too small an area in relation to GT, or FP cases, are blue

Fig. 15 Comparison of segmentation results by the reference methods and the proposed method. TP cases are marked in green; not properly separated nuclei are in yellow; FN cases are red; cases where the algorithm designated too small an area in relation to GT, or FP cases, are blue

After a full 3D analysis of the specimens, nuclei segmentation and reconstruction, the algorithm compared the results for the layer selected by the expert; for this layer, it calculated the effectiveness, precision, specificity, F1-score and JI. The above images (Fig. 14) show cellular specimens of varying confluency and high background noise, and the segmentation results obtained with the proposed algorithm. The cases in which the algorithm segmented the cell correctly are marked in green, whereas the cases in which the algorithm failed to effectively separate a group of cells or divided a single nucleus are highlighted in yellow. Red and blue mark cases in which the mask delineated by the algorithm did not agree with the mask selected by the expert or was too small (less than 50% of the area of the expert's mask).

It can be seen that in each case the background is completely eliminated, and the nuclei are properly separated from the background. There is strong agreement between the generated binary masks of cells and the masks selected by the expert. The numbers of cell nuclei determined by the proposed method and the expert are almost identical. The algorithm rarely detects areas where cells are not actually present. In a few cases, it can be observed that the larger masks of cells delineated by the algorithm do not correctly cover the masks of nuclei traced by the expert (yellow in the bottom row). This is because the algorithm did not properly separate the nuclei in the image. Therefore, when a large cell delineated by the algorithm covers more than 50% of a nucleus selected by the expert, it is not rechecked in terms of how much it covers another nucleus when calculating the effectiveness and agreement between the masks. TP refers to cases in which the algorithm correctly selected a nucleus mask which agreed (for over 50% of the area) with the mask selected by the expert. FP cases occur when the algorithm located an area that had been misclassified as a cell, or a single nucleus was unnecessarily divided. When comparing the algorithm results with the ground truth, if no nucleus matched the selected mask or the algorithm did not divide the nucleus group successfully, the cases were classified as FN. The JI for each nucleus was calculated as the quotient of the intersection and the union of the binary masks of the cell segmented by the algorithm and marked by the expert. The mean JI for a test image was calculated as the quotient of the sum of the JI values for all cells and the number of detected nuclei (Eq. 7).

$$JI_{Avg} = \frac{1}{c_{count}} \sum_{i=1}^{c_{count}} \frac{|A_i \cap B_i|}{|A_i \cup B_i|}$$
(7)

where:

  • \(JI_{Avg}\)ā€”average JI in the examined specimen

  • \(A_i\)ā€”binary mask of the i-th cell delineated by the algorithm

  • \(B_i\)ā€”binary mask of the i-th cell delineated by the expert

  • \(c_{count}\)ā€”number of detected nuclei
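As a worked illustration of Eq. 7, the sketch below (again illustrative Python/NumPy rather than the authors' implementation; the function name and input format are our assumptions) averages the per-nucleus Jaccard indices over the matched algorithm and expert mask pairs:

```python
import numpy as np

def average_jaccard(mask_pairs):
    """Mean Jaccard index over matched (A_i, B_i) mask pairs, as in Eq. 7.

    mask_pairs: list of (A_i, B_i) tuples of 2D boolean arrays, where A_i is
    the algorithm's mask and B_i the expert's mask of the i-th nucleus.
    """
    ji_values = [np.logical_and(a, b).sum() / np.logical_or(a, b).sum()
                 for a, b in mask_pairs]
    return sum(ji_values) / len(ji_values)  # divide by c_count
```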

Table 8 presents the results of the efficacy evaluation, based on the numbers of correct and incorrect detections and on the JI, for the cases from Fig. 14.

Table 8 Results obtained for the images from Fig.Ā 14

The results obtained for all the images made it possible to compare the effectiveness of the proposed method with the reference solutions (Tables 6 and 7 in the "Results" section).

Availability of data and materials

Dataset and source code are available at: https://doi.org/10.5281/zenodo.4462097.

Abbreviations

HCS:

High-content screening

\(NC_{S}\):

A single nucleus

\(NC_{G}\):

A group of nuclei

FN:

False negative

FP:

False positive

TN:

True negative

TP:

True positive

JI:

Jaccard Index

References

  1. Bogusz AM, Baxter RH, Currie T, Sinha P, Sohani AR, Kutok JL, Rodig SJ. Quantitative immunofluorescence reveals the signature of active B-cell receptor signaling in diffuse large B-cell lymphoma. Clin Cancer Res. 2012;18(22):6122–35.

  2. Kamykowski J, Carlton P, Sehgal S, Storrie B. Quantitative immunofluorescence mapping reveals little functional coclustering of proteins within platelet α-granules. Blood. 2011;118(5):1370–3.

  3. Gertych A, Farkas DL, Tajbakhsh J. Measuring topology of low-intensity DNA methylation sites for high-throughput assessment of epigenetic drug-induced effects in cancer cells. Exp Cell Res. 2010;316(19):3150–60.

  4. Li S, Xia M. Review of high-content screening applications in toxicology. Arch Toxicol. 2019;93(12):3387–96.

  5. Zock JM. Applications of high content screening in life science research. Comb Chem High Throughput Screen. 2009;12(9):870–6.

  6. Czech E, Aksoy BA, Aksoy P, Hammerbacher J. Cytokit: a single-cell analysis toolkit for high dimensional fluorescent microscopy imaging. BMC Bioinform. 2019;20(1):1–13.

  7. Meijering E, Dzyubachyk O, Smal I. Methods for cell and particle tracking. Methods Enzymol. 2012;504:183–200.

  8. Toyoshima Y, Tokunaga T, Hirose O, Kanamori M, Teramoto T, Jang MS, Kuge S, Ishihara T, Yoshida R, Iino Y. Accurate automatic detection of densely distributed cell nuclei in 3D space. PLoS Comput Biol. 2016;12(6):e1004970.

  9. Beisken S, Meinl T, Wiswedel B, de Figueiredo LF, Berthold M, Steinbeck C. KNIME-CDK: workflow-driven cheminformatics. BMC Bioinform. 2013;14(1):1–4.

  10. Carpenter AE, Jones TR, Lamprecht MR, Clarke C, Kang IH, Friman O, Guertin DA, Chang JH, Lindquist RA, Moffat J, et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 2006;7(10):1–11.

  11. Gertych A, Ma Z, Tajbakhsh J, VelÔsquez-Vacca A, Knudsen BS. Rapid 3-D delineation of cell nuclei for high-content screening platforms. Comput Biol Med. 2016;69:328–38.

  12. Leichner J, Lin W-C. Advances in imaging and analysis of 4 fluorescent components through the rat cortical column. J Neurosci Methods. 2020;341:108792.

  13. Ruszczycki B, Pels KK, Walczak A, Zamłyńska K, Such M, Szczepankiewicz AA, Hall MH, Magalska A, Magnowska M, Wolny A, et al. Three-dimensional segmentation and reconstruction of neuronal nuclei in confocal microscopic images. Front Neuroanat. 2019;13:81.

  14. Dima AA, Elliott JT, Filliben JJ, Halter M, Peskin A, Bernal J, Kociolek M, Brady MC, Tang HC, Plant AL. Comparison of segmentation algorithms for fluorescence microscopy images of cells. Cytometry A. 2011;79(7):545–59.

  15. Harder N, Mora-Bermúdez F, Godinez WJ, Wünsche A, Eils R, Ellenberg J, Rohr K. Automatic analysis of dividing cells in live cell movies to detect mitotic delays and correlate phenotypes in time. Genome Res. 2009;19(11):2113–24.

  16. Dzyubachyk O, Van Cappellen WA, Essers J, Niessen WJ, Meijering E. Advanced level-set-based cell tracking in time-lapse fluorescence microscopy. IEEE Trans Med Imaging. 2010;29(3):852–67.

  17. Lin G, Chawla MK, Olson K, Barnes CA, Guzowski JF, Bjornsson C, Shain W, Roysam B. A multi-model approach to simultaneous segmentation and classification of heterogeneous populations of cell nuclei in 3D confocal microscope images. Cytometry A. 2007;71(9):724–36.

  18. Wang M, Zhou X, King RW, Wong ST. Context based mixture model for cell phase identification in automated fluorescence microscopy. BMC Bioinform. 2007;8(1):1–12.

  19. Koyuncu CF, Arslan S, Durmaz I, Cetin-Atalay R, Gunduz-Demir C. Smart markers for watershed-based cell segmentation. PLoS ONE. 2012;7(11):e48664.

  20. Lin G, Bjornsson CS, Smith KL, Abdul-Karim M-A, Turner JN, Shain W, Roysam B. Automated image analysis methods for 3-D quantification of the neurovascular unit from multichannel confocal microscope images. Cytometry A. 2005;66(1):9–23.

  21. WƤhlby C, Sintorn I-M, Erlandsson F, Borgefors G, Bengtsson E. Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections. J Microsc. 2004;215(1):67–76.

  22. Dufour A, Shinin V, Tajbakhsh S, Guillén-Aghion N, Olivo-Marin J-C, Zimmer C. Segmenting and tracking fluorescent cells in dynamic 3-D microscopy with coupled active surfaces. IEEE Trans Image Process. 2005;14(9):1396–410.

  23. Padfield D, Rittscher J, Thomas N, Roysam B. Spatio-temporal cell cycle phase analysis using level sets and fast marching methods. Med Image Anal. 2009;13(1):143–55.

  24. Wang M, Zhou X, Li F, Huckins J, King RW, Wong ST. Novel cell segmentation and online SVM for cell cycle phase identification in automated microscopy. Bioinformatics. 2008;24(1):94–101.

  25. Bjornsson CS, Lin G, Al-Kofahi Y, Narayanaswamy A, Smith KL, Shain W, Roysam B. Associative image analysis: a method for automated quantification of 3D multi-parameter images of brain tissue. J Neurosci Methods. 2008;170(1):165–78.

  26. Cheng J, Rajapakse JC. Segmentation of clustered nuclei with shape markers and marking function. IEEE Trans Biomed Eng. 2008;56(3):741–8.

  27. Tokuoka Y, et al. 3D convolutional neural networks-based segmentation to acquire quantitative criteria of the nucleus during mouse embryogenesis. npj Syst Biol Appl. 2020;6(32):1–12.

  28. Wen C, Miura T, et al. 3DeeCellTracker, a deep learning-based pipeline for segmenting and tracking cells in 3D time lapse images. eLife. 2021;10:e59187.

  29. Dunn KW, Fu C, Ho DJ, et al. DeepSynth: three-dimensional nuclear segmentation of biological images using neural networks trained with synthetic data. Sci Rep. 2019;9:1–15.

  30. Din NU, Yu J. Unsupervised deep learning method for cell segmentation. bioRxiv. 2021:1–26.

  31. ƇiƧek Ɩ, Abdulkadir A, Lienkamp SS, et al. 3D U-Net: learning dense volumetric segmentation from sparse annotation. Comput Vis Pattern Recognit. 2016:1–8.

  32. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. Comput Vis Pattern Recognit. 2015:1–10.

  33. Sternberg SR. Biomedical image processing. Computer. 1983;16(1):22–34.

  34. Zack GW, Rogers WE, Latt SA. Automatic measurement of sister chromatid exchange frequency. J Histochem Cytochem. 1977;25(7):741–53.

  35. Panneton B. Gray image thresholding using the triangle method. MATLAB Central File Exchange. 2010. Retrieved January 1, 2021. https://www.mathworks.com/matlabcentral/fileexchange/28047-gray-image-thresholding-using-the-triangle-method

  36. Gertych A, Oh JH, Wawrowsky KA, Weisenberger DJ, Tajbakhsh J. 3-D DNA methylation phenotypes correlate with cytotoxicity levels in prostate and liver cancer cell models. BMC Pharmacol Toxicol. 2013;14(1):1–21.

  37. Cruz-Matías I, Ayala D, et al. Sphericity and roundness computation for particles using the extreme vertices model. J Comput Sci. 2019;30:28–40.

  38. Wang R, et al. High resolution microscopy reveals the nuclear shape of budding yeast during cell cycle and in various biological states. J Cell Sci. 2016;129(24):4480–95.

  39. Mishchenko Y. 3D Euclidean distance transform for variable data aspect ratio. MATLAB Central File Exchange. Retrieved January 1, 2021. https://www.mathworks.com/matlabcentral/fileexchange/15455-3d-euclidean-distance-transform-for-variable-data-aspect-ratio

  40. Mishchenko Y. A fast algorithm for computation of discrete Euclidean distance transform in three or more dimensions on vector processing architectures. SIViP. 2015;9(1):19–27.

  41. Coelho LP, Shariff A, Murphy RF. Nuclear segmentation in microscope cell images: a hand-segmented dataset and comparison of algorithms. In: 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro. IEEE; 2009. p. 518–21.

  42. MaÅ”ka M, Ulman V, Svoboda D, Matula P, Matula P, Ederra C, Urbiola A, EspaƱa T, Venkatesan S, Balak DM, et al. A benchmark for comparison of cell tracking algorithms. Bioinformatics. 2014;30(11):1609–17.


Acknowledgements

Not applicable.

Funding

This publication was funded by the University of Silesia, Faculty of Science and Technology, Institute of Biomedical Engineering, and co-funded by the AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering under grant number 16.16.120.773.

Author information

Authors and Affiliations

Authors

Contributions

MM: Conceptualization, Methodology, Software, Validation, Investigation, Resources, Data Curation, Dataset Preparing, Writingā€”Original Draft Preparation, Writingā€”Review & Editing, Visualization, Supervision, Project Administration, Funding Acquisition; AP: Investigation, Data Curation, Writingā€”Review & Editing, Visualization, Funding Acquisition; AG: Conceptualization, Methodology, Validation, Investigation, Data acquisition and availability, Data Curation, Writingā€”Original Draft Preparation, Writingā€”Review & Editing, Supervision, Project Administration. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mariusz Marzec.

Ethics declarations

Ethics approval and consent to participate

All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisherā€™s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Marzec, M., PiĆ³rkowski, A. & Gertych, A. Efficient automatic 3D segmentation of cell nuclei for high-content screening. BMC Bioinformatics 23, 203 (2022). https://doi.org/10.1186/s12859-022-04737-4


Keywords