Development of the ontology
The application ontology was developed following the Basic Formal Ontology (BFO, https://basic-formal-ontology.org/) formalism and conforms to the requirements of OBO Foundry ontologies. The ontology was specified in the Web Ontology Language and developed using the Protégé (version 4.5) ontology editor. We searched OntoBee (http://www.ontobee.org/) and BioPortal (http://bioportal.bioontology.org/) for equivalent concepts in existing ontologies. To avoid duplication, we imported and used these equivalent terms. If no equivalent term was found, we created new terms and definitions and submitted them to an appropriate OBO Foundry ontology for consideration for inclusion. We relied on Uberon [23] for anatomical structures, the Cell Ontology [24,25,26] for cell types, the Protein Ontology [27] for proteins, the Information Artifact Ontology [28] for experimental protocol metadata, and the Relation Ontology [29] for relations between entities. In addition, we ensured that every anatomical term in the application ontology mapped to a corresponding term in the LungMAP ontology [30], which provides a comprehensive anatomical ontology of the developing lung.
Queries to the ontology are made using SPARQL. The application ontology contains the set of anatomical structures related to a specific antibody probe, including descriptions of the intermediate tissue and cell types that make up those structures. The ontology also contains common variants, or aliases, of the probe names used in immunofluorescent microscopy and provides a standard, unambiguous label for each probe.
All the relevant data elements were collected from the Bioinformatics REsource ATlas for the Healthy lung (BREATH) database and transformed into terms and relational expressions, mostly reused from other ontologies (Additional file 1: Table 2). The most critical metadata about the images are the fluorescently labeled probes, which are antibodies against specific proteins. For this reason, we used the general class Antibody Reagent from the Reagent Ontology (REO, http://www.ontobee.org/ontology/REO) and connected it to the targeted protein via the recognizes relation adopted from REO and the antibody ontology AntiO [31]. We then linked a specific cell type to the protein (recognized by the antibody reagent) via a has_part relation. For example, cell X has_part some protein Y and Antibody Reagent Z recognizes some protein Y. In turn, tissues link to cells, and higher-order anatomical structures link to tissues, via the same relationship. The graph illustrating these relationships is shown in Fig. 3a.
To capture the temporal nature of the development process, the ontology also includes age information. We used an existence_starts_at_point relation to capture the relationship between visible entities and age. This relation links an anatomical structure with the specific age at which that structure has developed and is visible within an image. As the images are from samples at only a few specific ages (e.g., E16.5), we used discrete categories for age. For other applications where a continuous time model is desirable, we would define overlapping intervals during which each entity can be found in the development process.
The ontology is used in the process of labeling training data, image segmentation, and mapping of image segments to named anatomical entities. When labeling training data, the ontology provides the list of possible anatomical structures for a given development age. For segmentation, the ontology provides contextual information regarding the location of probes, such as whether a probe is located on the periphery of a structure or found within it. For classification, the ontology constrains the candidate anatomical entity labels allowed, given the inferred knowledge about the image and the specific segment.
Inference over the ontology requires chained SPARQL queries linking a set of probes to their related structures, which can be time intensive. Hence, the relationships between probes and structures are first determined using a set of SPARQL queries and saved as a look-up table (LUT) for quicker access in the image analysis application. These LUTs can be regenerated programmatically when the ontology is updated. The integration of the ontology into the image processing pipeline is illustrated in Fig. 3b.
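The sketch below illustrates how such a LUT could be pre-computed with rdflib; the ontology file name, property IRIs, and SPARQL query are hypothetical placeholders used for demonstration, not the exact queries run by the pipeline.

```python
# Minimal sketch of pre-computing a probe-to-structure look-up table (LUT)
# with rdflib. File name, property IRIs, and query are illustrative
# assumptions, not the exact odifmap implementation.
import json
import rdflib

g = rdflib.Graph()
g.parse("lungmap_app_ontology.owl")  # hypothetical local copy of the ontology

# Hypothetical query: follow recognizes/has_part chains from antibody
# reagents to the anatomical structures containing the targeted protein.
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?probeLabel ?structureLabel
WHERE {
    ?probe      <http://example.org/recognizes> ?protein ;
                rdfs:label ?probeLabel .
    ?structure  <http://example.org/has_part>+ ?protein ;
                rdfs:label ?structureLabel .
}
"""

lut = {}
for probe_label, structure_label in g.query(query):
    lut.setdefault(str(probe_label), []).append(str(structure_label))

# Persist the LUT so the image analysis pipeline can avoid repeated queries.
with open("probe_structure_lut.json", "w") as f:
    json.dump(lut, f, indent=2)
```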
Development of image analysis pipeline
The image analysis pipeline we developed is tailored to the unique properties of immunofluorescent microscopy images, particularly targeting cases where regions of interest are densely packed and numerous. We chose to target these types of images as they are difficult to segment accurately using current computer vision pipelines and highly laborious to segment manually. The pipeline was developed in the Python programming language and is available as an open-source package named odifmap (ontology-driven immuno-fluorescence microscopy analysis pipeline).
Our pipeline makes use of metadata associated with each image sample, including species, development age, image magnification, names of the antibody probes, and the colors associated with those probes. Together, this information defines distinct image sets that are analyzed in separate runs of the pipeline. Training data for a single entity consists of an ontology-defined label and a list of vertices given as x–y coordinates that defines the boundary of a polygonal region.
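For illustration, a single training record might be represented as follows; the key names shown are assumptions for demonstration, not the exact on-disk format used by odifmap.

```python
# Illustrative sketch of one training-data record: an ontology-defined label
# plus the polygon vertices (x-y pixel coordinates) outlining the region.
training_region = {
    "label": "distal acinar tubule",     # term drawn from the application ontology
    "image_id": "example_image_001",     # hypothetical image identifier
    "points": [                          # polygon vertices as (x, y) pixel coordinates
        (412, 118), (455, 131), (471, 170),
        (448, 205), (403, 198), (390, 152),
    ],
}
```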
Development of feature metrics
Classical machine learning approaches require the reduction of pixel-level data to image feature vectors. The limited color palette and gradients in immunofluorescent images can be exploited to generate a compact set of quantitative feature metrics useful for distinguishing between biologically relevant segmentation classes. These features are based on color composition and distribution, region and sub-region shape statistics such as eccentricity and convexity, and simple texture statistics such as the hole area ratio (HAR).
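As a minimal sketch, the shape and texture metrics named above could be computed for a single binary region mask as follows; the exact definitions used by odifmap may differ, and the HAR here is simply the fraction of the filled region occupied by holes.

```python
# Illustrative region-level shape metrics for one boolean region mask.
import numpy as np
from scipy.ndimage import binary_fill_holes
from skimage.measure import label, regionprops

def shape_metrics(region_mask: np.ndarray) -> dict:
    filled = binary_fill_holes(region_mask)
    filled_area = filled.sum()
    hole_area = filled_area - region_mask.sum()

    # eccentricity and convexity (solidity) from scikit-image region properties
    props = regionprops(label(region_mask))[0]
    return {
        "hole_area_ratio": hole_area / filled_area if filled_area else 0.0,
        "eccentricity": props.eccentricity,
        "convexity": props.solidity,  # region area / convex hull area
    }

# Example: a ring-shaped region (a disk with a hole) has a non-zero HAR.
yy, xx = np.mgrid[:64, :64]
dist2 = (xx - 32) ** 2 + (yy - 32) ** 2
ring = (dist2 <= 20 ** 2) & (dist2 >= 8 ** 2)
print(shape_metrics(ring))
```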
We began the development of features using manually segmented regions, extracting the regions and grouping them by label. The extracted regions were visually inspected and analyzed using histograms of their HSV (hue, saturation, and value) channels. The colors present in all of the images in the LungMAP database are blue (nuclei stained with DAPI), black (non-cellular empty space, or "background", due to dark-field microscopy), and red, green, and white (colors typically used for fluorescent probes in immunofluorescent microscopy). Additionally, we observed that in small areas where a targeted antibody was expressed, the fluorescent color would interact with the blue DAPI stain to produce composite colors.
Based on these observations and the histograms of the hue channel in the regions we manually segmented, we partitioned the hue range into 3 main sections we term "major" colors, targeting red, green, and blue. We also binned the value channel into 3 monochromatic colors of black, gray, and white. Finally, we defined "minor" colors as the composite colors of violet, cyan, and yellow. With these 9 color categories established, the entire HSV space was partitioned such that any combination of hue, saturation, and value belongs to a unique color class. The complete HSV ranges for the color classes are shown in Additional file 1: Table 4.
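The sketch below illustrates one such partition of HSV space into the 9 color classes, using OpenCV's 0-179 hue scale; the specific cut-off values are illustrative assumptions, and the actual ranges are given in Additional file 1: Table 4.

```python
# Illustrative assignment of a single HSV pixel to one of the 9 color classes.
# Cut-off values are assumptions for demonstration only.
def classify_hsv(h: int, s: int, v: int) -> str:
    if s < 80:          # low saturation: monochromatic classes
        if v < 64:
            return "black"
        if v < 192:
            return "gray"
        return "white"
    if v < 64:          # dark but saturated pixels still read as black
        return "black"
    # chromatic classes by hue (OpenCV hue range 0-179)
    if h < 15 or h >= 165:
        return "red"
    if h < 45:
        return "yellow"
    if h < 90:
        return "green"
    if h < 105:
        return "cyan"
    if h < 135:
        return "blue"
    return "violet"
```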
In total, there are 179 feature metrics calculated for each region, consisting of 19 metrics repeated for each of the 9 color classes and 8 global contour features. While the LungMAP data sets employ 3 fluorescence colors, more colors can be used in immunofluorescent microscopy [32]. With this in mind, we designed the analysis pipeline library to calculate feature metrics dynamically based on a configurable set of color definitions. By default, the library will use the 9 color definitions described above.
Image pre-processing
Once an image set is selected for analysis, its images are preprocessed to be homogeneous with regard to intrinsic variables, such as illumination and color. In microscopy images, illumination inhomogeneity occurs both within an image and between images. Intra-image variation in light intensity results from a focal light source and manifests as a non-uniform intensity across the field of view (FOV), with higher intensities near the center of the image that fall off toward the edges. Intra-image FOV variation can be significant, with up to 30% variation in intensity from the center to the darkest edge [33]. Inter-image variation can occur in each of the hue, saturation, and value channels. Differences in the placement and brightness of the focal light source, variations in the staining process, and photobleaching are among the sources contributing to inter-image variations in color and intensity. These effects are responsible for images that appear "washed-out" (low saturation values) or images where the DAPI (4′,6-diamidino-2-phenylindole)-stained nuclei appear greenish-blue (variation in hue).
Correcting both inter- and intra-image variation is critical for both the segmentation and classification of regions of interest (ROIs). Regions of a particular structure class may be under- or over-segmented depending on their location within the FOV and on the image to which they belong. The number of pixels in a region, which depends on the quality of the segmentation, coupled with differences in the hue, saturation, and value of pixels within a region, can greatly affect the feature metric calculations. The resulting feature metric variance among regions within a class can, in turn, affect classification performance. Figure 4 shows the effect of these preprocessing steps on homogenizing images in an image set.
To correct the intra-image variation due to the single light source, we create a mask from all the blue pixels in the image. Blue pixels represent the DAPI-stained cell nuclei and are the most common and evenly distributed color in the microscopy images we analyzed. The blue mask is applied to the value channel of the image and the masked value data are fitted using a bivariate Gaussian. The Gaussian fit is then inverted, sampled at each pixel coordinate of the image, and summed with the original unmasked value channel. This process is repeated for every image in the image set.
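A minimal sketch of this gradient correction is shown below, assuming an 8-bit HSV image; the blue hue bounds, the Gaussian parameterization, and the helper name correct_illumination are illustrative assumptions rather than the exact odifmap implementation.

```python
# Illustrative intra-image illumination correction on an 8-bit HSV image.
import cv2
import numpy as np
from scipy.optimize import curve_fit

def correct_illumination(hsv_image: np.ndarray) -> np.ndarray:
    h, s, v = cv2.split(hsv_image)

    # Mask of blue (DAPI-stained) pixels; bounds are illustrative.
    blue = cv2.inRange(hsv_image, (105, 80, 80), (135, 255, 255)) > 0
    yy, xx = np.mgrid[: v.shape[0], : v.shape[1]]

    def gauss2d(coords, amp, x0, y0, sx, sy, offset):
        x, y = coords
        return amp * np.exp(-((x - x0) ** 2 / (2 * sx ** 2) +
                              (y - y0) ** 2 / (2 * sy ** 2))) + offset

    # Fit a bivariate Gaussian to the blue-masked value channel.
    p0 = (v[blue].mean(), v.shape[1] / 2, v.shape[0] / 2,
          v.shape[1] / 4, v.shape[0] / 4, 0.0)
    popt, _ = curve_fit(gauss2d, (xx[blue], yy[blue]),
                        v[blue].astype(float), p0=p0, maxfev=5000)

    # Invert the fitted surface and add it back, flattening the gradient.
    fitted = gauss2d((xx, yy), *popt)
    correction = fitted.max() - fitted
    v_corrected = np.clip(v.astype(float) + correction, 0, 255).astype(np.uint8)
    return cv2.merge([h, s, v_corrected])
```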
Following value gradient correction, the inter-image color variation is corrected, starting with the selection of a reference image from the image set. The blue mask is extracted from each gradient-corrected image to calculate the mean hue of those pixels. The image whose mean hue is closest to the "true blue" of 120 (on a 180-value hue scale) is chosen as the reference image. The images are then converted to the International Commission on Illumination (CIE) L*a*b* space to transfer the color profile from the reference image to the other images in the image set [34].
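A minimal sketch of the reference selection and a Reinhard-style color transfer (channel-wise mean and standard deviation matching in L*a*b* space) is shown below, assuming 8-bit BGR images; the function names and the blue hue bounds are illustrative assumptions.

```python
# Illustrative inter-image color correction: reference selection + Lab transfer.
import cv2
import numpy as np

def mean_blue_hue(bgr_image: np.ndarray) -> float:
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    blue = cv2.inRange(hsv, (105, 80, 80), (135, 255, 255)) > 0
    return float(hsv[..., 0][blue].mean()) if blue.any() else np.inf

def pick_reference(images: list) -> np.ndarray:
    # Reference image: mean blue-pixel hue closest to "true blue" (120).
    return min(images, key=lambda img: abs(mean_blue_hue(img) - 120.0))

def transfer_color(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    src = cv2.cvtColor(source, cv2.COLOR_BGR2LAB).astype(np.float32)
    ref = cv2.cvtColor(reference, cv2.COLOR_BGR2LAB).astype(np.float32)
    for c in range(3):  # match mean and standard deviation channel-wise
        src_mean, src_std = src[..., c].mean(), src[..., c].std() + 1e-6
        ref_mean, ref_std = ref[..., c].mean(), ref[..., c].std()
        src[..., c] = (src[..., c] - src_mean) * (ref_std / src_std) + ref_mean
    corrected = np.clip(src, 0, 255).astype(np.uint8)
    return cv2.cvtColor(corrected, cv2.COLOR_LAB2BGR)
```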
Generation of the signal mask
The homogenized images are processed to enhance the visual contrast of the structures targeted by the probes. In the LungMAP images, blue indicates DAPI staining of nucleated cells and is not utilized as a fluorescent probe color. By clipping the blue channel in RGB color space to the maximum of the red and green channels, areas of high blue intensity are suppressed. This reduces the saturation and value of regions not targeted by any fluorescent probes, thus increasing the visual contrast of the borders between adjacent anatomical structures.
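The clipping step itself is a simple per-pixel operation, sketched below for an RGB image array.

```python
# Sketch of the blue-channel clipping: blue intensities are capped at the
# per-pixel maximum of the red and green channels, suppressing DAPI-only areas.
import numpy as np

def suppress_blue(rgb_image: np.ndarray) -> np.ndarray:
    out = rgb_image.copy()
    r, g, b = out[..., 0], out[..., 1], out[..., 2]
    out[..., 2] = np.minimum(b, np.maximum(r, g))  # clip blue to max(red, green)
    return out
```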
Each contrast-enhanced image is then converted to HSV color space, where the value channel is used as input to a difference-of-Gaussians routine to generate a binary signal mask that highlights boundaries in visually contrasting areas. The boundaries of neighboring structures within the signal mask are often connected, and single structures may be incomplete, preventing the use of the signal mask to segment regions directly. Instead, the HSV image is used to segment candidate ROIs based on color and saturation, and those regions are dilated and compared to the signal mask to infer the boundaries of the final ROIs (Fig. 5).
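A minimal sketch of the signal-mask generation, assuming an 8-bit value channel; the kernel sizes and Otsu thresholding are illustrative choices rather than the exact odifmap parameters.

```python
# Illustrative difference-of-Gaussians signal mask from the HSV value channel.
import cv2

def signal_mask(value_channel, small_ksize=(15, 15), large_ksize=(61, 61)):
    narrow = cv2.GaussianBlur(value_channel, small_ksize, 0)
    wide = cv2.GaussianBlur(value_channel, large_ksize, 0)
    dog = cv2.subtract(narrow, wide)  # highlights visually contrasting borders
    _, mask = cv2.threshold(dog, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```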
Customizable segmentation pipeline
The segmentation pipeline uses a customizable sequence of segmentation stages. Regions generated in each stage are removed from subsequent stages, i.e. the residual is used as the input for the next stage, ensuring that no overlapping regions are generated (Fig. 6). There are two types of segmentation stages, one based on color and the other based on saturation. The type and number of segmentation stages are configured in the odifmap Python library by a list of dictionaries, where each dictionary defines a segmentation stage, and each stage is executed in the order it appears in the list. Configurable parameters for each stage are used to target structures of a particular size range and include a 2-D blur kernel size, a minimum ROI size, and a maximum ROI size. Color stages require an additional parameter listing the color names used to target structures containing a particular combination of fluorescent probes, in addition to the targeted size range. The colors listed in a color stage correspond to the color definitions described previously, and their HSV ranges are combined to create masks targeting structures with that particular combination of probe colors (Additional file 1: Figure 3).
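An illustrative stage configuration might look like the following; the key names and values are assumptions for demonstration and may not match the exact odifmap configuration schema.

```python
# Illustrative segmentation-stage configuration: a list of stage dictionaries
# executed in order, with the residual of each stage feeding the next.
segmentation_stages = [
    {
        "type": "color",                 # color-based stage
        "colors": ["green", "cyan"],     # probe color combination to target
        "blur_kernel": (15, 15),         # 2-D blur kernel size
        "min_size": 2000,                # minimum ROI area in pixels
        "max_size": 200000,              # maximum ROI area in pixels
    },
    {
        "type": "saturation",            # saturation-based stage on the residual
        "blur_kernel": (9, 9),
        "min_size": 500,
        "max_size": 50000,
    },
]
```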
Post-processing of segmented regions
The contours found at each segmentation stage are accepted or rejected by iteratively dilating them against the signal mask. This is necessary because the individual segmentation stages target probes, which are generally bound to sub-cellular components, so the resulting regions may be under-segmented with respect to the complete cellular and structural boundaries. As described previously, the signal mask provides more accurate boundaries for regions of visual contrast but cannot be used to directly generate regions of interest, because the boundaries of adjacent structures are interconnected and the borders of single structures may be broken. The final ROI size is determined by calculating the percentage of "on" signal-mask pixels at each dilation iteration and choosing the iteration at which the signal percentage falls off rapidly (Fig. 5).
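A simplified sketch of this iterative dilation is shown below, assuming 8-bit binary (0/255) masks for the candidate region and the signal; the stopping heuristic shown is an illustrative approximation of the "falls off rapidly" criterion.

```python
# Illustrative growth of a candidate region against the signal mask.
import cv2
import numpy as np

def grow_region(region_mask: np.ndarray, signal: np.ndarray, max_iter: int = 20):
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    masks, fractions = [region_mask], []
    current = region_mask
    for _ in range(max_iter):
        current = cv2.dilate(current, kernel)
        masks.append(current)
        ring = cv2.subtract(current, masks[-2])  # newly added pixels this iteration
        on = signal[ring > 0]
        fractions.append(on.mean() / 255.0 if on.size else 0.0)
    # keep the mask from the iteration just before the largest drop in signal
    best = int(np.argmin(np.diff(fractions)))
    return masks[best + 1]
```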
The contours generated by the automated segmentation stages often contain more detail than is necessary for quantitative analysis. This excess detail manifests as a large number of vertices defining the perimeter of a region, which increases the storage size of the arrays defining the regions and degrades the visual appearance. Thus, the region boundaries are smoothed using the Douglas-Peucker algorithm [35], as implemented in OpenCV [36], to reduce the number of vertices, reducing storage volume and improving visual appeal while having an insignificant effect on the feature metrics passed on to the classification routine.
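A minimal sketch of this smoothing step using OpenCV's approxPolyDP is shown below; the epsilon value, given as a fraction of the contour perimeter, is an illustrative choice.

```python
# Illustrative contour simplification via OpenCV's Douglas-Peucker implementation.
import cv2

def simplify_contour(contour, epsilon_fraction=0.002):
    epsilon = epsilon_fraction * cv2.arcLength(contour, True)   # tolerance in pixels
    return cv2.approxPolyDP(contour, epsilon, True)             # closed contour
```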
Region classification
The pixel data in the post-processed regions are extracted and used to calculate the feature metrics described previously. The feature matrix is then scaled and passed to a classical machine learning pipeline for classification. After evaluating multiple classification strategies, including several deep neural networks as well as more traditional machine learning techniques such as support vector machines (SVMs), we found XGBoost (XGB) [37] to be the most performant classifier.
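A minimal sketch of this classification stage as a scikit-learn pipeline is shown below; the hyperparameters are illustrative defaults, not the tuned values used in the study.

```python
# Illustrative classification stage: feature scaling followed by XGBoost.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

classifier = Pipeline([
    ("scale", StandardScaler()),
    ("xgb", XGBClassifier(n_estimators=200, max_depth=6)),
])

# X_train: (n_regions, 179) feature matrix; y_train: integer-encoded structure
# labels, optionally including a "background" class when comprehensively
# segmented training images are available.
# classifier.fit(X_train, y_train)
# probabilities = classifier.predict_proba(X_test)
```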
The classification pipeline returns the predicted label along with the probabilities for each structure class. The probabilities can then be used to optionally accept or reject regions, for example, by choosing the class with the maximum posterior probability. As noted above, if any of the images in the training data were comprehensively segmented, the prediction results will also include the probability of belonging to the image 'background'. Since every classified region has a class with the highest probability (probability ties are rare), the inclusion of a background class reduces the occurrence of false positives due to partial segmentation or gross over-segmentation of structures, as well as ROIs generated in regions where no structure exists. Regions classified as "background" are then easily filtered from the final results.
Cell segmentation
Once regions are classified, an optional final step in the pipeline is to sub-segment the structure into its component cells. Here, the ontology also plays a role in determining how to segment the structure into cells. The ontology provides the list of tissues and cell types found within a structure as well as whether the probes present in the structure are on the periphery or in the interior of the structure. Together with the intracellular DAPI staining and the fact that black represents non-cellular areas, the non-peripheral probe colors are used to segment areas most likely to contain cells. We developed a recursive binary segmentation procedure to identify nested structures within a region, where each region is recursively partitioned into two sections using spectral clustering, stopping at a configurable minimum size (Fig. 6b). The cell contours are then returned from this recursive binary spectral clustering for display or further analysis. The ontology can provide the anatomical context to tightly constrain the substructures, such as cell types, that can be found within a structure such as the distal acinar tubule. If desired, a rule-based mechanism can then use the ontology to label substructures based on the anatomical constraints and distribution of the dominant probes (Additional file 1: Figure 2).
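A simplified sketch of the recursive binary spectral clustering is shown below, operating on the pixel coordinates of a region; the minimum-size parameter and clustering settings are illustrative, and the odifmap implementation returns cell contours rather than coordinate groups.

```python
# Illustrative recursive binary splitting of a region into cell-sized pieces.
import numpy as np
from sklearn.cluster import SpectralClustering

def recursive_split(coords: np.ndarray, min_size: int = 200) -> list:
    """coords: (n_pixels, 2) array of x-y coordinates belonging to one region."""
    if len(coords) < 2 * min_size:
        return [coords]
    labels = SpectralClustering(
        n_clusters=2, affinity="nearest_neighbors",
        n_neighbors=10, assign_labels="discretize",
    ).fit_predict(coords)
    pieces = []
    for k in (0, 1):
        part = coords[labels == k]
        if len(part) < min_size:      # stop splitting when a piece is too small
            return [coords]
        pieces.extend(recursive_split(part, min_size))
    return pieces
```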