CP-CHARM: segmentation-free image classification made accessible

Background: Automated classification using machine learning often relies on features derived from segmenting individual objects, which can be difficult to automate. WND-CHARM is a previously developed classification algorithm in which features are computed on the whole image, thereby avoiding the need for segmentation. The algorithm obtained encouraging results but requires considerable computational expertise to execute. Furthermore, some benchmark sets have been shown to be subject to confounding artifacts that overestimate classification accuracy.

Results: We developed CP-CHARM, a user-friendly image-based classification algorithm inspired by WND-CHARM in (i) its ability to capture a wide variety of morphological aspects of the image, and (ii) the absence of a requirement for segmentation. In order to make such an image-based classification method easily accessible to the biological research community, CP-CHARM relies on the widely used open-source image analysis software CellProfiler for feature extraction. To validate our method, we reproduced WND-CHARM's results and ensured that CP-CHARM obtained comparable performance. We then successfully applied our approach to cell-based assay data and to tissue images. We designed these new training and test sets to reduce the effect of batch-related artifacts.

Conclusions: The proposed method preserves the strengths of WND-CHARM: it extracts a wide variety of morphological features directly from whole images, thereby avoiding the need for cell segmentation. Additionally, it makes these methods easily accessible to researchers without computational expertise by implementing them as a CellProfiler pipeline. It has been demonstrated to perform well on a wide range of bioimage classification problems, including on new datasets that have been carefully selected and annotated to minimize batch effects. This provides for the first time a realistic and reliable assessment of the whole-image classification strategy.
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-0895-y) contains supplementary material, which is available to authorized users.

1 Composition of the CHARM feature vector

Figure S1: Composition of the CHARM vector as proposed by [1]. The different feature groups are highlighted in the top row. Features extracted on higher image levels are denoted by a dagger (†), a star (*) and a sharp (#).
The CHARM (Compound Hierarchy of Algorithms Representing Morphology) vector is composed of 1025 elements extracted from the original image, as well as from transforms of the image. This large collection of measurements falls into four main categories, each further composed of several measurement sets: high contrast features, polynomial decompositions, pixel statistics, and textures, all extracted from gray-scale images. The high contrast features category contains information about the elements that compose the image, such as edges and shapes. The textures category contains well-known texture descriptors such as the Haralick [2] and the Tamura [3] texture features. The pixel statistics group is composed of information about the distribution of pixel values over the image, such as histograms with various numbers of bins and statistical moments. Finally, the polynomial decompositions category is built by generating a polynomial, such as a Zernike or Chebyshev polynomial, that approximates the pixel values up to a given error and whose coefficients describe the image. Some of these features are extracted from the original image as well as from transforms and from compound transforms (transforms of transforms) of the image. The Fourier, Wavelet, and Chebyshev transforms, along with compositions of these transforms, are used to give additional higher-level interpretations of the image content; for instance, the Fourier transform reveals the frequency composition of the image. The construction of the CHARM vector is illustrated in Figure S1. A formal definition of each of these elements, along with appropriate references, is given in [1].
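To illustrate the kind of whole-image measurements involved, the following sketch computes a few CHARM-style pixel statistics (multi-bin histograms and statistical moments) on an image and on the log-magnitude of its Fourier transform. This is a simplified stand-in for the actual feature definitions in [1]; the function names, bin counts, and the choice of log-magnitude spectrum are our own illustrative assumptions.

```python
import numpy as np

def pixel_statistics(img, bins=(3, 5, 7, 9)):
    """Illustrative CHARM-style pixel statistics: histograms with various
    numbers of bins plus the first four statistical moments."""
    values = img.ravel().astype(float)
    features = []
    for b in bins:  # histograms with various numbers of bins
        hist, _ = np.histogram(values, bins=b, density=True)
        features.extend(hist)
    mean, std = values.mean(), values.std()
    features.extend([mean, std])
    if std > 0:  # skewness and kurtosis, guarding against constant images
        features.append(np.mean(((values - mean) / std) ** 3))
        features.append(np.mean(((values - mean) / std) ** 4))
    else:
        features.extend([0.0, 0.0])
    return np.array(features)

def charm_like_vector(img):
    """Extract the same statistics on the original image and on the
    log-magnitude of its Fourier transform (one of the CHARM transforms)."""
    fourier = np.log1p(np.abs(np.fft.fft2(img)))
    return np.concatenate([pixel_statistics(img), pixel_statistics(fourier)])

rng = np.random.default_rng(0)
img = rng.random((64, 64))
vec = charm_like_vector(img)
print(vec.shape)  # 2 levels x (3+5+7+9 histogram bins + 4 moments) = (56,)
```

The full CHARM vector applies many more measurement sets and transform compositions in the same spirit, concatenating everything into a single fixed-length descriptor per image.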

Integration in CellProfiler
The feature extraction step of CP-CHARM is carried out in CellProfiler [4], as depicted in Figure S2. The 953 measurements required to build the feature vector are extracted through a user-friendly "pipeline" composed of different modules. This construction offers greater modularity and flexibility: the user is able to easily tune the feature vector content, which would not be possible using the original WND-CHARM implementation without significant programming expertise. Here, the removal of existing modules, the introduction of new ones, or the modification of existing modules' settings in the pipeline is made easy through CellProfiler's interface, as depicted in Figure S2.
Most features extracted by our method require pixel-based computations, which can be computationally expensive for large-scale images. As feature extraction is performed in CellProfiler, which allows for efficient batch processing, one way of dealing with such data is to tile the images into smaller chunks to be processed in batch mode. The final classification result for the large image can then be retrieved as the most represented class among the classification results from the tiles, or as a percentage of the tiles.
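A minimal sketch of this tiling-and-voting strategy (the function names, tile size, and toy classifier below are our own choices for illustration, not part of the CP-CHARM pipeline): the large image is cut into fixed-size tiles, each tile is classified independently, and the image-level label is taken as the majority vote over the tiles, with per-class percentages also reported.

```python
import numpy as np
from collections import Counter

def tile_image(img, tile_size=256):
    """Cut a large 2-D image into non-overlapping tiles, discarding
    incomplete border tiles for simplicity."""
    h, w = img.shape
    tiles = []
    for y in range(0, h - tile_size + 1, tile_size):
        for x in range(0, w - tile_size + 1, tile_size):
            tiles.append(img[y:y + tile_size, x:x + tile_size])
    return tiles

def classify_large_image(img, classify_tile, tile_size=256):
    """Classify each tile independently (e.g., in a batch run) and
    aggregate: majority class plus the fraction of tiles per class."""
    labels = [classify_tile(t) for t in tile_image(img, tile_size)]
    counts = Counter(labels)
    majority = counts.most_common(1)[0][0]
    percentages = {c: n / len(labels) for c, n in counts.items()}
    return majority, percentages

# Toy stand-in classifier: label a tile by its mean intensity.
dummy = lambda t: "bright" if t.mean() > 0.5 else "dark"
img = np.zeros((512, 512))
img[:, :384] = 1.0  # left three quarters bright
label, pct = classify_large_image(img, dummy, tile_size=128)
print(label, pct["bright"])  # bright 0.75
```

In practice, the `classify_tile` step would be a full CP-CHARM feature extraction and classification pass on each tile; reporting the percentages rather than only the majority label preserves information about within-image heterogeneity.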

Study of validation methods
Using our Python implementation of WND-CHARM, which has been shown to give results similar to the published C++ version [5], we performed repeated rounds of training and testing on WND-CHARM's reference suite. As choosing a different validation method than the one initially proposed in WND-CHARM had a direct impact on the measured performance of the algorithm, we investigated the effect of changing WND-CHARM's validation method while keeping the rest of the algorithmic structure unchanged.

Figure S2: CellProfiler user interface with an example pipeline composed of several modules. Each module contains a description of its parameters, which can be modified through the interface. Modules can easily be added to or removed from the pipeline.
First, we gathered results using WND-CHARM's original custom validation method, which we refer to as lone 4-fold cross-validation. Then, under the same experimental conditions, we changed the validation method to 10-fold cross-validation while keeping all other parts of the algorithm unaltered. The results on WND-CHARM's reference suite, presented in Table S1, show that similar median classification accuracies are obtained with either validation method. Using 10-fold cross-validation, we however observe much smaller standard deviations, implying that the classification results are more stable over the different repetitions of the experiment. Considering N rounds of training and testing, this can easily be explained by the fact that 10-fold cross-validation results are subject to two steps of averaging: they correspond to averages of N results, which are themselves obtained by averaging classification results over each of the 10 possible held-out folds. Conversely, lone 4-fold cross-validation results are obtained by direct averaging of N classification results for a particular 3/4 versus 1/4 partition of the data into training and testing sets. The outcome of this experiment allows us to safely compare further results obtained with WND-CHARM and CP-CHARM, at least regarding median classification accuracy, even though their validation methods differ.
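The two averaging steps can be made concrete with a small sketch (a generic repeated k-fold routine written for illustration, not the actual WND-CHARM code): in each of the N repetitions the sample indices are shuffled and split into k folds, an accuracy is obtained on each held-out fold, and the fold accuracies are averaged before being averaged again over repetitions. The lone 4-fold scheme would instead evaluate a single 3/4 versus 1/4 split per repetition, so its per-repetition results are not pre-averaged.

```python
import numpy as np

def kfold_indices(n, k, rng):
    """Shuffle n sample indices and split them into k test folds."""
    idx = rng.permutation(n)
    return np.array_split(idx, k)

def repeated_cv_accuracy(eval_fold, n, k, n_repeats, seed=0):
    """Two-step averaging: mean over the k folds within each repetition,
    then mean (and std) over the n_repeats repetitions.

    eval_fold(train_idx, test_idx) -> accuracy on that train/test split."""
    rng = np.random.default_rng(seed)
    repeat_means = []
    for _ in range(n_repeats):
        folds = kfold_indices(n, k, rng)
        accs = []
        for i, test_idx in enumerate(folds):
            train_idx = np.concatenate(
                [f for j, f in enumerate(folds) if j != i])
            accs.append(eval_fold(train_idx, test_idx))
        repeat_means.append(np.mean(accs))   # first averaging: over folds
    return np.mean(repeat_means), np.std(repeat_means)  # second: over repeats

# Toy evaluator returning a constant accuracy, standing in for a real
# train-and-test step on the given index sets.
toy_eval = lambda train_idx, test_idx: 0.9
mean_acc, std_acc = repeated_cv_accuracy(toy_eval, n=100, k=10, n_repeats=5)
print(mean_acc)  # ~0.9 for the constant toy evaluator
```

With a real, noisy evaluator, the pre-averaging over folds shrinks the spread of the per-repetition results, which is consistent with the smaller standard deviations observed for 10-fold cross-validation in Table S1.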

Supplementary Figures
Figure S3: Misclassification rates per non-overlapping feature "groups". The label under each bar corresponds to the group that has been removed from the CHARM vector for classification using WND (each element listed in Figure S1 belongs to one of the groups). The rightmost box corresponds to the reference classification accuracy distribution using the whole feature set. Results range from 0 (0%) to 1 (100%) and were obtained over 10 runs of training and testing using 10-fold cross-validation. We note that COIL-20 poses a significantly simpler classification problem than the other datasets; there is very little variation in the misclassification rates when any single group of features is dropped, except for the Textures group.
Figure S4: Misclassification rates per non-overlapping feature "levels". The label under each bar corresponds to the level that has been removed from the CHARM vector for classification using WND (v1: features extracted from the original image only; v2: features extracted from transforms of the image; v3: features extracted from transforms of transforms of the image). The rightmost box corresponds to the reference classification accuracy distribution using the whole feature set. All results range from 0 (0%) to 1 (100%) and were obtained over 10 runs of training and testing using 10-fold cross-validation.

Shown are misclassification rates when datasets are classified using a version of the feature vector from which one particular group (the one labeled below each boxplot) has been removed. One would expect a group with strong weights in the above plots to significantly affect the misclassification rate when removed (which could be observed in Figure S3). In some cases, the removal of feature groups containing strong weights indeed impacts classification accuracy (e.g., the Textures group in (e), as seen in Figure S3e). In other situations, however, feature groups collecting strong weights do not seem to play an important role in the final classification result (e.g., the Radon group in (f), as seen in Figure S3f).

Figure S8: Misclassification rates over 10 runs of training and testing using the CHARM-like feature vector and different classifiers.