Template matching for object localization is easy for non-experts to apply and understand. It relies on normalized grey-level comparisons between a template image and successive image patches from a sliding window. It considers the full intensity pattern of the template image, which usually includes both the object and its surrounding context. This intensity signature is highly discriminative, even when the object is partially occluded or the image is noisy, and the normalized score renders the detection robust against shifts in illumination (Additional file 7: Figure S4 and Additional file 8: Figure S5). However, template matching is limited to searching for a single intensity pattern, rendering it sensitive to changes in orientation or perspective of objects. In our approach, we overcome this limitation of existing implementations by combining results from multiple template images to increase the range of detectable intensity patterns. We then use a custom non-maxima suppression (NMS) strategy based on the degree of overlap between predicted bounding boxes to prevent redundant detections of a given object. Using the overlap for NMS is the most generic way to filter detections represented by bounding boxes of possibly different sizes and aspect ratios.
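The two ingredients of this pipeline, a normalized matching score computed over a sliding window and an overlap-based NMS, can be sketched in plain NumPy. This is a minimal illustration, not the tool's actual implementation; the function names and the `max_overlap` default are our own choices.

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized cross-correlation between two same-size arrays.
    The normalization makes the score invariant to linear (gain/offset)
    changes in illumination."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom) if denom else 0.0

def match_template(image, template):
    """Brute-force sliding-window search; returns (score, (x, y, w, h)) for
    every window position. Production implementations use FFT-based
    correlation instead of this explicit loop."""
    th, tw = template.shape
    ih, iw = image.shape
    return [
        (ncc(image[y:y + th, x:x + tw], template), (x, y, tw, th))
        for y in range(ih - th + 1)
        for x in range(iw - tw + 1)
    ]

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def nms(detections, max_overlap=0.25):
    """Greedy overlap-based NMS: walk detections by decreasing score and
    discard any box that overlaps an already-kept box too much."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept_box) <= max_overlap for _, kept_box in kept):
            kept.append((score, box))
    return kept
```

Because both box sizes enter the IoU computation, the same filter works unchanged when templates of different sizes or aspect ratios produce differently shaped boxes.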
Template images are typically generated by cropping objects of interest from source images, but the tool can also automatically generate additional templates through geometrical transformations. To ensure user-friendliness, we restricted the optional template transformations to flipping and rotation, which represent the most common object transformations expected in microscopy images. However, the tool accepts an arbitrary number of template images representing other potential transformations such as scaling or distortion. While our implementation accepts arbitrary discrete rotation angles, the extent of the rotational search should be balanced against the required detection efficiency to keep the number of generated templates, and thus the computation time, low. In this study, image examples originating from zebrafish screening studies [17] were obtained using sample-mounting strategies that constrain the rotational orientation of the specimens [18], thus eliminating the need for rotational search and providing datasets that can be rapidly profiled with template matching approaches. A number of methods have been reported in the literature to avoid repeated searches with rotated templates [27,28,29,30,31,32]. However, the complexity of these approaches limits their usability by non-experts, or their implementations are not publicly available.
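Generating the flipped and axis-aligned rotated variants of a template is straightforward with NumPy; the sketch below uses our own naming and covers only 90° multiples, since arbitrary angles would additionally require interpolation (e.g. `scipy.ndimage.rotate`).

```python
import numpy as np

def augment_templates(template, rotate=True, flip=True):
    """Generate flipped and 90-degree-rotated copies of a template, the
    common object transformations expected in microscopy images."""
    variants = [np.asarray(template)]
    if rotate:
        # np.rot90 rotates counter-clockwise by k * 90 degrees
        variants += [np.rot90(template, k) for k in (1, 2, 3)]
    if flip:
        # Horizontal flips of every rotation also cover vertical flips
        variants += [np.fliplr(v) for v in list(variants)]
    return variants
```

For a non-symmetric template this yields 8 variants (4 rotations times 2 mirror states); symmetric templates produce duplicates that could optionally be pruned before the search.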
Template matching can also be used for supervised classification, e.g. for nearest-neighbour search based on a set of annotated templates. This approach has the advantage of requiring no image pre-processing and no computation or selection of features. However, the quality of the classification depends heavily on the choice of templates, and the method is sensitive to rotation and scale.
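Such a nearest-neighbour classifier reduces to picking the annotated template with the highest normalized score. The sketch below is hypothetical and not part of the tool's API; `ncc` is a plain zero-mean normalized cross-correlation as used for the matching score.

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized cross-correlation between same-size arrays."""
    p = patch - np.mean(patch)
    t = template - np.mean(template)
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom) if denom else 0.0

def classify(patch, labelled_templates):
    """Nearest-neighbour classification: return the label of the annotated
    template that best matches the patch. No feature computation is needed,
    but the result depends entirely on the chosen templates and is
    sensitive to rotation and scale."""
    label, _ = max(labelled_templates, key=lambda lt: ncc(patch, lt[1]))
    return label
```

Note that the normalization also makes the classifier tolerant to linear intensity changes between the patch and the templates.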
Besides the number of templates, the computation time is a function of the image and template sizes. As illustrated in the Results section (Additional file 7: Figure S4D, Additional file 9: Figure S6D, Additional file 11: Figure S8D and Additional file 1: Movie S1), the processing speed can be drastically increased by limiting the analysis to a search region in which the object of interest is expected. Alternatively, the search could be performed on downscaled versions of the image and templates, followed by rescaling and placement of the bounding boxes. Yet, because downscaling degrades the searched intensity pattern, this approach may lead to non-specific and potentially shifted detections; we therefore do not provide this option in the Fiji and KNIME versions. Advanced users can refer to the online tutorial of the Python implementation for an example of how to use downscaling to accelerate the detection (see Additional file 13).
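Restricting the search region, and mapping boxes back after a downscaled search, can be sketched as follows. This is illustrative only; `matcher` stands for any function returning `(score, (x, y, w, h))` detections and is our assumption, not part of the tool.

```python
import numpy as np

def search_in_roi(image, template, roi, matcher):
    """Match only inside roi = (x, y, w, h), then shift the resulting boxes
    back to full-image coordinates; the cost drops roughly with the ratio
    of the ROI area to the full image area."""
    x0, y0, w, h = roi
    dets = matcher(image[y0:y0 + h, x0:x0 + w], template)
    return [(s, (bx + x0, by + y0, bw, bh)) for s, (bx, by, bw, bh) in dets]

def upscale_box(box, factor):
    """Map a box found in a downscaled image back to original coordinates.
    Positions can be off by up to `factor` pixels, one reason downscaling
    may yield shifted detections."""
    return tuple(round(v * factor) for v in box)
```

A dummy matcher suffices to see the coordinate bookkeeping: a detection at `(2, 3)` inside an ROI anchored at `(10, 20)` maps back to `(12, 23)` in the full image.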
The current implementation mainly targets the analysis of single-channel grayscale images. RGB image data can be used as input but is automatically converted to a grayscale average projection. Expanding the tool to also exploit the colour information of objects of interest would require further major developments. Template matching originates from machine-vision applications for the automated inspection of two-dimensional image data. Nevertheless, for certain applications it could also be used to search for the most probable XYZ position of an object in volumetric data, provided that the object can be robustly discriminated within single z-slices.
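The channel projection and the per-slice volumetric strategy described above could look like the following sketch. This is our own untested proposal, not a feature of the tool; `matcher` again stands for any 2D matching function returning `(score, (x, y, w, h))` detections.

```python
import numpy as np

def to_grayscale(rgb):
    """Average projection over the channel axis, mirroring the automatic
    RGB-to-grayscale conversion applied to colour input."""
    return np.asarray(rgb, dtype=float).mean(axis=-1)

def best_xyz(stack, template, matcher):
    """Run 2D matching slice by slice and return (z, score, box) for the
    overall best detection; only sensible if the object can be robustly
    discriminated within single z-slices."""
    candidates = []
    for z, z_slice in enumerate(stack):
        score, box = max(matcher(z_slice, template))
        candidates.append((score, z, box))
    score, z, box = max(candidates)
    return z, score, box
```

The best-scoring slice then provides the z coordinate, while the box from that slice provides the XY position.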