Semi-automated quantitative Drosophila wing measurements

Background: Drosophila melanogaster is an important organism used in many fields of biological research such as genetics and developmental biology. Drosophila wings have been widely used to study the genetics of development, morphometrics and evolution, so there is much interest in quantifying wing structures of Drosophila. Advances in technology have increased the ease with which images of Drosophila can be acquired. However, such studies have been limited by the slow and tedious process of acquiring phenotypic data.

Results: We have developed a system that automatically detects and measures key points and vein segments on a Drosophila wing. Key points are detected by performing image transformations and template matching on Drosophila wing images, while vein segments are detected using an Active Contour algorithm. The accuracy of our key point detection was compared against key point annotations by users. We also performed key point detection using training data sets of Drosophila wing images of different sizes. We compared our software with an existing automated image analysis system for Drosophila wings and showed that our system performs better than the state of the art. Vein segments were measured manually and compared against the measurements obtained from our system.

Conclusion: Our system detects specific key points and vein segments from Drosophila wing images with high accuracy.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1720-y) contains supplementary material, which is available to authorized users.


Background
Recent advances in high-throughput sequencing technology have enabled us to obtain genome information from virtually any organism [1]. For humans, more than a thousand genome sequences, and the single nucleotide polymorphisms among them, have been characterized [2]. This has motivated genome-wide association studies (GWAS) in the hope of discovering the genomic variations responsible for genetic diseases and traits. The strategy of investigating the relationship between genotype and phenotype should, in principle, uncover the genetic basis of any inherited trait in any organism. However, in many higher animals other than humans, describing phenotype requires manpower with expertise in the morphology of the animals under consideration; this dependency on available expertise can be a critical bottleneck for GWAS. Hence, the development of an automated system for describing phenotype may be one of the best solutions to this problem. Fruit flies, in particular Drosophila melanogaster, are important organisms for biological research, particularly in genetics and developmental biology. Moreover, the complete genome sequences of this species and its related species [3] opened up an ideal opportunity for comparative genomics analyses aimed at a systematic understanding of phenotype-genotype relationships. Species belonging to the genus Drosophila are particularly useful for this purpose because there are more than 2000 described species in the genus [4], each of which has distinct phenotypic characters.

(*Correspondence: leehk@bii.a-star.edu.sg. † Equal contributors. 1 Imaging Informatics Division, Bioinformatics Institute, 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore. 3 Research Center for Genomics and Bioinformatics, Tokyo Metropolitan University, Hachioji, Tokyo 192-0397, Japan. Full list of author information is available at the end of the article.)
On the other hand, this situation makes it difficult for ordinary Drosophila researchers to distinguish morphological differences among different species and only a limited number of researchers who are experts in Drosophila taxonomy can recognize key morphological characters to identify the species.
The use of wing morphometrics for species identification has been demonstrated on a variety of insect species [5][6][7][8][9], because wings are simple two-dimensional structures with clear-cut patterns of veins and vein crossings, which are frequently used as landmarks for measuring distances and angles. There is much interest in quantifying wing structures of Drosophila [10][11][12][13][14], and Drosophila wings have been widely used to study the genetics of development, morphometrics and evolution [15][16][17][18][19][20]. Advances in technology have increased the ease with which images of insect wings can be acquired. However, the extraction of phenotypic information for further research and analysis remains a manual and time-consuming process. This has motivated the creation of several software tools that aim to automate this task. One common approach uses software for easy annotation of landmarks on the wing (http://life.bio.sunysb.edu/ee/rohlf/software.html, http://www.hockerley.plus.com/). However, annotating landmarks is time consuming and prone to errors [21]. More advanced software attempts to find the locations of landmarks and vein lengths using a variety of image processing and optimization methods; these methods tend to involve fitting a set of Bézier splines onto the wing veins. WINGMACHINE [22] is a program that automatically detects and measures the positions of veins and edges of wing blades from live flies. MorphoJ [23] performs geometric morphometrics analysis on morphological landmark inputs. Although MorphoJ does not detect and extract landmarks, it is a useful tool for studying shape data combined with molecular genetic or ecological information. Automated systems that detect and extract information from the wings of a variety of other insect species exist [24][25][26][27][28]. However, only a few of these programs are able to extract the specific key points that are used for predicting the fly species. Crnojevic et al. [28] trained an SVM classifier based on Histogram of Oriented Gradients (HOG) and Complete Local Binary Pattern (CLBP) features for vein junction detection in hoverflies; the junctions were then used to construct convex hulls that discriminated four different hoverfly species. Thus, more specific and customized software is needed to automate the detection and prediction process for Drosophila wings. As mentioned previously, the wing morphology of Drosophila species is highly favorable for automated image analysis [22]. Therefore, in this paper we introduce an automated system that locates specific key points on a Drosophila wing image and performs morphometric measurement of several important vein segments. The extraction of these phenotypic data serves as a basis for future studies of Drosophila wings, such as species prediction.
The software requires the user to annotate three specific key points on the fly wing before the remaining key points and vein segments are calculated automatically. The program provides an intuitive user interface, and users can manually edit key points and vein segment lengths if a prediction is inaccurate. In this contribution, we report the performance of our program and compare it against WINGMACHINE [22], an existing program that automates the measurement of Drosophila wings.

Data set
A total of 959 flies from 16 species were sampled (Table 1). A wing from one side was taken from each individual and placed in 8 μl of mounting solution (ethanol : glycerol = 2 : 3) dropped on a glass slide. After a coverslip was placed, the slide was placed under a stereomicroscope (Nikon SMZ-10A) and the wing photograph was taken with a digital camera (Canon EOS 40D); the original image resolution was 1936×1288, 2816×1880, or 3888×2592 pixels. A total of 600 images were used for image processing. Figure 1 shows the arc lengths to be measured as well as the mean and standard deviation of the pixel intensities of all 600 images. To conserve memory and improve processing speed, small image patches around the manually annotated key points were cropped and stored in separate image patch files; in practice, only image patches near the key point locations are needed for key point detection. All images were also converted to gray scale. Let the annotated images be denoted by D_m, where m = 1, ..., N and N = 600 is the total number of images.

Key point detection
To identify a species from a wing image, 13 key points are needed. The key points are labelled 'a' to 'm' and are shown in Fig. 1. Key point 'k' is the boundary of the costal fringe and the others are intersections of veins. Given a new, unannotated fly wing image I, key point detection is carried out in two steps. The first step aligns the annotated images D_m with the input image I. In the second step, key points are detected using template matching.
To align the input image I with D_m, the user is required to click the locations of the three key points 'a', 'h' and 'k' on I (see Fig. 1). A screen shot of the software can be found in Fig. 2. An affine transformation matrix Q_m is then calculated by matching the key points 'a', 'h' and 'k' in I and D_m. Hence, each image in the data set is associated with one affine transformation Q_m, m = 1, ..., N. This transformation matrix is then used to transform each data set image D_m to its aligned version D'_m. Figure 3 shows an example of transforming D_m to D'_m. In our implementation, only pixels within a small image patch near the annotated key points of D_m are transformed; this greatly increases processing speed. As an affine transformation is used to align the images, the software can handle fly wings of any orientation.
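An affine transformation is fully determined by three point correspondences, so Q_m can be obtained by solving a small linear system. The sketch below illustrates this under our own naming conventions; it is not taken from the published software.

```python
import numpy as np

def affine_from_three_points(src, dst):
    """Solve for the 2x3 affine matrix Q mapping three source points
    (e.g. key points 'a', 'h', 'k') onto three target points.
    src, dst: arrays of shape (3, 2)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: each row is (x, y, 1).
    A = np.hstack([src, np.ones((3, 1))])   # (3, 3)
    # Solve A @ Q.T = dst for the 2x3 matrix Q.
    Q = np.linalg.solve(A, dst).T           # (2, 3)
    return Q

def apply_affine(Q, pts):
    """Map an (n, 2) array of points through the affine matrix Q."""
    pts = np.asarray(pts, dtype=float)
    return pts @ Q[:, :2].T + Q[:, 2]
```

Because the solution includes rotation, scaling, shear and translation, the same code handles wings photographed at any orientation, as noted above.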

Template matching
Image patches from I and D'_m are matched, and the best match locations are predicted to be the key point locations. Define w_I(x, y, θ) to be an image patch centred at (x, y), oriented at an angle θ, extracted from the image I. Similarly, let w_D'm(x, y, θ) be an image patch extracted from the image D'_m. The objective is to find the best match between w_I and w_D'm by adjusting the image patch parameters m, x, y and θ.
We describe the template matching procedure using key point j as an illustration; the procedure for the other key points is similar. The location (x_j^m, y_j^m) of key point j in D_m is known since D_m has been manually annotated, and the transformed location (x'_j^m, y'_j^m) can be calculated using Q_m. Five templates w_D'm(x'_j^m, y'_j^m, θ) are constructed with θ = −10, −5, 0, 5, 10 degrees; Fig. 4c shows examples of these five templates. Next, we match these templates against a target patch w_I(x'_j^m, y'_j^m, 0) in the image I. One last step before matching is to normalize the pixel intensities within the patches by a linear transformation that maps the intensity range of each patch to between 0 and 255.
Matching is done by sliding the window ŵ_D'm(x'_j^m, y'_j^m, θ) with respect to ŵ_I(x'_j^m, y'_j^m, 0), where ŵ_D'm and ŵ_I are the normalized image patches. Figure 5 illustrates the matching process. The matching score is the Euclidean norm of the difference between the two normalized patches,

s(m, θ, c_x, c_y) = || ŵ_I(x'_j^m + c_x, y'_j^m + c_y, 0) − ŵ_D'm(x'_j^m, y'_j^m, θ) ||,

where (c_x, c_y) is the shift between the centre locations of the image patches, with −15 ≤ c_x, c_y ≤ 15, and ||·|| is the Euclidean norm. The predicted key point location corresponds to the best match (smallest score) over all shifts, orientations and template images D_m, and key point j's coordinate is the centre (x'_j^m + c_x, y'_j^m + c_y) of the best-matching window.
This process is repeated for the remaining key points of I. The details of the algorithm can be found in Algorithm 1 KEYPOINT-DETECTION in Additional file 1.
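The core of the matching step, normalized patches compared over a window of shifts, can be sketched as follows. The function names, the normalization, and the Euclidean-norm score are our reading of the text, not the published implementation, and template rotation is omitted for brevity.

```python
import numpy as np

def normalize_patch(patch):
    """Linearly rescale pixel intensities to span 0..255."""
    patch = patch.astype(float)
    lo, hi = patch.min(), patch.max()
    if hi == lo:
        return np.zeros_like(patch)
    return 255.0 * (patch - lo) / (hi - lo)

def match_template(image, template, cx0, cy0, max_shift=15):
    """Slide a normalized template over a search window centred at
    (cx0, cy0) and return (score, x, y) of the best match, where the
    score is the Euclidean norm of the normalized patch difference."""
    t = normalize_patch(template)
    h, w = t.shape
    best = (np.inf, cx0, cy0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            y0 = cy0 + dy - h // 2
            x0 = cx0 + dx - w // 2
            if y0 < 0 or x0 < 0:
                continue                    # window falls off the image
            win = image[y0:y0 + h, x0:x0 + w]
            if win.shape != t.shape:
                continue
            score = np.linalg.norm(normalize_patch(win) - t)
            if score < best[0]:
                best = (score, cx0 + dx, cy0 + dy)
    return best
```

In the full method this search would also run over the five template orientations and all data set images D_m, keeping the overall minimum.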

Arc length calculations
Tracing the arcs depicted in Fig. 1 is done using active contours, while the Euclidean distances between key points are calculated trivially. In the active contour method, an estimate of the desired curve is first generated using a template. The curve is then evolved using a variational principle via a gradient descent method. Two essential ingredients for applying the active contour method are the estimation of the initial curve and the construction of a suitable gradient image to drive the curve evolution.

Initialization of active contours
The initialization of active contour curves is critical for finding the fly wing vein accurately. If the initial curve lies far away from the vein, active contour might evolve this arc to another vein.
For example, setting the initial curves to straight lines works well for arcs fg, gh, jl and kl as these arcs are relatively straight. This is not the case for arc ml as there is another vein just below it. Active contour will tend to evolve this curve towards the vein below ml.
A good initialization of curves can be obtained by mapping veins of a template fly wing image onto the input image I. This template wing should provide a good representation of wings across the different fly species. The mapping is done by aligning the template fly wing image with the input image; the alignment process has been described in the key point detection section. Figure 6 illustrates this concept. First, a fly wing image is chosen to be the template wing image (Fig. 6a). We have selected our template wing such that key point k is equidistant from key points j and l; mapped arcs will be badly warped if it is not. Different arcs are drawn in different colours in Fig. 6 for clarity. Arc ml is drawn outside of the template wing for two reasons. The first is that the arc ml of our chosen template wing is quite flat and does not provide a suitable representation of other fly wings' ml arcs, which have more curvature. Secondly, there is another vein that lies close to, and just below, arc ml; Active Contour might evolve the mapped arc to the wrong vein if the mapped arc happens to lie closer to that vein than to arc ml. To prevent this, the arc ml is drawn outside the template wing image so that the mapped arc also lies outside the fly wing. Some results of this mapping are shown in Fig. 6b-d. Notice how the mapped arc ml of Fig. 6c lies outside the wing, rather than in between the two veins.

Figure 7 shows how the input image is processed to produce a gradient for active contour optimization. The main idea is to make the veins' pixels very bright and surrounded by pixels that become darker the further they are from the veins. A step-by-step work flow of how the preprocessed image is obtained is shown in Fig. 7a. The original image (Fig. 7b) is blurred and inverted to obtain an image where the veins' pixels are very bright.
The result is shown in Fig. 7c. A second image is needed to help produce a smooth downwards gradient away from the vein. Edges (Fig. 7d) are found on the blurred image before applying a threshold to obtain a binary image (Fig. 7e). The binary image is dilated to make the fly wing veins thicker before inverting the image (Fig. 7f). A distance transform (Fig. 7g) is applied before inverting the image one last time (Fig. 7h). The two images ( Fig. 7c and h) are then added pixel-wise to produce the preprocessed image P (Fig. 7i). This process is integrated in the software and is done automatically before performing Active Contour.
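The preprocessing pipeline (blur, invert, edge detection, threshold, dilation, distance transform, pixel-wise sum) can be sketched with scipy.ndimage. The thresholds, kernel sizes and the gradient-magnitude edge detector below are illustrative choices, not the published parameters.

```python
import numpy as np
from scipy import ndimage

def preprocess_for_active_contour(gray):
    """Build a gradient image P for the active contour: bright vein
    pixels surrounded by a smooth decay away from the veins."""
    gray = gray.astype(float)
    blurred = ndimage.gaussian_filter(gray, sigma=2)
    inverted = blurred.max() - blurred            # dark veins become bright
    # Edge strength of the blurred image, thresholded to a binary image.
    edges = ndimage.gaussian_gradient_magnitude(blurred, sigma=1)
    binary = edges > edges.mean()
    thick = ndimage.binary_dilation(binary, iterations=2)  # thicken veins
    # Distance from each background pixel to the nearest (thickened) vein,
    # inverted so that pixels near veins are bright.
    dist = ndimage.distance_transform_edt(~thick)
    decay = dist.max() - dist
    return inverted + decay                       # pixel-wise sum
```

On a synthetic image with a single dark "vein" line, the resulting P is brightest on the line and falls off smoothly with distance, which is what the gradient descent in the next section relies on.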

Active contour formalism
Given an image P, the arc length between two key points can be found by finding the path that follows high-intensity pixels between the two points. A curve is represented by connecting short straight line segments between nodes (x_0, y_0), ..., (x_n, y_n), where the endpoints p_a = (x_0, y_0) and p_b = (x_n, y_n) are the two key points. An objective function f(x_0, y_0, ..., x_n, y_n; P) for the active contour is then defined such that f is small along paths of high image intensity. To make the active contour invariant with respect to fluctuations of the overall image intensity, we make f invariant to image intensity multiplication, i.e. f(x_0, ..., y_n, P) = f(x_0, ..., y_n, aP). Hence f can be defined as

f(x_0, ..., y_n; P) = − Σ_{i=0}^{n} P(x_i, y_i) / (P̄ + ε),

where P̄ normalizes the image intensity such that f becomes invariant under intensity multiplication, and 0 < ε ≪ 1 is a small regularizer to prevent numerical overflow. The desired curve is found by minimizing the objective function via gradient descent on the interior nodes,

(x_i, y_i) ← (x_i, y_i) − η ∇_{(x_i, y_i)} f,  i = 1, ..., n−1.  (8)

Finally, the arc length is calculated as

L = Σ_{i=1}^{n} sqrt((x_i − x_{i−1})² + (y_i − y_{i−1})²).

The details of the algorithm can be found in Algorithm 2 ARC-LENGTH in Additional file 1.
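A minimal numerical sketch of the gradient-descent evolution follows. It moves interior nodes uphill on P while keeping the two key point endpoints fixed; the step size, iteration count and nearest-pixel gradient sampling are illustrative simplifications, not the published algorithm.

```python
import numpy as np

def evolve_contour(P, nodes, steps=200, eta=0.5, eps=1e-6):
    """Evolve an open polyline toward bright pixels of P.
    nodes: (n, 2) array of (x, y) positions; endpoints stay fixed."""
    P = P.astype(float) / (P.mean() + eps)   # intensity-scale invariance
    gy, gx = np.gradient(P)                  # ascent directions of P
    nodes = nodes.astype(float).copy()
    for _ in range(steps):
        xi = np.clip(nodes[1:-1, 0].round().astype(int), 0, P.shape[1] - 1)
        yi = np.clip(nodes[1:-1, 1].round().astype(int), 0, P.shape[0] - 1)
        # Move interior nodes uphill on P (downhill on f = -sum P).
        nodes[1:-1, 0] += eta * gx[yi, xi]
        nodes[1:-1, 1] += eta * gy[yi, xi]
    return nodes

def arc_length(nodes):
    """Sum of straight segment lengths between consecutive nodes."""
    return float(np.sqrt((np.diff(nodes, axis=0) ** 2).sum(axis=1)).sum())
```

On an image whose brightness peaks along one row, a straight initial polyline placed a few pixels away drifts onto the bright row, after which arc_length gives the measurement.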

Species identification from fly wing images
The Drosophila species can be identified from the wing image using the 13 key points (Fig. 1). The centroid of the wing was estimated as the centroid of the octagon whose vertices were the key points 'a', 'h', 'i', 'j', 'k', 'l', 'm' and 'b'. The centroid of the octagon was computed as the area-weighted average of the centroids of the triangles into which the octagon was split. Then, the Euclidean distances from the wing centroid to the 13 key points were measured and normalized by dividing by the distance between 'a' and 'j' (the wing length).
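The area-weighted centroid of a polygon given its vertices can be computed from a triangle-fan decomposition, as sketched below. The function name and the fan rooted at the first vertex are our illustrative choices.

```python
import numpy as np

def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon: average of the
    centroids of a triangle fan rooted at the first vertex,
    weighted by the signed triangle areas."""
    v = np.asarray(vertices, dtype=float)
    p0 = v[0]
    total_area = 0.0
    centroid = np.zeros(2)
    for a, b in zip(v[1:-1], v[2:]):
        # Signed area of triangle (p0, a, b) via the 2D cross product.
        area = 0.5 * ((a[0] - p0[0]) * (b[1] - p0[1])
                      - (a[1] - p0[1]) * (b[0] - p0[0]))
        total_area += area
        centroid += area * (p0 + a + b) / 3.0
    return centroid / total_area
```

Feeding in the eight key point coordinates in order would give the wing centroid; the signed areas make the result correct even for non-convex vertex orderings.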
In addition to the normalized distances from the 13 key points to the centroid, we measured three traditional wing indices, the costal index, C3 fringe and 5x index [29][30][31], to identify the species from a wing image. The costal index is defined as the ratio of the length of the vein l-m to that of the vein j-l. The C3 fringe is the ratio of the length of the vein k-l to that of the vein j-l. The 5x index is the ratio of the length of the vein g-h to that of the vein f-g. Because the arc lengths of wing veins were required to compute these indices, we fitted the veins with Bézier paths. As a result, for each wing we obtained 16 variables to construct the training dataset, and measured their values for 589 flies from 15 species, excluding Microdrosophila sp., which was used as a negative control as explained later.
With a training dataset of this size, however, using all 16 variables did not give the best species identification performance in our preliminary test, in which the data from 588 flies were used to construct the training dataset and the data from the remaining fly were used as test data. We therefore examined all possible combinations of between 2 and 16 of these variables (65,519 patterns in total) and found that the best result was obtained with six variables: the distances from the key points e, f, g and m to the centroid, the costal index and the C3 fringe. Accordingly, we used only these six variables for the subsequent discriminant analysis.
The mean vector and the variance-covariance matrix of the six variables were computed for each of the 15 species, separately for males and females, excluding Drosophila hydei females and Scaptodrosophila coracina females due to small sample sizes; we thus obtained statistics for 28 groups in total. Then, the squared Mahalanobis distance, D², [32] from each wing to each group was computed. Following De Maesschalck et al., the squared distance of the j-th wing from the i-th group is

D²_ij = (x_j − μ̂_i)ᵀ Ŝ_i⁻¹ (x_j − μ̂_i),

where x_j denotes the feature vector of the j-th wing, and μ̂_i and Ŝ_i denote the mean vector and the variance-covariance matrix of the i-th group; computations were performed with the aid of the R package [34]. For each wing image from an unknown sample, the D² values were computed with μ̂_i and Ŝ_i for the 28 groups, and the group giving the smallest D² value was identified as the species of the sample. The statistical significance of the goodness of fit to the inferred group was assessed by a chi-square test with n degrees of freedom. When the smallest D² value was larger than the expected value for the best-fit species at the 0.1% level, the sample's species was identified as 'unknown'.
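The paper performed these computations in R; the sketch below is a Python analogue under illustrative names, with the 0.1% chi-square cutoff implemented via scipy.stats.chi2 as our reading of the 'unknown' criterion.

```python
import numpy as np
from scipy.stats import chi2

def classify_wing(x, group_means, group_covs, alpha=0.001):
    """Assign feature vector x to the group with the smallest squared
    Mahalanobis distance D2; report 'unknown' when even the best D2
    exceeds the chi-square critical value at significance alpha."""
    x = np.asarray(x, dtype=float)
    d2 = {}
    for name, mu in group_means.items():
        diff = x - mu
        d2[name] = float(diff @ np.linalg.inv(group_covs[name]) @ diff)
    best = min(d2, key=d2.get)
    # Degrees of freedom = number of variables (six in the paper).
    threshold = chi2.ppf(1 - alpha, df=len(x))
    if d2[best] > threshold:
        return 'unknown', d2[best]
    return best, d2[best]
```

A sample far from every group mean, such as a Microdrosophila wing against the 28 trained groups, exceeds the threshold and is returned as 'unknown'.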
To examine the efficiency of species identification by the discriminant analysis, we used 370 additional wing images not included in the training data, drawn from 15 species including Microdrosophila sp., which was absent from the training data and was thus expected to be identified as an 'unknown' species if the method worked correctly.

Evaluation metric of key point detection
The key points of each fly wing in the data set are found using the methods described above. To test the accuracy of the key point detection algorithm, 15 fly wing images, one from each species, were manually annotated by 10 users. This gives us 10 (X, Y) coordinates for each key point in each of the 15 fly wing images. From these manual annotations, a covariance matrix representing the spread of the 10 annotated points can be obtained for each key point of each image. These covariance matrices are then used for benchmarking the accuracy of the automated methods.
A total of 15 covariance matrices, one per annotated image, are thus obtained for a single key point. These 15 covariance matrices are averaged to obtain the final covariance matrix for that key point, and this process is repeated for all key points. Table 2 shows the covariance matrix obtained for each key point. Note that the covariance matrices of key points 'a', 'b', 'c', 'k' and 'm' contain larger values. This indicates that the manual annotations of these key points have a large spread: it is not easy, even for humans, to locate these key points consistently.

Key point detection results
Some results of key point detection are shown in Fig. 8. To determine whether a predicted key point is accurate, we check whether the Euclidean distance (in pixels) between the predicted key point and the ground truth is less than a threshold of two standard deviations, derived from the corresponding key point's covariance matrix. The results of this experiment are summarized in Table 3, together with the mean deviation in pixels for each point. Accuracy for key points 'a', 'h' and 'k' is not reported, as these points are marked by the user. The mean pixel deviation is slightly higher for key points 'b', 'c', 'l' and 'm', which lie on slightly thicker fly wing veins than the other key points. A thicker vein increases the candidate area for a key point, which explains the larger pixel deviations.
Data sets of different sizes were used to predict key points. For the data set of size 100, 100 images had key points manually annotated and were used for template matching, while the remaining 500 images were used to test the accuracy of the algorithm; cross-validation was also done for this data set. Similarly, for the data set of size 599, 599 images had key points manually annotated and the remaining image was used for testing. We expect the data set of size 599 to yield better results. The accuracy results are shown in Table 3. We also downsized our images further and repeated the above experiments; the results for the downsized dataset can be found in Table 4.
We also compared our method against an existing fly wing fitting software, WINGMACHINE [22]. WINGMACHINE first detects curves in a fly wing image that fit an a priori template before fitting spline curves to the image. The locations of two landmarks, the humeral break and the alula notch, need to be known before curve fitting can take place. We ran the Wings software on ten images in our dataset; the results are shown in Fig. 8g-i and Table 5. WINGMACHINE tries to find the fly wing veins using edge detection methods, and results were poor in cases where there were background artefacts. Moreover, the edge detection algorithm sometimes fails to pick up veins that are too faint. This can lead to wrong predictions of the joint locations, which result in a bad fit for the spline curves.

Arc length results
Some examples of good results of Active Contour are shown in Fig. 8. Note that even though there are dark patches (Fig. 8d), background artefacts (Fig. 8e) or spots on the wing (Fig. 8f), Active Contour is still able to find the fly wing vein accurately most of the time.
We found a total of 4 bad results out of the 600 images in our dataset. Users are allowed to edit bad arc length results in the fly wing software. Some examples of bad results of Active Contour are also displayed in Fig. 8. As can be seen in Fig. 8j and k, air bubbles are present in the background, while dust is present in Fig. 8l. Observe that only the location of arc lm is incorrect and that the artefacts in these images lie above arc lm. As mentioned previously, arc lm is initialized above the fly wing. These background artefacts produce large peaks in the active contour gradient image, causing Active Contour to get stuck in the local maxima of the artefact. The other arcs do not suffer from this problem, as they are initialized onto the foreground, which has few artefacts. Thus, only artefacts that lie just above the fly wing affect the accuracy of Active Contour.
The accuracy of our Active Contour method was also tested. Ground-truth arcs for fifteen different fly wings, one per species, were annotated manually and compared with the arcs found using Active Contour. The error is calculated as the area bounded by the annotated arc and the predicted arc, divided by the length of the annotated arc. The results are tabulated in Table 6; the error tends to be larger for longer arcs. This is expected, as a longer arc spans a larger area of the image, which increases the amount of noise and background artefacts that might affect the Active Contour algorithm.
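The error metric above can be sketched with the shoelace formula, closing a polygon by walking the annotated arc forward and the predicted arc backward. Treating the two arcs as sharing their endpoints is our assumption about how the bounded area is formed.

```python
import numpy as np

def arc_error(annotated, predicted):
    """Area enclosed between the annotated and predicted arcs
    (polylines sharing endpoints), divided by the annotated length."""
    annotated = np.asarray(annotated, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Close the loop: annotated forward, predicted backward.
    loop = np.vstack([annotated, predicted[::-1]])
    x, y = loop[:, 0], loop[:, 1]
    # Shoelace formula for the (absolute) enclosed area.
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    length = np.sqrt((np.diff(annotated, axis=0) ** 2).sum(axis=1)).sum()
    return area / length
```

Dividing by the annotated length expresses the error as an average perpendicular deviation in pixels, which makes arcs of different lengths comparable.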

Species identification results
The efficiency of the wing image analysis for species identification is shown in Table 7. Out of 370 flies examined, 346 were identified as the correct species (a success rate of 94%), whereas the capability of identifying the correct sex was poor (a success rate of 56%). This suggests that wing morphology is not significantly different between females and males, whereas the between-species difference is significantly larger than the within-species difference among all of the 15 species examined. In addition, it is noteworthy that the present system perfectly recognized Microdrosophila sp., which was not included in the training data, as an 'unknown' species. This suggests that the present system has the discriminating power to give an appropriate answer even when the species of a sample is not included in the training data. Therefore, the present system is expected to be useful for analyzing wing morphology for species identification among species belonging to Drosophila and related genera, even after data for more species are added.

(Fig. 8 caption) Panels show arc lengths found using our method; the colours red, blue, orange, green and magenta represent the arc segments fg, gh, jl, kl and lm respectively. g-i show key points and arc lengths predicted for the same images using the WINGMACHINE software; red arrows indicate the errors for clarity. The errors in g and i are due to incorrect prediction of joint locations, while the error in h is due to a bad fit of an a priori template wing. Lastly, j-l show three examples of bad arc length estimation by Active Contour due to background artefacts such as air bubbles and dust particles; the red arrows show the artefacts that affect the Active Contour algorithm. There were a total of four images with bad arc length estimation out of our dataset of 600 images.

Discussion
Our software successfully predicts landmarks on images from up to 15 different fly wing species, and incorrectly detected landmarks can easily be corrected through the software's user interface. In addition, our software accurately calculates and measures the specific vein segments required for fly species identification; incorrect vein segments can likewise be corrected through the user interface. In contrast to the WINGMACHINE software, which uses high-level features to optimize the fit of an a priori model of a fly wing, our software uses low-level features such as edges and pixel values to find landmarks and vein segments. As noted by the authors of WINGMACHINE, detection algorithms built on low-level features tend to encounter problems such as varying vein thickness, uneven lighting and background artefacts, which cause inaccuracies during prediction. However, our software manages to overcome the majority of these difficulties and performs well on the images in our dataset.
One advantage of WINGMACHINE over our software is that only two landmarks are needed for detection instead of three. However, it should be noted that the WINGMACHINE software may require pretreatment of image data, which may require more manpower than our software does. The advantage of MorphoJ is that phenotypic data can be easily read and analysed within the software; however, MorphoJ cannot detect and extract this information from images automatically.
There are a few drawbacks to our algorithm. The first is the need for a large dataset: as demonstrated in the key point detection results section, a large dataset containing wing images of numerous fly species helps increase the accuracy of our algorithm, but it also results in a slow detection process, as each image in the dataset has to be loaded. To reduce the time taken by the detection process, image patches centered on the fly wing's key points are cropped from the original fly wing image and saved as a separate image; this smaller image is loaded in place of the full image during detection.

Conclusion
In this paper, we have demonstrated how to locate Drosophila wing key points and obtain arc lengths of the wing veins using image processing methods. Our results show that our method is accurate, and a comparison with an existing automated image analysis system showed that our system performs better.

(Table 3 caption) The results of using datasets of size 599 and 100 are shown; as shown in the table, using a larger dataset yields a more accurate result. Key points a, h and k are excluded, as these key points are annotated by the users and are thus not predicted.

(Table 7 caption) Efficiency of species identification using the wing image analysis.