MetaSel: a metaphase selection tool using a Gaussian-based classification technique
BMC Bioinformatics volume 14, Article number: S13 (2013)
Identification of good metaphase spreads is an important step in chromosome analysis for identifying individuals with genetic disorders. The process of finding suitable metaphase chromosomes for accurate clinical analysis is, however, very time consuming since they are selected manually. The selection of suitable metaphase chromosome spreads thus represents a major bottleneck for conventional cytogenetic analysis. Although many algorithms have been developed for karyotyping, none have adequately addressed the critical bottleneck of selecting suitable chromosome spreads. In this paper, we present a software tool that uses a simple rule-based system to efficiently identify metaphase spreads suitable for karyotyping.
The chromosome shapes can be classified by the software into four main classes. The first and the second classes refer to individual chromosomes with straight and skewed shapes, respectively. The third class is characterized as those chromosomes with overlapping bodies and the fourth class is for the non-chromosome objects. Good metaphase spreads should largely contain chromosomes of the first and the second classes, while the third class should be kept minimal. Several image parameters were examined and used for creating rule-based classification. The threshold value for each parameter is determined using a statistical model. We observed that the Gaussian model can represent the empirical probability density function of the parameters and, hence, the threshold value can be easily determined. The proposed rules can efficiently and accurately classify the individual chromosome with > 90% accuracy.
The software tool, termed MetaSel, was developed. Using the Gaussian-based rules, the tool can be used to quickly rank hundreds of chromosome spread images so as to assist cytogeneticists to perform karyotyping effectively. Furthermore, MetaSel offers an intuitive, yet comprehensive, workflow to assist karyotyping, including tools for editing chromosome (split, merge and fix) and a karyotyping editor (moving, rotating, and pairing homologous chromosomes). The program can be freely downloaded from "http://www4a.biotec.or.th/GI/tools/metasel".
In cytogenetic studies, abnormalities in chromosome structure are examined by microscopy. Each human cell normally has 23 pairs of chromosomes, consisting of 22 pairs of autosomes and one pair of sex chromosomes [1, 2]. Cytogenetic abnormalities are manifested as extra or fewer chromosomes than normal, e.g., having three copies of chromosome 21 in Down's syndrome, one of the most common abnormalities. Cytogenetic testing for abnormalities requires high-quality metaphase chromosome images, which are selected and sorted as shown in Figure 1.
In order to obtain enough analyzable metaphase spread images, at least 8 to 10 glass slide specimens have to be prepared for each individual. Each glass slide typically contains about 10-20 metaphase spreads. From the total of approximately 200 prepared metaphases, approximately 20 of the "best" metaphase spreads (based on the subjective opinion of an experienced cytogeneticist) are selected for karyotyping.
The consistency of chromosome numbers, i.e. total chromosome complement of each cell, is commonly determined by visual inspection among these top twenty metaphase spreads. Once the chromosome complement is verified, generally two to five of the "sharpest" images are chosen for chromosome banding analysis for detecting chromosome band abnormalities. Each step in this process is time consuming and requires experienced cytogeneticists to operate. Thus, considerable effort has been made to develop automated chromosome image analysis tools to expedite this procedure.
Each metaphase spread contains not only chromosome images but also some cell preparation artifacts [1–5]. These non-chromosome residues can be eliminated by visual inspection. However, in order to obtain an accurate karyotyping result, the metaphase spread must contain a large number of analyzable chromosomes, i.e., with clear banding patterns not obscured by overlapping chromosomes. Previous research efforts have mainly focused on segmentation of overlapping chromosomes [1, 6, 7]. However, when overlapping chromosome images are segmented, the regions of chromosome overlap are ambiguous, which could potentially lead to an inaccurate diagnosis. Therefore, getting clean metaphase spreads with well-separated individual chromosomes is preferable.
Other earlier studies on chromosome analysis have concentrated on automatic karyotyping, which attempts to order and classify the chromosomes into 22 pairs of autosomes and the two sex chromosomes. Automatic karyotyping requires very informative features, such as band profiles, centromere positions and chromosome dimensions, and it assumes that the input contains analyzable metaphases. Numerous algorithms have been proposed to facilitate automatic karyotyping [4–7]. A recent technique proposed by Moallem et al. used dark paths between chromosomes for classifying touching and overlapping chromosomes from good metaphase images. Khan et al. presented a technique to geometrically correct deformed chromosomes so that the chromosomes can be karyotyped correctly. Jahani et al. focused on classification by identifying chromosome centromeres and their corresponding length.
To perform automatic karyotyping, hundreds of images must be manually examined in order to select spreads comprising mostly metaphase chromosomes for further analysis. The goal is thus to select the best metaphase spreads with clearly separated individual chromosomes for karyotyping. The selection of good metaphase spreads is very time consuming, perhaps requiring hours of expert inspection of hundreds of specimens. Thus, the cytogeneticist will normally select approximately 20 of the first good metaphase spreads that he/she has encountered, instead of examining all metaphase spreads from all specimen slides. This arbitrary approach may exclude better metaphase spreads, and so lead to sub-optimal results. There is thus a need for a more thorough and efficient method of selecting good metaphase spreads for karyotyping. Although some techniques have been proposed for automatic metaphase selection, their high computational complexity makes them impractical for processing the hundreds of images in a typical cytogenetic analysis [1, 2, 3, 5, 13–15].
To our knowledge, only two works have addressed the problem of improving the efficiency of automated metaphase selection. The first study concentrated on rapid identification of metaphases, but did not assess metaphase quality, i.e., the selection of analyzable versus non-analyzable metaphases. The second approach utilizes skeletal analysis of chromosome images in order to estimate the number of analyzable chromosomes; hence, it can select a few good metaphase spreads in terms of quality. However, processing each image can take up to 5 minutes, which is still not practical when dealing with a large number (>100) of images.
To address the aforementioned problems, this work presents a rapid, practical chromosome classification tool for identification of good metaphase spreads based on rule-based classification. The software, called MetaSel, is the first attempt to offer a free assistive karyotyping tool for chromosome analysis. The software employs a heuristic that first defines important image parameters for chromosome feature extraction and then constructs rules for chromosome classification.
Materials and methods
Specimens for cytogenetic testing were obtained by a standard clinical procedure at the Rajanukul Institute, Ministry of Public Health, Bangkok. In brief, cells from amniocentesis samples from pregnant women were applied to glass slides and stained with Giemsa. Chromosome images were obtained by microscopy using a Zeiss Axioskop2. A metaphase spread contains some individual chromosomes as well as other chromosomes that may not be well spread out, i.e., overlapping or touching. We grouped the objects found in metaphase spreads into four classes (Figure 2). The first three classes are actual chromosomes, whereas Class-4 covers residues or artifacts, e.g., cell debris. Chromosomes in Class-1 and Class-2 must be individually separable, and the two classes are distinguished by straightness: Class-1 is defined as a straight individual chromosome, while Class-2 is defined as a skewed or bent individual chromosome. Class-3 comprises the remaining, non-individual chromosomes that overlap or touch other chromosomes in the vicinity.
First, an image is enhanced by histogram equalization, as described in [10, 11], to adjust the gray levels in the image. Then, the real chromosome image is separated from its background. This process is called image segmentation in image processing. To perform the segmentation, we adopted Otsu's automatic thresholding technique to isolate the chromosome image from the background.
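The thresholding step can be illustrated with a minimal pure-Python sketch of Otsu's method, which picks the gray level that maximizes the between-class variance of the histogram. The function name `otsu_threshold` and the flat list-of-pixels input are illustrative assumptions, not part of MetaSel, which the article says was built with OpenCV.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat list of integer gray levels in [0, levels).
    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low = 0      # number of pixels with gray level <= t
    sum_low = 0    # sum of gray levels of those pixels
    for t in range(levels):
        w_low += hist[t]
        if w_low == 0:
            continue            # no pixels in the lower class yet
        w_high = total - w_low
        if w_high == 0:
            break               # every pixel is already in the lower class
        sum_low += t * hist[t]
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Assumption: stained chromosomes are darker than the background,
# so object pixels are those at or below the threshold.
pixels = [10, 12, 11, 200, 205, 210]
t = otsu_threshold(pixels)
object_mask = [p <= t for p in pixels]
```

In practice a library routine would normally be used instead of hand-written code; the sketch only shows what the threshold optimizes.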
We performed image segmentation and rotated the resulting objects into their vertical orientation in order to classify segmented objects from metaphase spreads. The image parameters, namely width, height, and estimated area ratio, are extracted from the rotated images. The width and height of each chromosome segment are the important factors used to quickly sort the chromosomal objects into the four classes. In particular, the area ratio can be defined as:

$$\text{Area ratio} = \frac{A_o}{A_r} \times 100\%$$

where $A_r$ is the number of pixels inside the smallest enclosing rectangle ($W_{\text{rect}} \times H_{\text{rect}}$) of the segmented object and $A_o$ is the number of pixels of the segmented object. Figure 3 shows the image parameters for chromosome image classification, where $W_{\text{rect}}$ and $H_{\text{rect}}$ are the width and the height, in pixels, of the minimum rectangle enclosing the segmented object.
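As a minimal sketch of this parameter, the function below computes the area ratio of one segmented object. It assumes the object is supplied as a binary mask (rows of 0/1 values) that has already been rotated upright, so that the smallest enclosing rectangle is the axis-aligned bounding box, and that the ratio is expressed in percent. The name `area_ratio` is illustrative.

```python
def area_ratio(mask):
    """Percentage of bounding-box pixels that belong to the object.

    `mask` is a list of equal-length rows containing 0 (background)
    or 1 (object). The bounding box is axis-aligned.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no object pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]

    h_rect = rows[-1] - rows[0] + 1      # bounding-box height in pixels
    w_rect = cols[-1] - cols[0] + 1      # bounding-box width in pixels
    a_o = sum(sum(row) for row in mask)  # object pixel count
    a_r = w_rect * h_rect                # rectangle pixel count
    return 100.0 * a_o / a_r


straight = [[0, 1, 1, 0],
            [0, 1, 1, 0],
            [0, 1, 1, 0]]
diagonal = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]
print(area_ratio(straight))  # object fills its bounding box
print(area_ratio(diagonal))  # object covers one third of its bounding box
```

A straight upright object fills most of its bounding box, while a skewed or bent one leaves large empty corners, which is why the ratio separates the two.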
The area ratio quantifies the number of actual object pixels relative to the number of pixels inside the rectangle bounding the object. This ratio can be used to assess the straightness of a chromosome. We verified this by statistical analysis of the area ratios of randomly chosen chromosomes: 822 straight and 1012 touching/overlapping (including skewed objects). The empirical probability density function was estimated using the kernel density method (Figure 4). A Gaussian model was used to determine the threshold value of the area ratio for classification. When the area ratio is greater than 67.84%, the chromosome can be classified as Class-1 (straight objects). However, this class may contain some non-chromosome residues that need to be excluded.
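One plausible way to obtain such a threshold is sketched below. It assumes that a Gaussian is fitted to each of the two sample groups by its sample mean and standard deviation, and that the threshold is placed where the two unweighted densities cross, which is how the text later describes the threshold for the maximum width ratio. Whether the same procedure was used for every parameter is an assumption, and the function names are illustrative.

```python
import math
from statistics import mean, stdev


def gaussian_intersection(m1, s1, m2, s2):
    """Return the point between m1 and m2 where N(m1, s1) and N(m2, s2) cross.

    Equating the two log-densities gives a*x**2 + b*x + c = 0.
    """
    a = 1.0 / (2 * s1 ** 2) - 1.0 / (2 * s2 ** 2)
    b = m2 / s2 ** 2 - m1 / s1 ** 2
    c = m1 ** 2 / (2 * s1 ** 2) - m2 ** 2 / (2 * s2 ** 2) + math.log(s1 / s2)

    if abs(a) < 1e-12:
        if abs(b) < 1e-12:
            raise ValueError("the two Gaussians are identical")
        roots = [-c / b]        # equal spreads: a single crossing point
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            raise ValueError("the densities do not cross")
        sq = math.sqrt(disc)
        roots = [(-b - sq) / (2 * a), (-b + sq) / (2 * a)]

    lo, hi = sorted((m1, m2))
    between = [x for x in roots if lo <= x <= hi]
    if not between:
        raise ValueError("no crossing point lies between the two means")
    return between[0]


def threshold_from_samples(group_a, group_b):
    """Fit a Gaussian to each group and return their crossing point."""
    return gaussian_intersection(mean(group_a), stdev(group_a),
                                 mean(group_b), stdev(group_b))


# Toy data only; these are not the measurements reported in the article.
print(threshold_from_samples([1, 2, 3], [5, 6, 7]))
```

With equal spreads the crossing point is simply the midpoint of the two means; with unequal spreads it shifts toward the narrower distribution.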
Since the width of Class-1 chromosomes should be consistent, objects whose width deviates markedly from the average width are regarded as residual objects. To detect them, we first determine the total average width of all objects with an area ratio > 67.84%. If the width of an object is greater than 1.5 times the total average width, the object is discarded. Let $O_w$ represent the set of objects whose width is less than 1.5 times the total average width. The chromosome width of each object ($W$) in the set $O_w$ can be defined as:
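Then, the average width is defined as:

$$W_{\text{avg}} = \frac{1}{|O_w|} \sum_{i \in O_w} W_i$$

To quantify the deviation from the average width, we define the rectangle width ratio as:

$$W_{\text{rect ratio}} = \frac{W_{\text{rect}}}{W_{\text{avg}}}$$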
Deviation of $W_{\text{rect ratio}}$ from unity indicates a difference in chromosome straightness. Thus, the threshold values of the rectangle width ratio for Class-1 are determined from the probability distribution of $W_{\text{rect ratio}}$. The experimental study of this ratio used 222 small residual objects, 327 large residual objects and 500 straight individual chromosomes. The empirical and Gaussian probability density functions of $W_{\text{rect ratio}}$ are depicted in Figure 5. When $0.9897 \le W_{\text{rect ratio}} \le 1.5597$, the object is classified as a straight individual chromosome (Class-1). When $W_{\text{rect ratio}} < 0.9897$, the object is classified as a small non-chromosome residue (Class-4). Likewise, the object is classified as Class-4 when $W_{\text{rect ratio}} > 1.5597$, i.e., when it is a large object.
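The width check for Class-1 candidates can be sketched as follows. The sketch assumes each candidate (an object with area ratio above 67.84%) is given as a pair `(w, w_rect)`, where `w` is the chromosome width W as defined by the article and `w_rect` is the bounding-rectangle width, and that the rectangle width ratio is `w_rect` divided by the average width. It further assumes that objects wider than 1.5 times the total average are only left out of the average and are still labeled by the ratio rule. The function name is illustrative.

```python
W_RATIO_LOW = 0.9897    # lower bound of the rectangle width ratio for Class-1
W_RATIO_HIGH = 1.5597   # upper bound of the rectangle width ratio for Class-1


def classify_straight_candidates(objects):
    """Label each (w, w_rect) pair as 'Class-1' or 'Class-4'.

    `objects` holds only candidates whose area ratio exceeds 67.84%.
    """
    if not objects:
        raise ValueError("no candidate objects given")

    # Total average width over all candidates.
    total_avg = sum(w for w, _ in objects) / len(objects)

    # The set O_w: widths below 1.5 times the total average.
    kept = [w for w, _ in objects if w < 1.5 * total_avg]
    w_avg = sum(kept) / len(kept)

    labels = []
    for _, w_rect in objects:
        ratio = w_rect / w_avg
        if W_RATIO_LOW <= ratio <= W_RATIO_HIGH:
            labels.append("Class-1")
        else:
            labels.append("Class-4")   # too narrow or too wide: residue
    return labels


# Toy widths in pixels, chosen only to exercise each branch.
print(classify_straight_candidates([(10, 11), (10, 12), (10, 11), (40, 45)]))
```

Excluding very wide objects before averaging keeps a few large blobs from inflating the reference width against which every other object is compared.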
When the area ratio is less than 67.84%, the corresponding object can be either a skewed individual chromosome or a touching/overlapping chromosome. To distinguish between skewed objects and non-chromosome residues, the height of the segmented object is defined as:
The ratio between $H_i$ and $H_{\text{rect}}$, the height ratio ($H_{i\,\text{ratio}}$), is computed by:

$$H_{i\,\text{ratio}} = \frac{H_i}{H_{\text{rect}}}$$
We observed 600 skewed objects and overlapping chromosomes as well as 70 non-chromosome residues. Statistical analysis was performed to determine the threshold value of the height ratio for screening out unwanted residual objects. Figure 6 presents the empirical probability density function of the height ratio, which can be approximated by the Gaussian model. Using this model, objects are classified as "residual" when $H_{i\,\text{ratio}} < 0.7507$. When $H_{i\,\text{ratio}} \ge 0.7507$, the objects form a mixture of skewed objects and touching/overlapping chromosomes.
To separate skewed objects from touching/overlapping chromosomes, one additional parameter must be used. It can be observed that the width of an overlapping chromosome is larger than the width of a skewed individual. This parameter, called the maximum width ratio ($W_{\text{max ratio}}$), can therefore be computed from the maximum object width in pixels ($W_{\text{max}}$) and the average width ($W_{\text{avg}}$):

$$W_{\text{max ratio}} = \frac{W_{\text{max}}}{W_{\text{avg}}}$$
The threshold separating skewed individual chromosomes from overlapping chromosomes was determined by statistical analysis. The empirical probability density functions of skewed individuals and overlapping chromosomes were determined using 593 and 393 samples, respectively. The Gaussian model was used to approximate the empirical model for threshold calculation. The threshold was chosen to be the intersection of the two Gaussian curves (2.3453), as shown in Figure 7. In other words, objects are classified as overlapping chromosomes when $W_{\text{max ratio}}$ is greater than this threshold, and as skewed individuals when $W_{\text{max ratio}}$ is less than or equal to it. Figure 8 summarizes the image parameters (see flowchart in panel A) and the proposed rule-based algorithm (see panel B) used to classify chromosome images.
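Taken together, the rules described in this section can be sketched as one decision function. The sketch assumes the four ratios have already been computed for an object and uses only the thresholds stated in the text. The function and constant names are illustrative, and the handling of an area ratio exactly equal to 67.84% is an assumption, since the text does not specify it.

```python
AREA_RATIO_MIN = 67.84                 # percent; above this, test for Class-1
W_RECT_RATIO_RANGE = (0.9897, 1.5597)  # Class-1 range of rectangle width ratio
HEIGHT_RATIO_MIN = 0.7507              # below this, the object is residue
W_MAX_RATIO_MAX = 2.3453               # above this, the object is overlapping


def classify_object(area_ratio, w_rect_ratio, height_ratio, w_max_ratio):
    """Return the class number (1 to 4) of one segmented object.

    1 = straight individual, 2 = skewed individual,
    3 = touching/overlapping, 4 = non-chromosome residue.
    """
    if area_ratio > AREA_RATIO_MIN:
        low, high = W_RECT_RATIO_RANGE
        return 1 if low <= w_rect_ratio <= high else 4

    # Assumption: an area ratio exactly at the threshold falls through here.
    if height_ratio < HEIGHT_RATIO_MIN:
        return 4
    return 3 if w_max_ratio > W_MAX_RATIO_MAX else 2


# Illustrative values only.
print(classify_object(80.0, 1.10, 0.90, 1.00))  # 1
print(classify_object(50.0, 1.10, 0.90, 2.00))  # 2
print(classify_object(50.0, 1.10, 0.90, 3.00))  # 3
print(classify_object(50.0, 1.10, 0.50, 1.00))  # 4
```

Because each object needs only a handful of comparisons once its ratios are known, the per-object cost of the rules themselves is negligible next to segmentation.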
Implementation of MetaSel
The proposed rule-based classification for metaphase selection was implemented in C# with the OpenCV library. This classification module was incorporated into our karyotyping software tool, called MetaSel, which was written from scratch in C# on the Microsoft Windows 7 operating system. Based on the decision rules presented in Figure 8, the workflow of this tool can be described as follows:
Open a project folder, which contains metaphase spread images (Figure 9).
Perform metaphase analysis using the proposed classification rules (Figure 10).
The metaphase images will be grouped into four classes and ranked according to their total number of individual chromosomes, which is calculated by combining the number of objects in Class-1 and Class-2 (Figure 11).
Users choose the metaphase spread image on which to perform karyotyping. A higher rank generally indicates a better-quality (more analyzable) spread. In case of a tie, users are strongly advised to choose the image that contains more objects in Class-3. If the tied images contain the same number of objects in Class-3, the number of objects in Class-4 (smaller is better) should be used to break the tie.
After the metaphase spread image is chosen, MetaSel lines up the individual chromosomes from Class-1 and Class-2 (Figure 12). Users can select good metaphase images for later karyotyping.
Users can go back to the original image to edit the ambiguous chromosome images (touching/overlapping objects) by cutting, merging, or fixing them (correcting the contour line of a chromosome image), so that they can be karyotyped as described in the previous step (Figure 13).
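The ranking step of this workflow can be sketched as a sort over per-image class counts. The sketch assumes each image is summarized by its four class counts and automates the tie-breaking advice exactly as the text states it (more Class-3 objects first, then fewer Class-4 objects). In MetaSel itself the tie is resolved by the user, and the function and file names here are hypothetical.

```python
def rank_spreads(spreads):
    """Return image names ordered from best to worst.

    `spreads` maps an image name to a tuple (n1, n2, n3, n4) holding the
    number of objects found in Class-1 to Class-4.
    """
    def sort_key(item):
        _, (n1, n2, n3, n4) = item
        # Primary: more individual chromosomes (Class-1 + Class-2).
        # Tie-breakers as stated in the text: more Class-3, then fewer Class-4.
        return (-(n1 + n2), -n3, n4)

    return [name for name, _ in sorted(spreads.items(), key=sort_key)]


# Hypothetical counts for four images.
counts = {
    "spread_a.png": (30, 10, 3, 5),
    "spread_b.png": (35, 8, 2, 1),
    "spread_c.png": (30, 10, 4, 9),
    "spread_d.png": (30, 10, 4, 2),
}
print(rank_spreads(counts))
```

Encoding the criteria as one tuple key keeps the ordering explicit and easy to change if a different tie-breaking policy is preferred.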
Two hundred metaphase spreads were used to determine the accuracy of the proposed rules. From 192 metaphase images, 7817 segmented objects were obtained. The processing time for the 192 metaphase images was 35.52 seconds; hence, the average processing time per image was approximately 0.185 seconds. The accuracy of the classification rules is shown in Table 1. Only 0.58% of Class-1 objects were misclassified, all into Class-4. This error is due to residual objects that happen to have a straight shape. The accuracy for skewed individuals (Class-2) was 90.67%; the remaining Class-2 objects were misclassified as overlapping chromosomes (Class-3) or residual objects (Class-4), because the bent shape of some skewed chromosomes resembles certain overlapping chromosome arrangements and some medium-sized residual objects. The accuracy for overlapping chromosomes (Class-3) was 89.44%. Some overlapping chromosomes were misclassified into Class-1, Class-2, and Class-4, since the random arrangement of an overlapping pattern may resemble those classes. The rules give high accuracy (93.25%) for non-chromosome objects (Class-4), with only a few percent of Class-4 objects misclassified.
This work presents a method for chromosome classification using key chromosomal image parameters. We found that the area ratio, the rectangle width ratio, the chromosome width ratio, the maximum width ratio and the height ratio can be used to efficiently classify chromosome objects into four classes. In our experiments, the classification accuracies for straight and skewed individual chromosomes were 99.42% and 90.67%, respectively. This study demonstrated that Class-1 and Class-2 chromosomal images can be used to efficiently and accurately determine the quality of metaphase images. In other words, these classes of chromosomes can be utilized to identify analyzable metaphase spreads. The processing time of chromosome classification is crucial for automated systems, since such systems need to process a large number of images in order to correctly diagnose a patient. Consequently, chromosome counting, e.g., for Down's syndrome screening, can greatly benefit from the proposed chromosome classification. In the future, we plan to integrate existing automatic karyotyping algorithms and other chromosome analysis modules, e.g., numerical and structural abnormality detection. The current metaphase selection module was implemented and used in the MetaSel program. Both the software (for Windows XP or 7 only) and the user manual can be freely downloaded from our website, http://www4a.biotec.or.th/GI/tools/metasel.
Availability of supporting data
The user manual of the software and some samples of chromosome images supporting the results of this article are available on our website, http://www4a.biotec.or.th/GI/tools/metasel
Wang X, Li S, Liu H, Wood M, Chen W, Zheng B: Automated identification of analyzable metaphase chromosomes depicted on microscopic digital images. Journal of biomedical informatics. 2008, 41 (2): 264-271. 10.1016/j.jbi.2007.06.008.
Castleman KR: The PSI Automatic MetaphaseFinder. Journal of Radiation Research. 1992, 33 (Suppl 1): 124-128.
Huber R, Kulka U, Lörch T, Braselmann H, Bauchinger M: Automated metaphase finding: an assessment of the efficiency of the METAFER2 system in a routine mutagenicity assay. Mutation Research/Environmental Mutagenesis and Related Subjects. 1995, 334 (1): 97-102. 10.1016/0165-1161(95)90011-X.
Popescu M, Gader P, Keller J, Klein C, Stanley J, Caldwell C: Automatic karyotyping of metaphase cells with overlapping chromosomes. Computers in biology and medicine. 1999, 29 (1): 61-82. 10.1016/S0010-4825(98)00040-7.
Wang X, Zheng B, Li S, Mulvihill J, Wood C, Liu H: Automated classification of metaphase chromosomes: optimization of an adaptive computerized scheme. Journal of biomedical informatics. 2009, 42 (1): 22-31. 10.1016/j.jbi.2008.05.004.
Vliet LJ, Young IT, Mayall BH: The Athena semi-automated karyotyping system. Cytometry. 1990, 11 (1): 51-58. 10.1002/cyto.990110107.
Ritter G, Gallegos MT, Gaggermeier K: Automatic context-sensitive karyotyping of human chromosome based on elliptically symmetric statistical distributions. Pattern Recognition. 1995, 28 (6): 823-831. 10.1016/0031-3203(94)00162-F.
Liu D, Yu J: Otsu method and K-means. Proceedings of the Ninth International Conference on Hybrid Intelligent Systems (HIS 2009). Edited by: Institute of Electrical and Electronics Engineers ( IEEE ). 2009, Shenyang, China, 344-349. 12-14 August 2009
Gajendran V, Rodríguez J: Chromosome counting via digital image analysis. Proceedings of the 2004 International Conference on Image Processing (ICIP 2004). Edited by: Institute of Electrical and Electronics Engineers ( IEEE ). 2004, Singapore, 2929-2932. 24-27 October 2004
Wayalun P, Chomphuwiset P, Laopracha N, Wanchanthuek P: Images Enhancement of G-band Chromosome Using histogram equalization, OTSU thresholding, morphological dilation and flood fill techniques. Proceedings of the 2012 8th International Conference on Computing and Networking Technology (ICCNT 2012). Edited by: Institute of Electrical and Electronics Engineers ( IEEE ). 2012, Gyeongju, South Korea, 163-168. 27-29 August 2012
Wenzhong Y: A Counting Algorithm for Overlapped Chromosomes. 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009). Edited by: Institute of Electrical and Electronics Engineers ( IEEE ). 2009, Beijing, China, 1-3. 11-13 June 2009
Kovács G, Kajtár B, Méhes G, Fazekas A: Fast detection of chromosome metaphases in digitalized microscopic slides. International Conference on Image Analysis and Signal Processing (IASP). Edited by: Institute of Electrical and Electronics Engineers ( IEEE ). 2009, Linhai, China, 444-448. 11-12 April 2009
Van den Berg HTCM, De France HF, Habbema JDF, Raatgever JW: Automated selection of metaphase cells by quality. Cytometry. 1981, 1 (6): 363-368. 10.1002/cyto.990010602.
Shippey G, Carothers AD, Gordon J: Operation and performance of an automatic metaphase finder based on the MRC fast interval processor. Journal of Histochemistry & Cytochemistry. 1986, 34 (10): 1245-1252. 10.1177/34.10.3755736.
Huber R, Kulka U, Lörch T, Braselmann H, Bauchinger M: Automated metaphase finding: an assessment of the efficiency of the METAFER2 system in a routine mutagenicity assay. Mutation Research/Environmental Mutagenesis and Related Subjects. 1995, 334 (1): 97-102. 10.1016/0165-1161(95)90035-7.
Gonzalez RC, Woods RE: Digital Image Processing. 2002, New Jersey: Prentice-Hall
Moallem P, Karimizadeh A, Yazdchi M: Using Shape Information and Dark Paths for Automatic Recognition of Touching and Overlapping Chromosomes in G-Band Images. International Journal of Image, Graphics and Signal Processing (IJIGSP). 2013, 5 (5): 22-28. 10.5815/ijigsp.2013.05.03.
Khan S, DSouza A, Sanches J, Ventura R: Geometric correction of deformed chromosomes for automatic Karyotyping. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2012). Edited by: Institute of Electrical and Electronics Engineers ( IEEE ). 2012, San Diego, California, USA, 4438-4441. 28 August - 1 September 2012
Jahani S, Setarehdan SK: Centromere and Length Detection in Artificially Straightened Highly Curved Human Chromosomes. International Journal of Biological Engineering. 2012, 2 (5): 56-61. 10.5923/j.ijbe.20120205.04.
The authors would like to thank the research team from the Center for Medical Genetics Research: Dr. Verayuth Praphanphoj, Ms. Sukanya Meesa, Ms. Nasikarn Maungkhom and Ms. Saranporn Satjabundarnjai for providing the metaphase images used in this study. This work was improved by feedback and critical comments from our research colleagues: Dr. Chanin Limwongse and his cytogenetic team from Siriraj Hospital, and Dr. Suparerk Manitpornsut and his team from the University of the Thai Chamber of Commerce. Finally, this work was supported by the Thailand Research Fund (TRF) under Project no. RSA5480026 and the Research Chair Grants 2011 from the National Science and Technology Development Agency (NSTDA), Thailand.
The publication of this article was funded by TRF grant no. RSA5480026 and by funding from the National Center for Genetic Engineering and Biotechnology (BIOTEC).
This article has been published as part of BMC Bioinformatics Volume 14 Supplement 16, 2013: Twelfth International Conference on Bioinformatics (InCoB2013): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S16.
The authors declare that they have no competing interests.
RU carried out the implementation of the MetaSel program and participated in the design of the proposed algorithm. AI, SK and ST analyzed the results and revised the draft of the metaphase selection algorithm. PY, RP and AA participated in designing the user interface of the MetaSel program. RU and AI performed the experiments and statistical analysis of this work. SK and ST drafted the manuscript. ST conceived the original idea and supervised the production of this work.
About this article
Cite this article
Uttamatanin, R., Yuvapoositanon, P., Intarapanich, A. et al. MetaSel: a metaphase selection tool using a Gaussian-based classification technique. BMC Bioinformatics 14, S13 (2013) doi:10.1186/1471-2105-14-S16-S13
- Karyotype software
- Metaphase selection
- Metaphase spread
- Rule-based classification
- Gaussian model