A comparative study of the SVM and K-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals
© Palaniappan et al.; licensee BioMed Central Ltd. 2014
Received: 4 December 2013
Accepted: 18 June 2014
Published: 27 June 2014
Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database.
The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively.
Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database.
Keywords: Respiratory sounds; MFCC; One-way ANOVA; Support vector machine; K-nearest neighbour; Confusion matrix
Auscultation is the process of listening to the internal sounds of the body using a stethoscope. This process provides vital information on the present state of the internal organs, such as the heart and lungs. The stethoscope, which was invented by René Théophile Hyacinthe Laennec in 1816, has been used to perform auscultation for nearly two centuries. The auscultation process is inexpensive, non-invasive, and less time-consuming. Computer-based respiratory sound analysis started to appear in the literature in the early 1980s. This method can assist medical professionals with differential diagnoses, which are used to diagnose the specific disease suffered by a patient or to at least eliminate any imminent life-threatening conditions. The sensors that are most commonly used for computerised respiratory sound recording are microphones, accelerometers, and digital stethoscopes. The types and characteristics of the respiratory sounds that are widely accepted have been reported by Pasterkamp et al. The dominant frequency of normal respiratory sounds ranges from 37.5 to 1000 Hz. The dominant frequency of airway obstruction pathology is less than 400 Hz, and the dominant frequency of parenchymal pathology ranges from 200 to 2000 Hz. The duration of airway obstruction pathological sounds, such as wheezes and rhonchi, is greater than 250 ms, whereas the duration of parenchymal pathological sounds, such as crackles, is less than 100 ms. The respiratory sound characteristics provided by Pasterkamp et al. clearly introduced the possibility of discriminating respiratory sounds using signal processing algorithms. However, further studies are required before computerised respiratory sound analysis can be implemented in a clinical setting. In particular, the development of a robust system requires the implementation of more sophisticated signal processing and machine learning algorithms.
Related works on respiratory sound analysis
Previous studies on computerised respiratory sound analysis have been conducted using various signal processing and machine learning algorithms. This section discusses a few prominent recent works on computerised respiratory sound analysis. In the study conducted by Güler et al., normal, wheeze, and crackle respiratory sounds were classified using their power spectral density features. Electret microphones were used to record the respiratory sounds from 129 subjects, and these were then classified using artificial neural networks (ANNs) and genetic algorithm (GA)-based ANNs. The classification accuracies found for ANN and GA-based ANN were 81-91% and 83-93%, respectively. Alsmadi et al. proposed the use of an autoregressive model for the classification of respiratory sounds. These researchers used an ECM-77B microphone to record the respiratory sounds from 42 subjects and then implemented the k-nearest neighbour algorithm (k-nn) to classify the respiratory sounds. The recognition rate was found to be 96%. Dokur et al. proposed an incremental supervised neural network for the classification of respiratory sounds. These researchers used ECM-77B Electret microphones to acquire the respiratory sounds from 18 subjects and then extracted the power spectrum features of the respiratory sounds. They then used a grow-and-learn (GAL) network, which is an incremental supervised neural network, for the classification of the respiratory sounds and found that their classification accuracy was promising compared with the previously proposed methods. Sankar et al. proposed a feedforward neural network for the classification of normal and pathological respiratory sounds based on the following features: energy index, respiration rate, dominant frequency, and strength of the dominant frequency. These researchers used an Electret microphone to record the respiratory sounds from six subjects and obtained a classification accuracy of 98.7%.
In the same year, Hashemi et al. proposed the use of wavelet-based features for the classification of respiratory sounds using a multi-layer perceptron network. These researchers used an electronic stethoscope to record the respiratory sounds from 140 subjects, and their experimental results show that their system can achieve a recognition rate of 89.28%. Flietstra et al. used a support vector machine for the recognition of respiratory sounds. These researchers used an STG 16 lung sound analyser to record the respiratory sounds from 257 subjects and used the statistical median feature to train and test the SVM classifier. A mean classification accuracy of 84% was reported.
Even though studies in this field date back to the early 1980s, computerised respiratory sound analysis has not yet reached a level at which it can be used in a clinical setting. The advancements in signal processing in recent years allow us to use more sophisticated methods for respiratory sound analysis. The literature review revealed that both the feature extraction method and the machine learning algorithm play major roles in the recognition of respiratory sounds. This study compares two different approaches for the recognition of respiratory pathologies using pulmonary acoustic signals. More specifically, the support vector machine (SVM) and k-nearest neighbour (k-nn) classifiers were implemented for the differentiation of normal, airway obstruction pathology, and parenchymal pathology conditions using the cepstral features obtained from respiratory sounds in the RALE database.
Respiratory sound database
Respiratory sound Pre-processing
Respiratory sound signals are subject to noise, such as heart sounds and other artefacts. The RALE database comprises recordings that have been filtered to remove the heart sounds and artefacts. The respiratory sound signals were high-pass filtered at 7.5 Hz using a first-order Butterworth filter to remove the DC offset and low-pass filtered at 2.5 kHz using an eighth-order Butterworth filter to avoid aliasing. The sampling rate of the respiratory sounds was 10 kHz.
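The two filtering stages described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the processing chain applied to the RALE recordings: the coefficients are derived with a pre-warped bilinear transform (the design method is not stated in the text), and the function names `highpass_first_order`, `lowpass_butterworth`, and `preprocess` are hypothetical. In practice, a signal processing library would normally be used instead of hand-written recursions.

```python
import math

FS = 10_000.0  # sampling rate in Hz, as stated in the text


def highpass_first_order(x, fc=7.5, fs=FS):
    """First-order Butterworth high-pass (bilinear transform, pre-warped)."""
    k = math.tan(math.pi * fc / fs)
    b0 = 1.0 / (1.0 + k)
    a1 = (k - 1.0) / (k + 1.0)
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        # y[n] = b0 * (x[n] - x[n-1]) - a1 * y[n-1]
        yn = b0 * (xn - x_prev) - a1 * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def lowpass_butterworth(x, fc=2500.0, fs=FS, order=8):
    """Even-order Butterworth low-pass as a cascade of second-order sections."""
    if order % 2 != 0:
        raise ValueError("this sketch handles even orders only")
    k = math.tan(math.pi * fc / fs)
    y = list(x)
    for i in range(1, order // 2 + 1):
        # 1/Q of the i-th conjugate pole pair of the analogue prototype
        inv_q = 2.0 * math.sin((2 * i - 1) * math.pi / (2 * order))
        norm = 1.0 + k * inv_q + k * k
        b0 = k * k / norm
        b1 = 2.0 * b0
        b2 = b0
        a1 = 2.0 * (k * k - 1.0) / norm
        a2 = (1.0 - k * inv_q + k * k) / norm
        out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
        for xn in y:
            yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            out.append(yn)
            x2, x1 = x1, xn
            y2, y1 = y1, yn
        y = out
    return y


def preprocess(x):
    """Remove the DC offset, then band-limit the signal to 2.5 kHz."""
    return lowpass_butterworth(highpass_first_order(x))
```

Applied to a test signal containing a DC offset, a 100 Hz tone, and a 4 kHz tone, `preprocess` should leave essentially only the 100 Hz component once the start-up transient has decayed.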
where Mel(f) is the mel-scale (logarithmic) value corresponding to the linear frequency f. The log mel spectrum is then converted back to the time domain using a discrete cosine transform, and the output is the set of MFCCs. The MFCCs obtained from the respiratory sounds are then used as features in the SVM and k-nn classifiers. In this study, 13 MFCCs were extracted for the classification of the respiratory sounds.
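The last two steps of the MFCC computation (mel warping and the discrete cosine transform) can be sketched as follows. The mel formula used here, Mel(f) = 2595 log10(1 + f/700), is one widely used form and is an assumption, since the exact expression used in the study is not reproduced in this text. The function names and the unnormalised DCT-II are likewise illustrative choices, and the mel filterbank energies are assumed to have been computed beforehand.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to mels (assumed form: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def dct_ii(values):
    """Unnormalised DCT-II: c[k] = sum_n v[n] * cos(pi * k * (n + 0.5) / N)."""
    n_total = len(values)
    return [
        sum(v * math.cos(math.pi * k * (n + 0.5) / n_total)
            for n, v in enumerate(values))
        for k in range(n_total)
    ]


def mfcc_from_filterbank(energies, n_coeffs=13):
    """Log-compress positive mel filterbank energies and keep the first
    n_coeffs DCT coefficients (13 in this study)."""
    log_energies = [math.log(e) for e in energies]
    return dct_ii(log_energies)[:n_coeffs]
```

With this form of the mel scale, 1000 Hz maps to approximately 1000 mels, which is a convenient sanity check.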
In this study, analysis of variance (ANOVA) was used to test the significance of the feature vector. One-way ANOVA tests the null hypothesis that the means of three or more groups are equal, and it does so in a single test by comparing the between-group and within-group variances.
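As an illustration of the test described above, the F statistic of a one-way ANOVA can be computed as the ratio of the between-group mean square to the within-group mean square. The sketch below uses a hypothetical helper name, `one_way_anova_f`, and made-up numbers rather than the MFCC features of the study; obtaining the p-value additionally requires the F distribution, which is omitted here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples,
    one inner list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values of one feature for three classes
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F value relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs from the others.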
In this work, two different classifiers were used, namely support vector machine (SVM) and k-nearest neighbour (k-nn). A detailed description of the classifiers used can be found in this section.
Support Vector Machine (SVM)
The SVM classifier is a kernel-based supervised learning algorithm that classifies data into two or more classes, although it was originally designed for binary classification. During the training phase, SVM builds a model, maps the decision boundary for each class, and specifies the hyperplane that separates the different classes. Increasing the hyperplane margin, and thus the distance between the classes, helps to increase the classification accuracy. SVM can also be used to effectively perform non-linear classification. Detailed information on the SVM classifier can be found in [18, 19]. In this study, the MFCC feature vector was fed to the SVM classifier to distinguish normal, airway obstruction, and parenchymal pathological conditions.
As mentioned earlier, the SVM classifier is a kernel-based classifier. A kernel function maps the training set into a space in which it more closely resembles a linearly separable data set. The purpose of the mapping is to increase the dimensionality of the data set, and the kernel function allows this to be done efficiently. Commonly used kernel functions include the linear, RBF, quadratic, multilayer perceptron, and polynomial kernels. In this work, the linear and RBF kernel functions were used because of their dissimilar characteristics: the linear kernel performs well with linearly separable data sets, whereas the RBF kernel performs well with non-linear data sets. Compared with the RBF kernel, the linear kernel takes less time to train the SVM and is less prone to overfitting [20, 21]. The performance of the SVM classifier relies on the choice of the regularisation parameter C, which is also known as the box constraint, and the kernel parameter, which is also known as the scaling factor. Together, they are known as the hyperplane parameters.
The value of the box constraint C for the soft margin was set to 1 for both linear and RBF kernel. The scaling factor σ for the RBF kernel was set to 1.
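To make the two kernels concrete, the sketch below evaluates a linear and an RBF kernel on small feature vectors. The RBF form exp(-||x - y||^2 / (2σ^2)) is one common parameterisation of the scaling factor σ and is an assumption, as the exact expression is not given in the text; the function names are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel with scaling factor sigma (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(samples, kernel):
    """Pairwise kernel values, as consumed by a kernel-based classifier."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

With σ = 1, two vectors at a squared distance of 2 have an RBF similarity of exp(-1), whereas identical vectors always have a similarity of 1.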
K-Nearest Neighbor (k-nn)
Results and discussion
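Table 1. Performance outcome of the SVM classifier for binary-normalised data
Validation method    Kernel    Classification accuracy (%)
Conventional         Linear    86.91 ± 1.47
Conventional         RBF       89.54 ± 0.39
Ten-fold             Linear    90.13 ± 0.54
Ten-fold             RBF       91.47 ± 1.66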
Table 2. Performance outcome of the k-nn classifier for binary-normalised data
k value    Conventional validation, classification accuracy (%)    Ten-fold cross-validation, classification accuracy (%)
1          94.66 ± 0.56                                             96.65 ± 0.69
2          94.81 ± 0.42                                             96.35 ± 0.93
3          93.92 ± 0.68                                             95.06 ± 0.34
4          94.53 ± 0.47                                             94.41 ± 0.62
5          94.25 ± 0.94                                             95.02 ± 1.21
6          93.95 ± 1.15                                             94.41 ± 1.35
7          92.70 ± 1.27                                             93.20 ± 1.08
8          92.57 ± 0.85                                             94.44 ± 1.28
9          92.49 ± 0.63                                             93.32 ± 0.74
10         91.77 ± 0.77                                             92.17 ± 1.46
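Table 3. Performance outcome of the SVM classifier for bipolar-normalised data
Validation method    Kernel    Classification accuracy (%)
Conventional         Linear    89.17 ± 1.32
Conventional         RBF       91.36 ± 1.69
Ten-fold             Linear    89.91 ± 2.39
Ten-fold             RBF       92.19 ± 1.58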
Table 4. Performance outcome of the k-nn classifier for bipolar-normalised data
k value    Conventional validation, classification accuracy (%)    Ten-fold cross-validation, classification accuracy (%)
1          97.53 ± 0.29                                             98.26 ± 0.32
2          97.62 ± 0.58                                             97.11 ± 0.75
3          97.56 ± 1.25                                             97.52 ± 0.97
4          96.23 ± 1.36                                             97.67 ± 0.46
5          96.92 ± 2.54                                             97.53 ± 0.48
6          95.25 ± 1.27                                             97.82 ± 0.62
7          95.51 ± 1.58                                             96.65 ± 0.96
8          95.88 ± 1.69                                             96.41 ± 1.19
9          95.88 ± 0.58                                             96.65 ± 1.36
10         95.14 ± 1.69                                             96.25 ± 0.85
As shown in Table 1, the SVM classifier with the RBF kernel obtained the maximum classification accuracy of 89.54% with a standard deviation of 0.39 for the binary-normalised data with the conventional validation method. Similarly, as shown in Table 2, the k-nn classifier with a k value of 2 gave the maximum classification accuracy of 94.81% with a standard deviation of 0.42 for the binary-normalised data with the conventional validation method. Table 3 shows that the SVM classifier with the RBF kernel gives the maximum classification accuracy of 91.36% with a standard deviation of 1.69 for the bipolar-normalised data with the conventional validation method. Similarly, as shown in Table 4, the k-nn classifier with a k value of 2 obtained the maximum classification accuracy of 97.62% with a standard deviation of 0.58 for the bipolar-normalised data with the conventional validation method.
The data shown in Table 1 reveal that the SVM classifier with the RBF kernel gives the maximum classification accuracy of 91.47% with a standard deviation of 1.66 for the binary-normalised data with the ten-fold cross-validation method. Similarly, Table 2 shows that the k-nn classifier with a k value of 1 gives the maximum classification accuracy of 96.65% with a standard deviation of 0.69 for the binary-normalised data with the ten-fold cross-validation method. Table 3 reveals that the SVM classifier with the RBF kernel obtains the maximum classification accuracy of 92.19% with a standard deviation of 1.58 for the bipolar-normalised data with the ten-fold cross-validation method. Similarly, as shown in Table 4, the k-nn classifier with a k value of 1 gives the maximum classification accuracy of 98.26% with a standard deviation of 0.32 for the bipolar-normalised data with the ten-fold cross-validation method. The results obtained demonstrate that the k-nn classifier outperforms the SVM classifier in the discrimination of respiratory pathologies. For the SVM classifier, the RBF kernel combined with the ten-fold cross-validation method yields the maximum classification accuracy for the diagnosis of respiratory pathology. Similarly, the k-nn classifier with a k value of 1 and the ten-fold cross-validation method yields the maximum classification accuracy. Both of these machine learning methods achieve their maximum classification accuracy when the data are bipolar normalised.
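The ten-fold cross-validation procedure referred to above can be sketched as follows: the samples are shuffled, split into ten folds, and each fold serves once as the test set while the remaining nine are used for training. The helper names, the fixed random seed, and the `fit_predict` callback are assumptions made for illustration and do not describe the authors' implementation.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.
    Assumes n_samples >= n_folds so that no fold is empty."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(features, labels, fit_predict, n_folds=10, seed=0):
    """Mean accuracy over the folds.
    fit_predict(train_x, train_y, test_x) must return predicted labels."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(features), n_folds, seed):
        predictions = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

Because every sample is used for testing exactly once, the averaged accuracy depends less on one particular train/test split than a single hold-out evaluation does.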
Confusion matrix for the SVM classifier (best results: kernel = RBF; normalisation = bipolar)
Confusion matrix for the k-nn classifier (best results: k value = 1; normalisation = bipolar)
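A confusion matrix for the three classes considered in this study can be built as sketched below, with rows corresponding to the true class and columns to the predicted class. The class label strings, the function names, and the toy labels are hypothetical; the counts do not reproduce the results of the study.

```python
CLASSES = ["normal", "airway obstruction", "parenchymal"]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the main diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Toy example with made-up labels
true = ["normal", "normal", "airway obstruction", "airway obstruction",
        "parenchymal", "parenchymal"]
pred = ["normal", "airway obstruction", "airway obstruction",
        "airway obstruction", "parenchymal", "normal"]
cm = confusion_matrix(true, pred)
```

The off-diagonal entries show which pairs of classes are confused with each other, which the overall accuracy alone does not reveal.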
This comparative study shows that the generalisation capability of the k-nn classifier is higher than that of the SVM classifier in the diagnosis of respiratory pathologies from the RALE database. However, the computational complexity of the k-nn classifier is high compared with that of the SVM classifier. The results obtained using the k-nn classifier show that the classifier performs better when the k value is small. Because the k-nearest neighbour classifier uses a majority voting scheme, for a dataset with n data points the n-nearest neighbour classifier will always use all of the data points to classify a new point, whereas when k = 1 only the single nearest data point is used. Increasing the k value increases the number of neighbours, which may decrease the performance because the chance of including a data point from a different class rises with the number of nearest neighbours. The incremental property of the k-nn machine learning algorithm is better than that of the SVM classifier. This property allows the k-nn classifier to perform better than the SVM classifier in classifying the pulmonary acoustic signals. The pulmonary acoustic signals are non-linear and non-stationary. The k-nn classifier is a non-linear classifier, whereas the SVM can be either linear or non-linear: when the linear kernel function is used the SVM acts as a linear classifier, and when the RBF kernel is used it acts as a non-linear classifier. The classification accuracy of the SVM with the linear kernel is low compared with that of the other classifiers because of the non-linear and non-stationary properties of the pulmonary acoustic signals. The main limitation of this study is the small amount of data used; in addition, the data collection was carried out in a controlled environment. The analysis of data from clinical settings should be carried out in the future with a larger database.
The analysis can be further extended to other feature extraction techniques and machine learning algorithms.
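The effect of the k value on majority voting discussed above can be illustrated with a minimal k-nn sketch. The Euclidean distance, the function name `knn_predict`, and the toy two-dimensional points are assumptions chosen for illustration; they are not the MFCC features or the implementation used in this study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Label of the majority class among the k nearest training points.
    Ties in the vote are resolved in favour of the label met first,
    that is, the one belonging to the closer neighbour."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy data: two "normal" points near the query, three "wheeze" points far away
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["normal", "normal", "wheeze", "wheeze", "wheeze"]
query = (1.0, 1.0)

label_k1 = knn_predict(train_x, train_y, query, k=1)
label_k5 = knn_predict(train_x, train_y, query, k=5)
```

In this toy case the query lies next to the "normal" points, so small k values return "normal", whereas k = 5 uses every training point and the distant majority class wins the vote.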
This study compared the performance of the SVM and k-nn classifiers for the classification of respiratory pathologies from the RALE lung sound database. To do so, the MFCC features of respiratory sounds obtained from the RALE database were extracted. The extracted feature vectors were analysed through one-way ANOVA and were found to differ significantly (p < 0.001). The maximum classification accuracies for the SVM and k-nn classifiers were found to be 92.19% and 98.26%, respectively. The maximum classification accuracy of the SVM classifier was obtained with the RBF kernel, the ten-fold cross-validation method, and bipolar-normalised data. Similarly, the maximum classification accuracy of the k-nn classifier was obtained with a k value of 1, the ten-fold cross-validation method, and bipolar-normalised data. These findings show that the generalisation capability of the k-nn classifier is higher than that of the SVM classifier for the classification of respiratory pathologies from the RALE lung sound database.
The authors of this research wish to thank Prof. H. Pasterkamp and Mr. Chris Carson (PixSoft Inc.) for sharing the RALE lung sound database.
1. Palaniappan R, Sundaraj K, Ahamed NU, Arjunan A, Sundaraj S: Computer-based respiratory sound analysis: a systematic review. IETE Tech Rev. 2013, 30: 248-256. 10.4103/0256-4602.113524.
2. Abbas A, Fahim A: An automated computerized auscultation and diagnostic system for pulmonary diseases. J Med Syst. 2010, 34: 1149-1155. 10.1007/s10916-009-9334-1.
3. Pasterkamp H, Kraman SS, Wodicka G: Respiratory sounds: advances beyond the stethoscope. Am J Respir Crit Care Med. 1997, 156: 974-987. 10.1164/ajrccm.156.3.9701115.
4. Palaniappan R, Sundaraj K, Ahamed NU: Machine learning in lung sound analysis: a systematic review. Biocybern Biomed Eng. 2013, 33: 129-135. 10.1016/j.bbe.2013.07.001.
5. Güler İ, Polat H, Ergün U: Combining neural network and genetic algorithm for prediction of lung sounds. J Med Syst. 2005, 29: 217-231. 10.1007/s10916-005-5182-9.
6. Alsmadi S, Kahya YP: Design of a DSP-based instrument for real-time classification of pulmonary sounds. Comput Biol Med. 2008, 38: 53-61. 10.1016/j.compbiomed.2007.07.001.
7. Dokur Z: Respiratory sound classification by using an incremental supervised neural network. Pattern Anal Appl. 2009, 12: 309-319. 10.1007/s10044-008-0125-y.
8. Sankar AB, Kumar D, Seethalakshmi K: Neural network based respiratory signal classification using various feed-forward back propagation training algorithms. Eur J Sci Res. 2011, 49: 468-483.
9. Hashemi A, Arabalibiek H, Agin K: Classification of wheeze sounds using wavelets and neural networks. International Conference on Biomedical Engineering and Technology. 2011, IACSIT Press, 127-131.
10. Flietstra B, Markuzon N, Vyshedskiy A, Murphy R: Automated analysis of crackles in patients with interstitial pulmonary fibrosis. Pulm Med. 2011, 2011: 1-7.
11. Gross V, Dittmar A, Penzel T, Schüttler F, von Wichert P: The relationship between normal lung sounds, age, and gender. Am J Respir Crit Care Med. 2000, 162: 905-909. 10.1164/ajrccm.162.3.9905104.
12. Fiz JA, Jané R, Lozano M, Gómez R, Ruiz J: Detecting unilateral phrenic paralysis by acoustic respiratory analysis. PLoS ONE. 2014, 9: e93595. 10.1371/journal.pone.0093595.
13. Pasterkamp H: RALE: A computer-assisted instructional package. Respir Care. 1990, 35: 1006-
14. Palaniappan R, Sundaraj K, Sundaraj S: Artificial intelligence techniques used in respiratory sound analysis – a systematic review. Biomedizinische Technik/Biomed Eng. 2014, 59: 7-18.
15. Bahoura M: Pattern recognition methods applied to respiratory sounds classification into normal and wheeze classes. Comput Biol Med. 2009, 39: 824-843. 10.1016/j.compbiomed.2009.06.011.
16. Mayorga P, Druzgalski C, Morelos RL, Gonzalez OH, Vidales J: Acoustics based assessment of respiratory diseases using GMM classification. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2010. 2010, Buenos Aires: IEEE, 6312-6316.
17. Mahapoonyanont N, Mahapoonyanont T, Pengkaew N, Kamhangkit R: Power of the test of one-way Anova after transforming with large sample size data. Procedia Soc Behav Sci. 2010, 9: 933-937.
18. Tsai C-F, Hsu Y-F, Lin C-Y, Lin W-Y: Intrusion detection by machine learning: a review. Expert Syst Appl. 2009, 36: 11994-12000. 10.1016/j.eswa.2009.05.029.
19. Cortes C, Vapnik V: Support-vector networks. Mach Learn. 1995, 20: 273-297.
20. Suykens JAK, Vandewalle J: Least squares support vector machine classifiers. Neural Process Lett. 1999, 9: 293-300. 10.1023/A:1018628609742.
21. Maji S, Berg AC, Malik J: Classification using intersection kernel support vector machines is efficient. IEEE Conference on Computer Vision and Pattern Recognition. 2008, Anchorage, AK: IEEE, 1-8.
22. Hmeidi I, Hawashin B, El-Qawasmeh E: Performance of KNN and SVM classifiers on full word Arabic articles. Adv Eng Inform. 2008, 22: 106-111. 10.1016/j.aei.2007.12.001.
23. Pan F, Wang B, Hu X, Perrizo W: Comprehensive vertical sample-based KNN/LSVM classification for gene expression analysis. J Biomed Inform. 2004, 37: 240-248. 10.1016/j.jbi.2004.07.003.
24. Quackenbush J: Microarray data normalization and transformation. Nat Genet. 2002, 32: 496-501. 10.1038/ng1032.
25. Bhaskar H, Hoyle DC, Singh S: Machine learning in bioinformatics: a brief survey and recommendations for practitioners. Comput Biol Med. 2006, 36: 1104-1125. 10.1016/j.compbiomed.2005.09.002.
26. Beyer K, Goldstein J, Ramakrishnan R, Shaft U: When is “nearest neighbor” meaningful? Database Theory - ICDT’99. Edited by: Beeri C, Buneman P. 1999, London, UK: Springer-Verlag, 1540: 217-235. 10.1007/3-540-49257-7_15.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.