A new fruit fly optimization algorithm enhanced support vector machine for diagnosis of breast cancer based on high-level features

Huang, Hui; Feng, Xi’an; Zhou, Suying; Jiang, Jionghui; Chen, Huiling; Li, Yuping; Li, Chengye

doi:10.1186/s12859-019-2771-z

Volume 20 Supplement 8

Decipher computational analytics in digital health and precision medicine

Research
Open access
Published: 10 June 2019



BMC Bioinformatics volume 20, Article number: 290 (2019)


Abstract

Background

Developing an accurate computer-aided system for diagnosing breast cancer is of great clinical significance. In this study, an enhanced machine learning framework is established for this purpose. The core of the framework is a fruit fly optimization algorithm (FOA) enhanced by a Levy flight (LF) strategy (LFOA), which optimizes the two key parameters of a support vector machine (SVM) to build an LFOA-based SVM (LFOA-SVM) for breast cancer diagnosis. High-level features abstracted from the volunteers are used to diagnose breast cancer for the first time.

Results

To verify the effectiveness of the proposed method, 10-fold cross-validation is used to compare it with FOA-SVM (a model based on the original FOA), PSO-SVM (a model based on the original particle swarm optimization), GA-SVM (a model based on the genetic algorithm), random forest, back propagation neural network and SVM. The main novelty of LFOA-SVM lies in combining FOA with the LF strategy, which improves the search quality of FOA by raising both the convergence rate of the optimization process and the probability of escaping from local optima.

Conclusions

The experimental results demonstrate that the proposed LFOA-SVM method outperforms its counterparts on various performance metrics. It distinguishes malignant breast tumors from benign ones well and can assist doctors with clinical diagnosis.

Background

Breast cancer is the most common cancer and the leading cause of cancer death among females [1]. Early detection and diagnosis are the key to controlling the disease and improving the survival rate, and pathological diagnosis is the most reliable gold standard among the available methods. Traditional diagnostic methods rely mostly on clinicians’ personal experience, so the diagnostic results may be subjective. In recent years, computational diagnostic tools and artificial intelligence techniques have provided automated procedures for objective judgments by making use of quantitative measures and machine learning techniques for medical diagnosis [2,3,4,5,6,7,8,9,10,11]. Methods based on artificial intelligence have likewise been proposed for the diagnosis of breast cancer. Maglogiannis et al. [12] used support vector machines (SVM) to diagnose breast cancer on both the Wisconsin Diagnostic Breast Cancer and the Wisconsin Prognostic Breast Cancer datasets. Kaya et al. [13] proposed a novel approach based on rough sets and an extreme learning machine for distinguishing benign from malignant breast cancer. Akay et al. [14] proposed an SVM combined with feature selection for breast cancer diagnosis; their experimental results indicate that the method performs well in terms of accuracy, sensitivity and specificity. Given recent advances in digitized histological studies, it is now possible to use histological tissue patterns with image analysis aided by artificial intelligence techniques to facilitate disease classification [15]. In general, accurate pathological diagnosis of breast cancer depends on features extracted from histopathology images, and many works diagnose breast cancer based on such features.

Kuse et al. [16] extracted texture features from the cells to train an SVM classifier that is used to classify lymphocytes and non-lymphocytes. Dundar et al. [17] proposed to segment cell regions by clustering the pixel data and to identify individual cells by a watershed-based segmentation algorithm; a multiple instance learning (MIL) approach was then used to identify the stage of the breast lesion. Sparks et al. [18] presented a content-based image retrieval (CBIR) system that leveraged a novel set of explicit shape features which accurately described the similarity between the morphology of objects of interest. Basavanhally et al. [19] presented a novel framework that classifies entire images based on quantitative features extracted from fields of view (FOVs) of varying sizes. In each FOV, cancer nuclei were automatically detected and used to construct graphs (Voronoi diagram, Delaunay triangulation, minimum spanning tree). Features describing the spatial arrangement of the nuclei were extracted and used to train a boosted classifier that predicts the image class for each FOV size.

A common pattern can be observed in all the aforementioned works: they were usually conducted on low-level features of image pixels and the high-level features were discarded, which means that these studies may not express prior medical knowledge. Therefore, in this paper, we propose to diagnose breast cancer using high-level features defined on the basis of prior medical knowledge. This definition relies on two very experienced pathologists. Because these features encode the experience of doctors, and doctors with clinical experience are generally well able to differentiate between breast tumors and breast cancer, the features are also easier to interpret. We extracted a set of high-level features, including 13 key features, which form the basis for the classification and grading of breast pathology. Based on these features, the pathological data of 470 cases were analyzed by two pathological experts. Then, we proposed a novel learning framework based on SVM for distinguishing malignant breast tumors from benign ones. As is well known, the two key parameters in the classic SVM are the penalty factor and the width of the kernel function, which are traditionally tuned by means of grid search and gradient descent. However, these methods easily get trapped in local optimal solutions. Recently, some bio-inspired metaheuristic search algorithms (such as genetic algorithms (GA) [20,21,22,23], particle swarm optimization (PSO) [24,25,26,27], fruit fly optimization (FOA) [28] and moth-flame optimization (MFO) [29]) have made it easier to find the global optimal solution. As a new member of the swarm-intelligence algorithms, FOA [30] is inspired by the foraging behavior of real fruit flies. FOA has certain outstanding merits, such as a simple computational process, simple implementation, and easy understanding with only a few parameters for tuning. Owing to these properties, FOA has become a useful tool for many real-world problems [10, 28, 31,32,33].

Compared with gradient descent and grid search, FOA, like other swarm intelligence methods [34, 35], is a global optimization method that can find the global optimal solution or an approximately optimal solution more easily. However, the traditional FOA may still fall into local optima on complex optimization problems, and its convergence rate is not ideal. Therefore, this paper introduces the Levy flight (LF) strategy to update the positions of the fruit flies, which further improves the convergence speed while reducing the probability of FOA falling into a local optimum. The LF strategy has been widely used to enhance many metaheuristic algorithms [36,37,38,39,40,41]. It helps maintain the diversity of the population during optimization [42,43,44] and improves the convergence rate. In this study, the improved FOA method, LFOA, was used to optimize the two key parameters of SVM, namely the penalty factor and the width of the kernel function, and to obtain the optimal model (LFOA-SVM). This model is then investigated for diagnosing breast cancer on the high-level features dataset. As far as we know, this paper is the first to solve the parameter optimization problem of SVM with LFOA. In the experiment, 10-fold cross-validation was used to make a detailed comparison between LFOA-SVM, FOA-SVM (model based on the original fruit fly optimization algorithm), GA-SVM (model based on genetic algorithms), PSO-SVM (model based on particle swarm optimization algorithms), random forest (RF), back propagation neural network (BPNN) and SVM. The experimental results demonstrated that the proposed LFOA-SVM was superior to the other methods in terms of classification accuracy, Matthews correlation coefficient (MCC), sensitivity and specificity.

The rest of this paper is organized as follows. The Preliminaries section introduces the background information used in the study. The Methods section presents the detailed implementation of the proposed method. The Results and discussion section covers the experimental design, results and discussion. Finally, the Conclusions section summarizes the conclusions and recommendations for future work.

Preliminaries

Support vector machine

A support vector machine (SVM) [45] is a supervised learning model, with an associated learning algorithm, for analyzing data in classification and regression analysis. Given a set of training instances, each marked as belonging to one of two classes, the SVM training algorithm creates a model that assigns a new instance to one of the two classes, making it a non-probabilistic binary linear classifier.

The SVM model represents instances as points in space, mapped so that instances of the separate categories are divided by a gap that is as wide and clear as possible. New instances are then mapped into the same space and their category is predicted based on which side of the gap they fall. In addition to linear classification, SVM can use the so-called kernel trick to perform nonlinear classification efficiently, implicitly mapping its inputs into a high-dimensional feature space.

More formally, a support vector machine constructs a hyperplane, or a set of hyperplanes, in a high-dimensional or infinite-dimensional space, which can be used for classification, regression or other tasks. Intuitively, the farther the hyperplane is from the nearest training data point of any class, the better, because a larger margin generally reduces the generalization error of the classifier.
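To make the role of the kernel width concrete, the following minimal sketch evaluates a Gaussian (RBF) kernel. The text above does not name the kernel, so the RBF form, the function name `rbf_kernel` and the sample points are assumptions chosen for illustration only.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two illustrative points whose squared Euclidean distance is 2.
x = [1.0, 0.0]
z = [0.0, 1.0]

# Large gamma: similarity decays quickly with distance (narrow kernel).
narrow = rbf_kernel(x, z, gamma=2.0)
# Small gamma: distant points still look similar (wide kernel).
wide = rbf_kernel(x, z, gamma=2 ** -5)
```

Under this assumption, γ controls how quickly similarity falls off with distance, which is why it has to be tuned together with the penalty factor C.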

Fruit-Fly optimization algorithms

The fruit fly optimization algorithm (FOA) [30] is a meta-heuristic algorithm inspired by the foraging behavior of fruit flies. A fruit fly relies on smell and vision to locate food during foraging. When solving an optimization problem, FOA searches the solution space by mimicking the way fruit flies fly. First, the fruit fly population (the candidate solutions) is randomly generated in the solution space; each fruit fly then updates its position according to the flight mode of the fruit fly. Over the iterations, the population continuously improves its fitness (the quality of the solution).
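The description above can be sketched as follows. This is one common reading of the basic FOA (random flight around the swarm location, a smell concentration judged from the reciprocal of the distance to the origin, and a vision step that moves the swarm to the best fly); the step range, the initial area, the objective and all names are assumptions, not the settings used in this study.

```python
import math
import random


def foa_minimize(objective, pop_size=8, max_iter=50, seed=0):
    """Minimal FOA sketch for a one-dimensional decision variable s > 0."""
    rng = random.Random(seed)
    # Random initial swarm location in the 2-D plane (assumed range).
    x_axis, y_axis = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
    best_s, best_smell = None, float("inf")
    history = []
    for _ in range(max_iter):
        flies = []
        for _ in range(pop_size):
            # Smell-based search: random flight around the swarm location.
            x = x_axis + rng.uniform(-1.0, 1.0)
            y = y_axis + rng.uniform(-1.0, 1.0)
            dist = math.hypot(x, y)
            s = 1.0 / max(dist, 1e-12)  # smell concentration judgment value
            flies.append((objective(s), s, x, y))
        smell, s, x, y = min(flies)  # fly with the best (lowest) smell
        if smell < best_smell:
            # Vision-based search: the swarm flies to the best fly found.
            best_smell, best_s = smell, s
            x_axis, y_axis = x, y
        history.append(best_smell)
    return best_s, best_smell, history


def objective(s):
    """Toy objective with its minimum at s = 0.5 (illustrative only)."""
    return (s - 0.5) ** 2


best_s, best_smell, history = foa_minimize(objective)
```

Because the swarm only moves when a better fly is found, the recorded best value never gets worse from one iteration to the next.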

Levy flight

The Levy flight (LF) mechanism is often used to improve meta-heuristics because its characteristics are similar to the movement of many animals in nature. This phenomenon is described by Levy statistics [46]. LF is essentially a stochastic non-Gaussian random walk whose step lengths are drawn from a Levy stable distribution. The Levy distribution can be represented by the following equation:

$$ Levy(s)\sim {\left|s\right|}^{-1-\beta },0<\beta \le 2 $$

(1)

where β is the Levy index, which adjusts the stability of the distribution, and s is the step length.
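Step lengths with the heavy tail of Eq. (1) are often generated with Mantegna's algorithm. The text does not state which sampler was used, so the sketch below is an assumption; the function names and the choice β = 1.5 are illustrative. Note that this sampler is intended for 0 < β < 2, since the scale factor vanishes at β = 2.

```python
import math
import random


def mantegna_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one signed step whose tail follows |s|^(-1 - beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against a division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


rng = random.Random(0)
steps = [levy_step(1.5, rng) for _ in range(1000)]
```

Most of the sampled steps are small, while a few are much larger; these occasional long jumps are what help a search escape a local optimum.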

Methods

Levy flight enhanced FOA (LFOA)

Levy flight is characterized by many short steps, interspersed with occasional long jumps, in random directions. This feature can effectively prevent the whole population from falling into a local optimum, thus enhancing the global exploration ability of the algorithm. In this paper, we introduce the LF strategy into FOA to explore the search space more efficiently. The new position is updated according to the following rule:

$$ {X}_i^{levy}={X}_i+{X}_i\oplus levy(s) $$

(2)

where $ {X}_i^{levy} $ is the new position of the ith search agent $ {X}_i $ after updating and ⊕ denotes entry-wise multiplication.
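Eq. (2) can be sketched in a few lines. Two points are assumptions here: ⊕ is read as entry-wise multiplication with an independent Levy step per coordinate, and the result is clamped to the search bounds, which the text does not specify. The sampler is passed in as a function so that the update itself stays easy to check.

```python
def levy_update(position, levy_sampler, lower, upper):
    """Apply X_new = X + X (*) levy(s) coordinate by coordinate."""
    new_position = []
    for x, lo, hi in zip(position, lower, upper):
        candidate = x + x * levy_sampler()  # Eq. (2) for one coordinate
        # Clamping to [lo, hi] is an assumption, not part of Eq. (2).
        new_position.append(min(max(candidate, lo), hi))
    return new_position


# A fixed "step" of 0.5 makes the arithmetic easy to follow.
moved = levy_update([2.0, -4.0], lambda: 0.5, [-10.0, -10.0], [10.0, 10.0])
# A very large step is cut back to the bounds.
clamped = levy_update([2.0, -4.0], lambda: 10.0, [-10.0, -10.0], [10.0, 10.0])
```

With the fixed step of 0.5, each coordinate grows by half of its own value, which shows that the update is proportional to the current position rather than a fixed-size move.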

Proposed LFOA-SVM model

This study proposes a novel evolutionary SVM that employs the LFOA strategy; the resultant LFOA-SVM model can adaptively determine the two key hyper-parameters of SVM. The general framework of the proposed method is shown in Fig. 1. The proposed model consists of two main procedures: the inner parameter optimization and the outer classification performance evaluation. During the inner parameter optimization, the SVM parameters are dynamically adjusted by LFOA via 5-fold cross validation (CV). The optimal parameters obtained are then fed to the SVM prediction model to perform the classification task for breast cancer diagnosis in the outer loop using 10-fold CV. The classification accuracy was used as the fitness function:

$$ fitness=\left({\sum}_{i=1}^K AC{C}_i\right)/K $$

(3)

where $ AC{C}_i $ is the accuracy achieved by the SVM classifier on the ith fold of the 5-fold CV and K is the number of folds (here K = 5).
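As an illustration of Eq. (3), the sketch below averages the validation accuracy over K folds. The contiguous fold split, the `fit_predict` callback signature and the two toy classifiers are assumptions made to keep the example self-contained; in the study the classifier would be an SVM trained with the candidate parameters.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_fitness(fit_predict, X, y, k=5):
    """Eq. (3): mean accuracy over the k validation folds."""
    accuracies = []
    for fold in kfold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        predictions = fit_predict([X[i] for i in train],
                                  [y[i] for i in train],
                                  [X[i] for i in fold])
        correct = sum(p == y[i] for p, i in zip(predictions, fold))
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k


def sign_classifier(train_X, train_y, test_X):
    """Stand-in classifier: predicts class 1 for positive inputs."""
    return [1 if x[0] > 0 else 0 for x in test_X]


def always_one(train_X, train_y, test_X):
    """Stand-in classifier that ignores its input."""
    return [1 for _ in test_X]


X = [[-2], [-1], [1], [2], [-3], [3], [-4], [4], [-5], [5]]
y = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
good_fitness = cv_fitness(sign_classifier, X, y, k=5)
poor_fitness = cv_fitness(always_one, X, y, k=5)
```

The optimizer only sees this single number per candidate parameter pair, so a pair with a higher mean fold accuracy is treated as the fitter fruit fly.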

The main steps conducted by the LFOA-SVM are described in detail as follows:

Step 1: Initialize the input parameters for LFOA, including the population size, the maximum number of iterations, the upper and lower bounds of the variables, and the dimension of the problem.
Step 2: Randomly generate the position of the fruit fly swarm within the upper and lower bounds of the variables.
Step 3: Generate the initial population for LFOA based on the position of the fruit fly swarm.
Step 4: Evaluate the fitness of every fruit fly in the population by training the SVM with the fly’s position as its parameters.
Step 5: Take the position of the best fruit fly as the position of the fruit fly swarm (global optimum).
Step 6: Update the position of each fruit fly in the swarm with the Levy-flight mechanism and evaluate its fitness.
Step 7: Update the global optimum if the fitness of the best individual in the population is better than that of the global optimum.
Step 8: Update the iteration counter, t = t + 1. If t has not exceeded the maximum number of iterations, go to Step 6; otherwise, go to Step 9.
Step 9: Return the global optimum as the optimal SVM parameter pair (C, γ).
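The steps above can be sketched as a single loop. This is one possible reading, not the authors' implementation: flies are regenerated around the current global optimum before the Levy move, steps are sampled with Mantegna's algorithm, positions are clamped to the bounds, the search runs over the exponents of C and γ, and `toy_fitness` is a hypothetical stand-in for the 5-fold CV accuracy of an SVM.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One Levy-distributed step via Mantegna's algorithm (assumed sampler)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u, v = rng.gauss(0.0, sigma), rng.gauss(0.0, 1.0)
    while v == 0.0:
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def clamp(vec, lower, upper):
    return [min(max(x, lo), hi) for x, lo, hi in zip(vec, lower, upper)]


def lfoa_maximize(fitness, lower, upper, pop_size=8, max_iter=250, seed=0):
    rng = random.Random(seed)                                       # Step 1
    swarm = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]   # Step 2
    flies = [clamp([x + rng.uniform(-1.0, 1.0) for x in swarm], lower, upper)
             for _ in range(pop_size)]                              # Step 3
    scores = [fitness(fly) for fly in flies]                        # Step 4
    best_score = max(scores)                                        # Step 5
    best = list(flies[scores.index(best_score)])
    for _ in range(max_iter):                                       # Step 8
        for _ in range(pop_size):
            # Step 6: random flight around the swarm, then the Levy move.
            base = [x + rng.uniform(-1.0, 1.0) for x in best]
            moved = clamp([x + x * levy_step(rng) for x in base], lower, upper)
            score = fitness(moved)
            if score > best_score:                                  # Step 7
                best_score, best = score, moved
    return best, best_score                                         # Step 9


# Search over exponents: position = (log2 C, log2 gamma), an assumption.
LOWER, UPPER = [-5.0, -15.0], [15.0, 1.0]


def toy_fitness(exponents):
    """Hypothetical smooth surrogate with its peak at C = 2^3, gamma = 2^-4."""
    p, q = exponents
    return -((p - 3.0) ** 2 + (q + 4.0) ** 2)


best, best_score = lfoa_maximize(toy_fitness, LOWER, UPPER, max_iter=50)
C, gamma = 2.0 ** best[0], 2.0 ** best[1]
```

Replacing `toy_fitness` with a cross-validated SVM accuracy would turn this loop into the inner parameter optimization described above; the global optimum is only overwritten by a strictly better candidate, so the returned fitness can never fall below the best initial fly.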

Results and discussion

Data description

The data were collected at Wenzhou People’s Hospital from 2004 to 2015. A total of 470 cases were selected as research subjects, comprising 232 benign cases and 238 malignant cases. Based on prior medical knowledge of the classification and grading of breast pathology, we proposed a set of feature descriptors with the help of two experienced pathologists from Wenzhou People’s Hospital, China. A total of 14 key features were included and quantified in this study. Table 1 gives a brief description and the quantization of these features.

Table 1 The brief descriptions and quantization of features used in this study


Experimental setup

The LFOA-SVM, FOA-SVM, PSO-SVM, GA-SVM, RF, BPNN and ELM classification models were implemented on the MATLAB platform. For SVM, the LIBSVM implementation, originally developed by Chang and Lin [47], was used. For RF, the code package from https://code.google.com/archive/p/randomforest-matlab/ was adopted. We implemented LFOA, FOA, GA and PSO from scratch. The computational analysis was conducted on a Windows Server 2008 operating system with an Intel Xeon CPU E5-2650 v3 (2.30 GHz) and 16 GB of RAM.

To ensure a fair comparison, the same number of generations and the same swarm size were used for FOA, PSO, and GA. According to preliminary experiments, the methods involved produce satisfactory classification performance when the number of generations and the swarm size are set to 250 and 8, respectively. For the metaheuristic methods, the same search ranges $ C\in \left[{2}^{-5},{2}^{15}\right] $ and $ \gamma \in \left[{2}^{-15},2\right] $ were used. The parameter settings for the relevant algorithms are shown in Table 2.

Table 2 The parameter settings for the relevant methods


The k-fold CV [48] was used to evaluate the classification performance of the model. A nested stratified 10-fold CV was used in this study [49]. To evaluate the proposed method, commonly used evaluation criteria, namely classification accuracy (ACC), sensitivity, specificity and the Matthews correlation coefficient (MCC), were analyzed.
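A stratified split keeps the class ratio roughly constant in every fold. The sketch below shows one simple way to build such folds for the 232 benign and 238 malignant cases; the round-robin dealing, the function name and the seed are assumptions, and in a nested scheme the same function could be applied again with k = 5 to each outer training set.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class sizes taken from the data description; the ordering is arbitrary.
labels = ["benign"] * 232 + ["malignant"] * 238
folds = stratified_folds(labels, k=10)
fold_sizes = [len(fold) for fold in folds]
```

Each fold then holds 23 or 24 cases of each class, so every outer test set reflects the overall benign-to-malignant ratio.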

Benchmark function verification

To verify the performance of the proposed LFOA, we use a common set of 23 benchmark functions, comprising unimodal, multimodal, and fixed-dimension multimodal functions. The formulas and brief descriptions of these functions are given in Tables 3, 4 and 5.

Table 3 Unimodal benchmark functions


Table 4 Multimodal benchmark functions


Table 5 Fixed-dimension multimodal benchmark functions


Moreover, the performance of LFOA is compared with that of the original FOA, MFO, BA, DA, FPA, PSO, and SCA. The parameter settings of these algorithms follow the previous papers, and the specific values are listed in Table 2. To obtain more reliable experimental results, 30 independent runs are performed on each test function, and the average value is taken as the final result of each algorithm. The number of iterations and the population size are set to 500 and 30, respectively. The results are reported in Table 6 and Fig. 2. Table 6 displays the average (Avg.), standard deviation (Std.) and ranking of the different algorithms on the f₁-f₂₃ test functions.
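The evaluation protocol (30 independent runs, then Avg. and Std.) can be sketched as below. Everything specific here is an assumption: the sphere function on [-100, 100] in 30 dimensions stands in for a unimodal benchmark, plain random search stands in for the optimizers, the budget of 500 evaluations per run is chosen only to keep the example fast, and the sample standard deviation is used because the text does not say which estimator was applied.

```python
import random
import statistics


def sphere(x):
    """Sum of squares; a classic unimodal test function (assumed)."""
    return sum(v * v for v in x)


def random_search(objective, dim, lower, upper, evaluations, rng):
    """Placeholder optimizer: best value among uniformly random points."""
    best = float("inf")
    for _ in range(evaluations):
        candidate = [rng.uniform(lower, upper) for _ in range(dim)]
        best = min(best, objective(candidate))
    return best


def summarize_runs(optimizer, n_runs=30, seed=0):
    """Run the optimizer n_runs times with different seeds; return Avg., Std."""
    results = [optimizer(random.Random(seed + run)) for run in range(n_runs)]
    return statistics.mean(results), statistics.stdev(results)


avg, std = summarize_runs(
    lambda rng: random_search(sphere, dim=30, lower=-100.0, upper=100.0,
                              evaluations=500, rng=rng))
```

Using a separate seed per run keeps the runs independent while making the whole experiment repeatable.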

Table 6 Results of testing benchmark functions


As shown in Table 6, on the seven unimodal functions, LFOA achieves better results than the original FOA and the other six algorithms on f₁-f₆; the only exception is f₇, on which the original FOA performs well for the 30-dimension problem. On the six multimodal functions, LFOA surpasses the other competitors on f₉-f₁₃. On f₈, although LFOA could not find much better solutions, it is still very competitive compared with the original FOA. On the ten fixed-dimension multimodal functions, LFOA attains the exact optimal solution for the 30-dimension problem f₁₅. On the other nine functions (f₁₄ and f₁₆-f₂₃), although LFOA is not better than the other methods on some problems, its optimization performance is still improved compared with the original FOA. Moreover, based on the rankings, LFOA is the best overall technique, followed by FOA, FPA, BA, SCA, MFO, DA and PSO, in that order.

The convergence trends of LFOA and the other methods on selected test functions (f₁, f₂, f₃, f₄, f₁₀, f₁₁, f₁₂ and f₁₃) are depicted in Figs. 2 and 3. On f₁, it can be clearly seen that LFOA takes the lead in the initial stage and escapes from the local optimal solution, in contrast to the other seven algorithms. On f₂, LFOA shows fast convergence and finally achieves the best solution. On f₃, LFOA has the fastest convergence speed initially. f₄ shows the same convergence behavior as f₁. On f₁₀ and f₁₁, LFOA shows a faster convergence rate in the early stages, whereas the other algorithms are all trapped in local optima due to their weaker search capability. On f₁₂ and f₁₃, both the original FOA and LFOA converge very fast in the early stage, but FOA fails to escape from the local optimal solution in the later stage. From Figs. 2 and 3, we can conclude that the proposed algorithm not only has prominent advantages over the other algorithms but also converges very fast on most problems.

In summary, from Table 6 and Figs. 2 and 3, it can be seen that the improved LFOA has outstanding search advantages and faster optimization convergence than other counterparts.

Results on the breast cancer diagnosis

In this section, the performance of the proposed model in the diagnosis of breast cancer has been thoroughly tested and analyzed. Table 7 shows the detailed results obtained by the LFOA-SVM model in the experiment. On average, the model achieves a classification accuracy of 93.83%, sensitivity of 91.22%, specificity of 96.53% and MCC of 0.8799.

Table 7 Classification performance of LFOA-SVM


The proposed model and six other machine learning models, namely FOA-SVM, GA-SVM, PSO-SVM, RF, BP and ELM, were tested on the breast cancer dataset, and the results are shown in Fig. 4. The figure reveals that the LFOA-SVM model is better than the FOA-SVM model on all four evaluation metrics; for example, the ACC of LFOA-SVM is not only higher but also has a much smaller standard deviation. On the ACC metric, the LFOA-SVM model obtains the best result. The FOA-SVM and PSO-SVM models follow very closely, followed by RF, GA-SVM and ELM; the BP model has the worst result. On the sensitivity metric, the PSO-SVM model obtains the best result and LFOA-SVM takes second place, followed by RF, BP, FOA-SVM and GA-SVM; ELM is the worst. On the specificity metric, LFOA-SVM obtains the best result and ELM takes second place. FOA-SVM and PSO-SVM are very close behind ELM, followed by GA-SVM and RF, whose results are very similar; BP is the worst. On the MCC metric, LFOA-SVM again obtains the best result, with PSO-SVM next, followed by FOA-SVM, RF, GA-SVM and ELM; BP is the worst.

For comparison purposes, we have also recorded the confusion matrices for LFOA-SVM and FOA-SVM. As shown in Table 8, LFOA-SVM correctly identifies 216 malignant tumors and 225 benign tumors, and misjudges 22 malignant tumors as benign and 7 benign tumors as malignant. FOA-SVM correctly identifies 215 malignant tumors and 220 benign tumors, and misjudges 23 malignant tumors as benign and 12 benign tumors as malignant. The results indicate that LFOA-SVM is superior to FOA-SVM in recognizing both malignant and benign tumors.

Table 8 Confusion matrix obtained by the proposed LFOA-SVM and FOA-SVM
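The four evaluation criteria can be recomputed from the confusion-matrix counts quoted in the text. The sketch assumes that malignant is the positive class and pools all 470 cases into a single matrix; the function name is illustrative.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (ACC, sensitivity, specificity, MCC) from confusion counts."""
    acc = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # share of malignant cases detected
    specificity = tn / (tn + fp)   # share of benign cases recognized
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return acc, sensitivity, specificity, mcc


# Counts quoted in the text, with malignant as the positive class.
lfoa = binary_metrics(tp=216, fn=22, tn=225, fp=7)
foa = binary_metrics(tp=215, fn=23, tn=220, fp=12)
```

The pooled accuracy of LFOA-SVM is 441/470, which matches the reported 93.83%. The pooled sensitivity, specificity and MCC (about 90.8%, 97.0% and 0.878) differ slightly from the reported 91.22%, 96.53% and 0.8799, as would be expected if the reported values are averages over the ten folds rather than values computed from the pooled matrix.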


To evaluate the models comprehensively, the convergence curves of the models based on meta-heuristic algorithms during training are also compared and analyzed. The convergence curves of the four models are presented in Fig. 5. As shown, the LFOA-SVM model not only converges very fast but also achieves the highest classification accuracy, whereas the FOA-SVM model converges slowly. The main reason is that the LF mechanism improves the global search ability of FOA. Inspecting the curves in Fig. 5, the FOA-SVM model needs more iterations to converge, and the solution obtained is not better than that of the LFOA-SVM model. The GA-SVM model converges after a few iterations, which reveals that GA has a weak global search capability: it takes a long time to jump out of the local optimum, and the final result is not satisfactory.

Discussions

In this study, a new support vector machine model (LFOA-SVM) based on an LF-strategy-enhanced FOA is proposed to diagnose breast cancer. The main novelty is that the improved FOA strategy (LFOA) is proposed for the first time and is applied to predicting breast cancer from the perspective of high-level features. Compared with the original FOA and other optimizers, LFOA achieves better solutions and converges faster. LFOA helps SVM obtain more suitable parameters for learning and thus higher prediction performance for breast cancer diagnosis. The experimental results demonstrate that the LFOA-SVM model achieves better performance than its competitive counterparts.

The main contributions of this study are as follows:

a) First, in order to fully explore the potential of the SVM classifier, we introduce a Levy flight strategy-enhanced FOA to adaptively determine the two key parameters of SVM, which helps the SVM classifier achieve the maximum classification performance more efficiently.
b) The resulting model, LFOA-SVM, is applied as a computer-aided decision-making tool for diagnosing breast cancer from high-level features for the first time.
c) The proposed LFOA-SVM method achieves superior, more stable and more robust results compared with the other SVM models.

Conclusions

This paper has developed an effective LFOA-SVM method that can diagnose breast cancer well in clinical practice and provide doctors with meaningful support for clinical decisions. The proposed method achieved a classification accuracy of 93.83%, sensitivity of 91.22%, specificity of 96.53% and MCC of 0.8799 for breast cancer diagnosis based on the high-level features.

Improving the LFOA method by introducing mechanisms such as a mutation strategy or an opposition-based learning strategy is our future research direction. In addition, we plan to apply the method to other related disease diagnosis problems.

Abbreviations

ACC: Classification accuracy
Avg: Average
BA: Bat algorithm
BPNN: Back propagation neural network
CV: Cross validation
DA: Dragonfly algorithm
ELM: Extreme learning machine
FOA: Fruit fly optimization algorithm
FOA-SVM: SVM model based on original FOA
FPA: Flower pollination algorithm
LF: Levy flight
LFOA: FOA enhanced by LF strategy
LFOA-SVM: SVM model based on LFOA
MCC: Matthews correlation coefficient
MFO: Moth-flame optimization
PSO-SVM: SVM model based on original PSO
RF: Random forest
SCA: Sine cosine algorithm
Std: Standard deviation
SVM: Support vector machine

References

Msph LAT, Bray F, Siegel RL, Jacques Ferlay ME, Lortet-Tieulent J, PhD AJD. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):69–90.
Google Scholar
Li Q, Chen H, Huang H, Zhao X, Cai Z, Tong C, Liu W, Tian X. An enhanced Grey wolf optimization based feature selection wrapped kernel extreme learning machine for medical diagnosis. Comput Math Methods Med. 2017;2017:9512741.
PubMed PubMed Central Google Scholar
Ma C, Ouyang J, Chen HL, Zhao XH. An efficient diagnosis system for Parkinson's disease using kernel-based extreme learning machine with subtractive clustering features weighting approach. Comput Math Methods Med. 2014;2014(3):985789.
PubMed PubMed Central Google Scholar
Chen H-L, Yang B, Liu J, Liu D-Y. A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis. Expert Syst Appl. 2011;38(7):9014–22.
Article Google Scholar
Wang M, Chen H, Yang B, Zhao X, Hu L, Cai Z, Huang H, Tong C. Toward an optimal kernel extreme learning machine using a chaotic moth-flame optimization strategy with applications in medical diagnoses. Neurocomputing. 2017;267(Supplement C):69–84.
Article Google Scholar
Zhao X, Zhang X, Cai Z, Tian X, Wang X, Huang Y, Chen H, Hu L. Chaos enhanced grey wolf optimization wrapped ELM for diagnosis of paraquat-poisoned patients. Comput Biol Chem. 2018. https://doi.org/10.1016/j.compbiolchem.2018.11.017.
Article CAS PubMed Google Scholar
Zhu J, Zhao X, Li H, Chen H, Wu G. An effective machine learning approach for identifying the glyphosate poisoning status in rats using blood routine test. IEEE Access. 2018;6:15653–62.
Article Google Scholar
Zhu J, Zhu F, Huang S, Chen H, Zhao X, Zhang S. A new evolutionary machine learning approach to identify the pyrene induced rat hepatotoxicity and renal dysfunction. IEEE Access. 2018. https://doi.org/10.1109/ACCESS.2018.2889151.
Article Google Scholar
Xu J, Zhang X, Chen H, Li J, Zhang J, Shao L, Wang G. Automatic analysis of microaneurysms turnover to diagnose the progression of diabetic retinopathy. IEEE Access. 2018;6:9632–42.
Article Google Scholar
Wang X, Wang Z, Weng J, Wen C, Chen H, Wang X. A new effective machine learning framework for Sepsis diagnosis. IEEE Access. 2018;6:48300–10.
Article Google Scholar
Cai Z, Gu J, Wen C, Zhao D, Huang C, Huang H, Tong C, Li J, Chen H. An intelligent Parkinsons’ disease diagnostic system based on a chaotic bacterial foraging optimization enhanced fuzzy KNN approach. Comput Math Methods Med. 2018;2018:24.
Article Google Scholar
Maglogiannis I, Zafiropoulos E, Anagnostopoulos I. An intelligent system for automated breast cancer diagnosis and prognosis using SVM based classifiers. Appl Intell. 2009;30(1):24–36.
Article Google Scholar
Kaya Y. A new intelligent classifier for breast cancer diagnosis based on rough set and extreme learning machine: RS+ELM. Turk J Electr Eng Comput Sci. 2014;21(Sup.1):2079–91.
Google Scholar
Akay MF. Support vector machines combined with feature selection for breast cancer diagnosis. Expert Syst Appl. 2009;36(2):3240–7.
Article Google Scholar
Gurcan MN, Boucheron LE, Can A, Madabhushi A, Rajpoot NM, Yener B. Histopathological image analysis: a review. IEEE Rev Biomed Eng. 2009;2:147–71.
Article PubMed PubMed Central Google Scholar
Kuse M, Sharma T, Gupta S. A classification scheme for lymphocyte segmentation in H&E stained histology images. Berlin Heidelberg: Springer; 2010.
Book Google Scholar
Dundar MM, Badve S, Bilgin G, Raykar V, Jain R, Sertel O, Gurcan MN. Computerized classification of Intraductal breast lesions using histopathological images. IEEE Trans Biomed Eng. 2011;58(7):1977–84.
Article PubMed PubMed Central Google Scholar
Sparks R, Madabhushi A. Content-based image retrieval utilizing explicit shape descriptors: applications to breast MRI and prostate histopathology. Proc SPIE. 2011;7962(8):765–8.
Google Scholar
Basavanhally A, Ganesan S, Shih N, Mies C, Feldman M, Tomaszewski J, Madabhushi A. A boosted classifier for integrating multiple fields of view: breast cancer grading in histopathology. In: IEEE International Symposium on Biomedical Imaging: From Nano To Macro; 2011. p. 125–8.
Chapter Google Scholar
Guo T, Han L, He L, Yang X. A GA-based feature selection and parameter optimization for linear support higher-order tensor machine. Neurocomputing. 2014;144:408–16.
Article Google Scholar
Urraca R, Sodupe-Ortega E, Antonanzas J, Antonanzas-Torres F, Martinez-de-Pison FJ. Evaluation of a novel GA-based methodology for model structure selection: the GA-PARSIMONY. Neurocomputing. 2018;271:9–17.
Article Google Scholar
Min SH, Lee J, Han I. Hybrid genetic algorithms and support vector machines for bankruptcy prediction. Expert Syst Appl. 2006;31(3):652–60.
Article Google Scholar
Huang CL, Wang CJ. A GA-based feature selection and parameters optimizationfor support vector machines. Expert Syst Appl. 2006;31(2):231–40.
Article Google Scholar
Hu L, Lin F, Li H, Tong C, Pan Z, Li J, Chen H. An intelligent prognostic system for analyzing patients with paraquat poisoning using arterial blood gas indexes. J Pharmacol Toxicol Methods. 2017;84:78–85.
Article CAS PubMed Google Scholar
Chen HL, Yang B, Wang SJ, Wang G, Li HZ, Liu WB. Towards an optimal support vector machine classifier using a parallel particle swarm optimization strategy. Appl Math Comput. 2014;239:180–97.
Chen HL, Yang B, Wang G, Liu J, Chen YD, Liu DY. A three-stage expert system based on support vector machines for thyroid disease diagnosis. J Med Syst. 2012;36(3):1953–63.
Deng W, Yao R, Zhao H, Yang X, Li G. A novel intelligent diagnosis method using optimal LS-SVM with improved PSO algorithm. Soft Comput. 2017. https://doi.org/10.1007/s00500-017-2940-9.
Shen L, Chen H, Yu Z, Kang W, Zhang B, Li H, Yang B, Liu D. Evolving support vector machines using fruit fly optimization for medical data classification. Knowl-Based Syst. 2016;96:61–75.
Li C, Hou L, Sharma B, Li H, Chen C, Li Y, Zhao X, Huang H, Cai Z, Chen H. Developing a new intelligent system for the diagnosis of tuberculous pleural effusion. Comput Methods Prog Biomed. 2018;153:211–25.
Pan WT. A new fruit fly optimization algorithm: taking the financial distress model as an example. Knowl-Based Syst. 2012;26(2):69–74.
Li H, Guo S, Zhao H, Su C, Wang B. Annual electric load forecasting by a least squares support vector machine with a fruit fly optimization algorithm. Energies. 2012;5(11):4430–45.
Wang L, Zheng XL, Wang SY. A novel binary fruit fly optimization algorithm for solving the multidimensional knapsack problem. Knowl-Based Syst. 2013;48(2):17–23.
Pan QK, Sang HY, Duan JH, Gao L. An improved fruit fly optimization algorithm for continuous function optimization problems. Knowl-Based Syst. 2014;62(5):69–83.
Deng W, Zhao H, Zou L, Li G, Yang X, Wu D. A novel collaborative optimization algorithm in solving complex optimization problems. Soft Comput. 2017;21(15):4387–98.
Deng W, Zhao H, Yang X, Xiong J, Sun M, Li B. Study on an improved adaptive PSO algorithm for solving multi-objective gate assignment. Appl Soft Comput J. 2017;59:288–302.
Ali MZ, Awad NH, Reynolds RG, Suganthan PN. A balanced fuzzy cultural algorithm with a modified levy flight search for real parameter optimization. Inf Sci. 2018;447:12–35.
Guerrero M, Castillo O, García M. Cuckoo search via lévy flights and a comparison with genetic algorithms. In: Studies in computational intelligence, vol. 574; 2015. p. 91–103.
Heidari AA, Pahlavani P. An efficient modified grey wolf optimizer with Lévy flight for optimization tasks. Appl Soft Comput J. 2017;60:115–34.
Jensi R, Jiji GW. An enhanced particle swarm optimization with levy flight for global optimization. Appl Soft Comput J. 2016;43:248–61.
Li R, Wang Y. Improved particle swarm optimization based on Lévy flights. Xitong Fangzhen Xuebao / J Syst Simul. 2017;29(8):1685–1691 and 1701.
Luo J, Chen H, Zhang Q, Xu Y, Huang H, Zhao X. An improved grasshopper optimization algorithm with application to financial stress prediction. Appl Math Model. 2018;64:654–68.
Pavlyukevich I. Lévy flights, non-local search and simulated annealing. J Comput Phys. 2007;226(2):1830–44.
Sharma H, Bansal JC, Arya KV, Yang XS. Lévy flight artificial bee colony algorithm. Int J Syst Sci. 2016;47(11):2652–70.
Tang D, Yang J, Dong S, Liu Z. A lévy flight-based shuffled frog-leaping algorithm and its applications for continuous optimization problems. Appl Soft Comput J. 2016;49:641–62.
Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.
Yang XS, Deb S. Cuckoo search via Lévy flights. In: 2009 world congress on nature and biologically inspired computing, NABIC 2009 - proceedings; 2009. p. 210–4.
Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol (TIST). 2011;2(3):27.
Salzberg SL. On comparing classifiers: pitfalls to avoid and a recommended approach. Data Min Knowl Disc. 1997;1(3):317–28.
Statnikov A, Tsamardinos I, Dosbayev Y, Aliferis CF. GEMS: a system for automated cancer diagnosis and biomarker discovery from microarray gene expression data. Int J Med Inform. 2005;74(7–8):491–503.

Acknowledgements

We would like to thank the anonymous reviewers, whose suggestions helped improve our paper.

Funding

This research was supported by the National Natural Science Foundation of China (NSFC) (61702376), the Medical and Health Technology Projects of Zhejiang Province (2019315504), the Zhejiang Provincial Natural Science Foundation of China (LY17F020012, LY15F020033), and the Wenzhou Special Science and Technology Project (ZG2017019, Y20170043).

Availability of data and materials

The data used to support the findings of this study are available from the corresponding author upon request.

About this supplement

This article has been published as part of BMC Bioinformatics Volume 20 Supplement 8, 2019: Decipher computational analytics in digital health and precision medicine. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-20-supplement-8.

Author information

Authors and Affiliations

School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, 710072, China
Hui Huang & Xi’an Feng
Pathology Department of Wenzhou People’s Hospital, Wenzhou, 325035, China
Suying Zhou
Zhijiang College of Zhejiang University of Technology, Hangzhou, 310024, China
Jionghui Jiang
Department of Computer Science, Wenzhou University, Wenzhou, 325035, China
Huiling Chen
Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
Yuping Li & Chengye Li

Contributions

HH, HC, and CL conceived and designed the experiments. HH and XF performed the experiments. HC, SZ, YL, and HH analyzed the data. HH, HC, YL, and CL contributed reagents, materials, and/or analysis tools. HH, HC, JJ, and CL wrote the paper. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Huiling Chen or Chengye Li.

Ethics declarations

Ethics approval and consent to participate

The use of the human data involved in this paper was approved by the ethics committee of Wenzhou People’s Hospital.

Consent for publication

The patients and doctors agreed to the use of the data in this paper, and the data have not been published elsewhere. All authors checked the paper and agreed to its publication.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Huang, H., Feng, X., Zhou, S. et al. A new fruit fly optimization algorithm enhanced support vector machine for diagnosis of breast cancer based on high-level features. BMC Bioinformatics 20 (Suppl 8), 290 (2019). https://doi.org/10.1186/s12859-019-2771-z

Published: 10 June 2019
DOI: https://doi.org/10.1186/s12859-019-2771-z
