From: Optimizing diabetes classification with a machine learning-based framework
Authors | Preprocessing techniques | Models | Accuracy (%)
---|---|---|---
Saxena et al. [6] | Feature selection, outlier rejection, missing value imputation | K-nearest neighbor, random forest | 79.80
Krishnamoorthi et al. [7] | Missing value processing, outlier removal, normalization | Logistic regression | 83.00 |
Butt et al. [8] | Various classifiers and models | Random forest, multilayer perceptron, LSTM | 86.00
Garcia-Ordas et al. [13] | Variational autoencoder, sparse autoencoder | Convolutional neural network, sparse autoencoder | 92.31
Bukhari et al. [15] | No data preprocessing | Artificial back-propagation scaled conjugate gradient neural network (ABP-SCGNN) | 93.00
Gnanadass [18] | Missing data filling (mean) | Naive Bayes, linear regression, random forest, AdaBoost, gradient boosting machine, extreme gradient boosting | 78.00
Maniruzzaman et al. [10] | Missing data and outlier handling, feature extraction and optimization | Ten different classifiers | 92.26
Zou et al. [9] | Dimensionality reduction (PCA, mRMR) | Decision trees, random forests, neural networks | 80.84 |
Hayashi and Yukita [19] | Rule extraction algorithm, sampling selection technique | J48 graft, rule extraction | 83.83 |
Alneamy et al. [20] | TLBO algorithm, hybrid fuzzy wavelet neural network | Functional fuzzy wavelet neural network (FFWNN) | 88.67 |
Maniruzzaman et al. [11] | Gaussian Process-based classification, three kernel functions | Gaussian process, LDA, QDA, NB | 81.97 |
Joshi and Dhakal [12] | Logistic regression, decision tree | Logistic regression, decision tree | 78.26 |
Ejiyi et al. [22] | Data augmentation, attribute analysis, missing data imputation | XGBoost, AdaBoost | 94.67
Rahman et al. [16] | Convolutional long short-term memory | Conv-LSTM, CNN, T-LSTM, CNN-LSTM | 91.38 |
Rehman et al. [17] | Missing value handling, moving average normalization | Deep extreme learning machine (DELM) | 92.80
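Most of the surveyed pipelines share the same skeleton: impute missing values, reject outliers, normalize, then fit a standard classifier. The sketch below illustrates that skeleton with scikit-learn on synthetic data; the dataset, the z-score threshold, and the random forest choice are illustrative assumptions, not any one paper's exact setup.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for a Pima-style dataset: 8 features, binary label.
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)
X[rng.random(X.shape) < 0.05] = np.nan  # inject missing values

# Outlier rejection: drop rows whose largest |z-score| exceeds 3 (NaNs ignored).
z = np.abs((X - np.nanmean(X, axis=0)) / np.nanstd(X, axis=0))
keep = np.nanmax(z, axis=1) < 3.0
X, y = X[keep], y[keep]

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # missing value padding
    ("scale", StandardScaler()),                 # normalization
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
pipe.fit(X_tr, y_tr)
print(f"test accuracy: {pipe.score(X_te, y_te):.2f}")
```

Swapping the final `clf` step for logistic regression, XGBoost, or a neural network reproduces the main axis of variation across the table; the preprocessing stages change far less between studies.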