
Table 5 Prediction accuracy in the UCI machine learning benchmark data

From: Random generalized linear model: a highly accurate and interpretable ensemble predictor

| Data set | RGLM | RGLM.inter2 | RF | RFbigmtry | Rpart | LDA | DLDA | KNN | SVM | SC |
|---|---|---|---|---|---|---|---|---|---|---|
| BreastCancer | 0.964 | 0.959 | 0.969 | 0.961 | 0.941 | 0.957 | 0.959 | 0.966 | 0.967 | 0.956 |
| HouseVotes84 | 0.961 | 0.963 | 0.958 | 0.954 | 0.954 | 0.951 | 0.914 | 0.924 | 0.958 | 0.938 |
| Ionosphere | 0.883 | 0.946 | 0.932 | 0.917 | 0.875 | 0.863 | 0.809 | 0.849 | 0.940 | 0.829 |
| diabetes | 0.768 | 0.759 | 0.759 | 0.754 | 0.741 | 0.768 | 0.732 | 0.740 | 0.757 | 0.743 |
| Sonar | 0.769 | 0.837 | 0.817 | 0.788 | 0.707 | 0.726 | 0.697 | 0.812 | 0.822 | 0.726 |
| ringnorm | 0.577 | 0.973 | 0.940 | 0.910 | 0.770 | 0.567 | 0.570 | 0.590 | 0.977 | 0.535 |
| threenorm | 0.803 | 0.827 | 0.807 | 0.777 | 0.653 | 0.817 | 0.825 | 0.815 | 0.853 | 0.817 |
| twonorm | 0.937 | 0.953 | 0.947 | 0.920 | 0.733 | 0.957 | 0.960 | 0.947 | 0.953 | 0.960 |
| Glass | 0.636 | 0.743 | 0.827 | 0.799 | 0.729 | 0.659 | 0.531 | 0.808 | 0.748 | 0.645 |
| Satellite | 0.986 | 0.987 | 0.988 | 0.985 | 0.961 | 0.985 | 0.734 | 0.990 | 0.988 | 0.803 |
| Vehicle | 0.965 | 0.986 | 0.986 | 0.973 | 0.944 | 0.967 | 0.729 | 0.909 | 0.974 | 0.752 |
| Vowel | 0.936 | 0.986 | 0.983 | 0.976 | 0.950 | 0.938 | 0.853 | 0.999 | 0.991 | 0.909 |
| MeanAccuracy | 0.849 | 0.910 | 0.909 | 0.893 | 0.830 | 0.846 | 0.776 | 0.862 | 0.911 | 0.801 |
| Rank | 6 | 2 | 2 | 4 | 8 | 7 | 10 | 5 | 2 | 9 |
| Pvalue | 0.0093 | NA | 0.26 | 0.042 | 0.00049 | 0.0093 | 0.0067 | 0.11 | 0.96 | 0.0015 |
  1. For each data set, the prediction accuracy was estimated using 3-fold cross-validation across 100 random partitions of the data into 3 folds. RGLM.inter2 incorporates pairwise interaction between features into the RGLM predictor. Mean accuracies and the resulting ranks are summarized at the bottom. The Wilcoxon signed rank test was used to test whether accuracy differences between RGLM.inter2 and other predictors are significant. RGLM.inter2, RF, and SVM tie for first place (resulting in a rank of 2 for each method).
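
The summary rows can be checked against the table entries themselves. The following is a minimal sketch, not the authors' code: it assumes the Wilcoxon signed rank test was applied to the 12 per-data-set accuracies of RGLM.inter2 and RGLM, and it uses only the rounded values printed above. The function names are hypothetical, and the exact-test helper deliberately refuses inputs with zero or tied differences, which this particular pair does not have. Under these assumptions the two-sided exact p-value is 38/4096, about 0.0093, in line with the Pvalue entry for RGLM.

```python
# Accuracies copied from the table, in row order (BreastCancer ... Vowel).
RGLM_INTER2 = [0.959, 0.963, 0.946, 0.759, 0.837, 0.973,
               0.827, 0.953, 0.743, 0.987, 0.986, 0.986]
RGLM = [0.964, 0.961, 0.883, 0.768, 0.769, 0.577,
        0.803, 0.937, 0.636, 0.986, 0.965, 0.936]


def mean_accuracy(values):
    """Arithmetic mean over data sets (the MeanAccuracy row)."""
    return sum(values) / len(values)


def exact_wilcoxon_signed_rank(x, y):
    """Return (W, two-sided exact p-value) for paired samples x and y.

    Hypothetical helper for illustration: it only handles the simple case
    with no zero differences and no tied absolute differences.
    """
    # Round to remove floating-point noise from subtracting 3-decimal values.
    diffs = [round(a - b, 6) for a, b in zip(x, y)]
    abs_sorted = sorted(abs(d) for d in diffs)
    if 0 in abs_sorted or len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("zero or tied differences are not handled in this sketch")

    n = len(diffs)
    rank = {value: i + 1 for i, value in enumerate(abs_sorted)}
    w_minus = sum(rank[abs(d)] for d in diffs if d < 0)
    w_plus = n * (n + 1) // 2 - w_minus
    w = min(w_plus, w_minus)

    # counts[s] = number of subsets of the ranks {1..n} whose sum is s.
    # Under the null hypothesis every sign pattern is equally likely.
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]

    tail = sum(counts[: w + 1])
    p_value = min(1.0, 2 * tail / 2 ** n)
    return w, p_value


if __name__ == "__main__":
    print("mean RGLM.inter2:", mean_accuracy(RGLM_INTER2))
    print("mean RGLM:", mean_accuracy(RGLM))
    print("W, p:", exact_wilcoxon_signed_rank(RGLM_INTER2, RGLM))
```

Because the table reports accuracies rounded to three decimals, other columns may contain tied differences (LDA does, for example) and would need a tie-aware implementation; agreement with the published p-values is not guaranteed for them.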