
Table 1 Selective summary of variable selection methods with types of regularizers, main regularization parameters and computational efficiency. Here we focus on the main regularization parameters of the different methods; note that there are often several additional hyper-parameters.

From: Randomized boosting with multivariable base-learners for high-dimensional variable selection and prediction

| Method | Regularizer (parameters) | Comments on computational efficiency |
|---|---|---|
| *Explicit regularization* | | |
| Information criteria (4), e.g. AIC [24], BIC [25], EBIC [26] | \(\ell_0\)-penalty (\(\lambda\)) | Best subset selection is not efficient for high-dimensional problems; heuristic optimization [27,28,29] or mixed-integer optimization [30] can be used. |
| Lasso [1] | \(\ell_1\)-penalty (\(\lambda\)) | Computationally efficient convex relaxation of the \(\ell_0\)-type problem. |
| Relaxed lasso [2, 3] | \(\ell_1\)-penalty (\(\lambda, \gamma\)) | Combination of the \(\ell_1\)-regularized and the unregularized (restricted least squares) estimator. Computationally efficient, but tuning is more costly than for the lasso. |
| Elastic net [4] | \(\ell_1\)-/\(\ell_2\)-penalty (\(\lambda, \alpha\)) | Combination of \(\ell_1\)- and \(\ell_2\)-penalties. Computationally efficient, but tuning is more costly than for the lasso. |
| *Implicit regularization* | | |
| \(L_2\)Boosting [10] (Algorithm 1) | Early stopping (\(m_{\text{stop}}\)) | Tuning the stopping iteration \(m_{\text{stop}}\) via resampling leads to implicit regularization. |
| Twin boosting [31] | Early stopping (\(m_1, m_2\)) | Two-stage approach using \(L_2\)Boosting estimates as weights in a second stage of \(L_2\)Boosting. Tuning is more costly than for single-stage \(L_2\)Boosting. |
| Stability selection [18,19,20] | Flexible (PFER) | Computationally intensive ensemble approach, applying e.g. the lasso or \(L_2\)Boosting multiple times on subsamples. Provides control over false positives (PFER). |
| New subspace boosting: SubBoost (Algorithm 2), RSubBoost and AdaSubBoost (Algorithm 3) | Automatic stopping (\(\Phi\)) | Multivariable base-learners with double checking via the selection criterion \(\Phi\) for automatic stopping; randomized preselection of base-learners for scalability. For further hyper-parameters and their effects on computational efficiency, see Algorithms 2 and 3 and Additional file 1. |
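To illustrate the "implicit regularization" entries of the table, the following is a minimal sketch of componentwise \(L_2\)Boosting with simple least-squares base-learners and early stopping at \(m_{\text{stop}}\), in the spirit of Algorithm 1. It is not the authors' implementation: the function name `l2boost`, the step length `nu`, and the toy data are illustrative assumptions; in practice \(m_{\text{stop}}\) would be tuned via resampling rather than fixed.

```python
import numpy as np

def l2boost(X, y, m_stop=100, nu=0.1):
    """Componentwise L2Boosting sketch: in each of m_stop iterations,
    fit a simple least-squares base-learner for every covariate on the
    current residuals, keep the one with the smallest residual sum of
    squares, and update its coefficient by a shrunken step nu.
    Early stopping at m_stop acts as the implicit regularizer."""
    n, p = X.shape
    beta = np.zeros(p)
    intercept = y.mean()          # offset: centered response
    resid = y - intercept
    for _ in range(m_stop):
        # least-squares coefficient of each single covariate on the residuals
        coefs = X.T @ resid / (X ** 2).sum(axis=0)
        # residual sum of squares of each candidate base-learner
        rss = ((resid[:, None] - X * coefs) ** 2).sum(axis=0)
        j = int(rss.argmin())     # best-fitting covariate
        beta[j] += nu * coefs[j]
        resid -= nu * coefs[j] * X[:, j]
    return intercept, beta

# toy data (hypothetical): only the first two of ten covariates matter
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.standard_normal(200)
intercept, beta = l2boost(X, y, m_stop=100, nu=0.1)
selected = np.flatnonzero(beta)   # implicitly selected variables
```

Because only one coordinate is updated per iteration, covariates never chosen before iteration \(m_{\text{stop}}\) keep a zero coefficient, which is how early stopping performs variable selection.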