
Table 2 Summary of the modeling approaches included in the evaluation

From: A systematic evaluation of high-dimensional, ensemble-based regression for exploring large model spaces in microbiome analyses

| Model | Ensemble characteristics: tuning parameter | Ensemble characteristics: model space construction | Output | Paradigm | R package |
|---|---|---|---|---|---|
| ENC | \(\lambda_{\text{ENC}}\) | None | Influential variables | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| PS | \(\lambda_{\text{MB}}\) | Subsampling | Influential variables | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| LS | \(\lambda_{\text{ENC}}\) | Subsampling | Inclusion probabilities | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| SS | \(\Lambda\) | Subsampling | Inclusion probabilities | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| PR | \(\lambda_{\text{MB}}\) | Resampling | Influential variables | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| LR | \(\lambda_{\text{ENC}}\) | Resampling | Inclusion probabilities | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| SR | \(\Lambda\) | Resampling | Inclusion probabilities | Frequentist (\(l_{1}\), \(l_{2}\) penalties) | quadrupen, glmnet |
| BMA | \(\text{EMS} = 1\) | MCMC | Inclusion probabilities | Bayesian (spike & slab prior) | BoomSpikeSlab |
| BMAC | \(\text{EMS}_{\text{CV}}\) | MCMC | Inclusion probabilities | Bayesian (spike & slab prior) | BoomSpikeSlab |
  1. ENC: the baseline penalized regression model, an elastic net with the optimal λ (λ_ENC) derived from cross-validation (CV). Ensembles based on 100 subsamples: PS, Meinshausen & Bühlmann’s algorithm with a single optimal λ (λ_MB) selected to minimize the expected number of false positives; LS, a single optimal λ (λ_ENC) with no variable selection; SS, stability selection across the entire grid Λ of 100 λ values with no variable selection. Ensembles based on 100 resamples: PR, LR and SR are identical to PS, LS and SS, respectively, with the model space constructed through resampling. BMA: Bayesian model averaging with expected model size (EMS) = 1. BMAC: BMA with EMS determined by CV (EMS_CV).
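
As an illustration of how the inclusion probabilities reported for LS and LR could arise, the following is a minimal sketch in plain Python. It is not the authors' implementation, which relied on the R packages listed in the table. It fits an elastic net at a single fixed λ on each of 100 draws from the data and records how often each variable receives a nonzero coefficient. Several details are assumptions made only for this example: the function names `elastic_net` and `inclusion_probabilities`, the glmnet-style penalty parameterization with mixing weight `alpha`, subsamples of half the data drawn without replacement, "resampling" read as a bootstrap with replacement, and the toy data.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator produced by the l1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2).
    X and y are centred internally, so no intercept is returned.
    Runs a fixed number of sweeps (no convergence check) to stay short."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residual when all coefficients are 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # covariance of column j with the partial residual (j excluded)
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    return beta


def inclusion_probabilities(X, y, lam, n_draws=100, resample=False, seed=0):
    """Fraction of draws in which each variable has a nonzero coefficient.
    resample=False: half-size subsamples without replacement (assumed size).
    resample=True: bootstrap samples with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_draws):
        if resample:
            idx = [rng.randrange(n) for _ in range(n)]
        else:
            idx = rng.sample(range(n), n // 2)
        beta = elastic_net([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_draws for c in counts]


# Toy data (hypothetical): 40 samples, 5 variables, only variable 0 matters.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
y = [3.0 * row[0] + data_rng.gauss(0, 0.5) for row in X]

probs = inclusion_probabilities(X, y, lam=0.3)                      # LS-like
probs_boot = inclusion_probabilities(X, y, lam=0.3, resample=True)  # LR-like
print(probs)
print(probs_boot)
```

In this sketch no threshold is applied to the probabilities, which mirrors the "no variable selection" description of LS and LR. An SS- or SR-like variant would repeat the same loop over every λ in the grid rather than at a single value.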