Table 1 Synthetic response data, predictive errors from held-out test data

From: Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology

Method                    Simulation 1              Simulation 2              Simulation 3
                          MA           MAP          MA           MAP          MA           MAP
BVS: EB prior             0.819±0.004  0.850±0.004  0.837±0.004  0.889±0.005  0.899±0.002  0.918±0.002
BVS: flat prior           0.845±0.004  0.919±0.005  0.845±0.004  0.919±0.006  0.904±0.002  0.927±0.003
BVS: ‘incorrect’ prior    0.858±0.003  0.895±0.003  0.918±0.003  1.003±0.004  0.969±0.003  1.036±0.003
BVS: MRF prior            0.830±0.004  0.877±0.005  0.871±0.004  0.920±0.006  0.886±0.002  0.911±0.002
Lasso                     0.791±0.003               0.790±0.003               0.913±0.002
Li&Li                     1.246±0.009               1.476±0.012               1.760±0.012
Baseline linear           1.000±0.002               1.000±0.002               1.000±0.002
  1. Predictions using small-sample training data (n = 35) and held-out test data (n = 818; total of 5,000 train/test pairs) for Simulations 1, 2 and 3. Results shown are mean absolute predictive errors ± SEM for the following methods: Bayesian variable selection (BVS) with a biologically informative pathway-based prior whose source and strength parameters are set by empirical Bayes (EB), BVS with a flat prior, BVS with an ‘incorrect’ prior (contradicting empirical Bayes; see text for details), BVS with a Markov random field (MRF) prior, Lasso regression, the penalised-likelihood approach proposed by Li and Li [21], and a baseline linear regression without interaction terms that includes all 11 predictors. For BVS, predictions were made using the posterior predictive distribution with exact model averaging (‘MA’) and using the maximum a posteriori model (‘MAP’).
  2. linear model with interaction terms for Simulations 1 and 2, and without interaction terms for Simulation 3.
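
The "mean absolute predictive error ± SEM" entries in the table can be illustrated with a minimal Python sketch. It assumes that one mean absolute error is computed per train/test pair and that the SEM is the sample standard deviation of those per-pair errors divided by the square root of the number of pairs; the source does not spell out this detail. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_absolute_error(y_true, y_pred):
    """Mean absolute predictive error on one held-out test set."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(count))."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical errors for three train/test pairs (illustrative numbers only;
# the table reports results over 5,000 pairs).
pair_errors = [
    mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]),
    mean_absolute_error([0.0, 1.0, 2.0], [0.0, 2.0, 2.0]),
    mean_absolute_error([2.0, 2.0, 2.0], [1.0, 2.0, 3.0]),
]

mean, sem = mean_and_sem(pair_errors)
print(f"{mean:.3f}±{sem:.3f}")  # same "mean±SEM" layout as the table cells
```

With many pairs the SEM shrinks roughly as one over the square root of the number of pairs, which would be consistent with the small ± values reported in the table.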