
Table 1 Synthetic response data, predictive errors from held-out test data

From: Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology

 

| Method | Simulation 1, MA | Simulation 1, MAP | Simulation 2, MA | Simulation 2, MAP | Simulation 3, MA | Simulation 3, MAP |
|---|---|---|---|---|---|---|
| BVS: EB prior† | 0.819±0.004 | 0.850±0.004 | 0.837±0.004 | 0.889±0.005 | 0.899±0.002 | 0.918±0.002 |
| BVS: flat prior† | 0.845±0.004 | 0.919±0.005 | 0.845±0.004 | 0.919±0.006 | 0.904±0.002 | 0.927±0.003 |
| BVS: ‘incorrect’ prior† | 0.858±0.003 | 0.895±0.003 | 0.918±0.003 | 1.003±0.004 | 0.969±0.003 | 1.036±0.003 |
| BVS: MRF prior† | 0.830±0.004 | 0.877±0.005 | 0.871±0.004 | 0.920±0.006 | 0.886±0.002 | 0.911±0.002 |

| Method | Simulation 1 | Simulation 2 | Simulation 3 |
|---|---|---|---|
| Lasso† | 0.791±0.003 | 0.790±0.003 | 0.913±0.002 |
| Li&Li | 1.246±0.009 | 1.476±0.012 | 1.760±0.012 |
| Baseline linear | 1.000±0.002 | 1.000±0.002 | 1.000±0.002 |

  1. Predictions were made using small-sample training data (n = 35) and held-out test data (n = 818), with a total of 5,000 train/test pairs, for Simulations 1, 2 and 3. Results shown are mean absolute predictive errors ± SEM for the following methods: Bayesian variable selection (BVS) with a biologically informative pathway-based prior whose source and strength parameters are set by empirical Bayes (EB); BVS with a flat prior; BVS with an ‘incorrect’ prior (contradicting empirical Bayes; see text for details); BVS with a Markov random field (MRF) prior; Lasso regression; the penalised-likelihood approach proposed by Li and Li [21]; and a baseline linear regression without interaction terms that includes all 11 predictors. For BVS, predictions were made using the posterior predictive distribution with exact model averaging (‘MA’) and using the maximum a posteriori model (‘MAP’).
  2. † Linear model with interaction terms for Simulations 1 and 2, and without interaction terms for Simulation 3.
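
As a rough illustration of how a "mean absolute predictive error ± SEM" entry could be computed, the following Python sketch averages per-pair errors and reports their standard error. It assumes that the SEM is taken over the per-pair mean absolute errors, which the footnote does not state explicitly; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between observed and predicted responses."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_and_sem(values):
    """Mean of the values and the standard error of that mean.

    SEM = sample standard deviation (n - 1 denominator) / sqrt(n).
    """
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical toy data: three train/test pairs, each with a few
    # held-out responses and the corresponding predictions.
    pairs = [
        ([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]),
        ([0.0, 1.0, 4.0], [0.5, 1.5, 3.0]),
        ([2.0, 2.0, 2.0], [1.0, 2.5, 2.5]),
    ]
    per_pair_errors = [mean_absolute_error(y, y_hat) for y, y_hat in pairs]
    mean_err, sem = mean_and_sem(per_pair_errors)
    print(f"{mean_err:.3f}±{sem:.3f}")
```

In this sketch each train/test pair contributes one error value, so the SEM shrinks as the number of pairs grows; with 5,000 pairs, small SEM values of the kind reported in the table would be expected.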