
Table 1 Comparison of Differential Variability (DV) and t-test feature selection algorithms on DNAm data

From: Stochastic epigenetic outliers can define field defects in cancer

| Feature selection algorithm | CIN2+ risk (27 k) | CIN2+ (27 k) | CC (27 k) | NADJ (450 k) | BC (450 k) |
|---|---|---|---|---|---|
| Moderated t-tests | 0 | 2456 (10 %) | 13009 (50 %) | 0 | 345479 (71 %) |
| Bartlett-test (BT) | **1584 (7 %)** | 3475 (15 %) | 17846 (69 %) | **99913 (21 %)** | 400689 (82 %) |
| IEVORA | **1584 (7 %)** | 3475 (15 %) | 17846 (69 %) | **99913 (21 %)** | 400689 (82 %) |
| DiffVar | 0 | 202 (<1 %) | 8928 (35 %) | 2051 (<1 %) | 268027 (55 %) |
| J-DMDV | 0 | 1973 (8 %) | 11632 (45 %) | 0 | 416995 (86 %) |
| GAMLSS2 | 1045 (4 %) | 3263 (14 %) | 16626 (64 %) | 37106 (<1 %) | 434657 (89 %) |
  1. Rows list the feature selection algorithms; columns give the number of features each algorithm identified as associated with a given phenotype at FDR < 0.05. The phenotypes considered are prospective risk of CIN2+ (i.e. precursor CIN2+ lesions, n = 75), CIN2+ (cervical intraepithelial neoplasia of grade 2 or higher, n = 24), CC (cervical cancer, n = 48), normal breast tissue adjacent to a breast cancer (NADJ, n = 42), and breast cancer (BC, n = 305). For the cervical comparisons, the reference phenotype was the normal cervical samples profiled in each study (n = 77, 24 and 15, respectively). For the breast comparisons, the reference was 50 normal breast tissue samples from healthy women. Because Bartlett's test and IEVORA differ only in the ranking order of significant features, their values here are identical. Boldface indicates the algorithm(s) identifying the most DVCs in each of the two normal-to-normal comparisons.
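
To make concrete what a table entry counts, the following is a minimal sketch, not the authors' pipeline, of how the number of differentially variable features at FDR < 0.05 could be obtained for one comparison. It assumes a two-group Bartlett test per feature followed by Benjamini-Hochberg adjustment; the function names (`bartlett_two_group_p`, `benjamini_hochberg`, `count_dvcs`) and the per-feature list layout are hypothetical, and every feature is assumed to have non-zero variance in both groups.

```python
import math
import statistics


def bartlett_two_group_p(x, y):
    """P-value of Bartlett's test for equal variances in two groups.

    Assumes both groups have non-zero sample variance (log(0) is undefined).
    """
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / dof
    numerator = (dof * math.log(pooled)
                 - (n1 - 1) * math.log(v1)
                 - (n2 - 1) * math.log(v2))
    correction = 1 + (1 / (n1 - 1) + 1 / (n2 - 1) - 1 / dof) / 3
    stat = max(numerator / correction, 0.0)  # guard against rounding below 0
    # With two groups the statistic is chi-square with 1 degree of freedom,
    # whose upper tail probability is erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_dvcs(case_values, reference_values, fdr=0.05):
    """Count features whose adjusted Bartlett p-value falls below `fdr`.

    `case_values[k]` and `reference_values[k]` hold the methylation values
    of feature k in the case and reference samples (hypothetical layout).
    """
    pvals = [bartlett_two_group_p(c, r)
             for c, r in zip(case_values, reference_values)]
    return sum(q < fdr for q in benjamini_hochberg(pvals))
```

Under these assumptions, calling `count_dvcs` once per phenotype column would yield one count per comparison. Since the note above states that Bartlett's test and IEVORA differ only in how significant features are ranked, a count of this kind would not distinguish the two.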