Table 5. Relative importance of variables in random forest models, as measured by the Gini index (higher is more important).

From: A classification model for distinguishing copy number variants from cancer-related alterations

                                  CBS smoothed              GLAD    CBS unsmoothed
Variable                        w. DS   w/o DS    w. DS   w/o DS    w. DS   w/o DS
Relative height                 60.33    79.86    81.26   100.96   101.99   119.94
Break                           39.98    51.92    48.57    65.75    60.33    72.19
Close to other candidates        8.27    10.83     4.18     6.90     6.19     8.20
Overlap with CNAs               17.21    24.15    17.87    27.10    26.16    34.09
Database score                 165.44        -   206.34        -   108.89        -
Overlap w. other pts: %         70.11   107.05    95.23   129.03    74.62    93.67
Matching bkpts in other: %      86.67   108.45    94.02   135.80   116.34   116.83
Overlap with other pts          39.91    59.70    42.18    68.84    86.42    98.28
Closeness to centromere          4.37     5.82     3.74     6.07     4.64     7.17
Closeness to telomere            4.65     6.53     3.31     5.52     6.04     7.11
Length                         112.20   133.69    64.26    92.25   170.78   182.46
Dat. score of other cand.       30.23        -    32.57        -    44.51        -
Percent of Normal               64.10    84.05    55.97    76.26    77.18    93.82
Segmental duplication            3.19     7.77     3.48     9.53     5.16     9.34
Sign                             7.84    11.01     8.19    12.42    22.59    27.77
Surrounded by Normals            1.87     4.67     5.03     6.66     3.34     5.21
  1. "w. DS" stands for the model that includes the Database score (DS); "w/o DS" stands for the model from which it was excluded.