Table 5 Relative importance of variables in random forest models as measured by Gini index (higher is more important).

From: A classification model for distinguishing copy number variants from cancer-related alterations

 

| Variable | CBS smoothed, w. DS | CBS smoothed, w/o DS | GLAD, w. DS | GLAD, w/o DS | CBS unsmoothed, w. DS | CBS unsmoothed, w/o DS |
|---|---|---|---|---|---|---|
| Relative height | 60.33 | 79.86 | 81.26 | 100.96 | 101.99 | 119.94 |
| Break | 39.98 | 51.92 | 48.57 | 65.75 | 60.33 | 72.19 |
| Close to other candidates | 8.27 | 10.83 | 4.18 | 6.90 | 6.19 | 8.20 |
| Overlap with CNAs | 17.21 | 24.15 | 17.87 | 27.10 | 26.16 | 34.09 |
| Database score | 165.44 | | 206.34 | | 108.89 | |
| Overlap w. other pts: % | 70.11 | 107.05 | 95.23 | 129.03 | 74.62 | 93.67 |
| Matching bkpts in other: % | 86.67 | 108.45 | 94.02 | 135.80 | 116.34 | 116.83 |
| Overlap with other pts | 39.91 | 59.70 | 42.18 | 68.84 | 86.42 | 98.28 |
| Closeness to centromere | 4.37 | 5.82 | 3.74 | 6.07 | 4.64 | 7.17 |
| Closeness to telomere | 4.65 | 6.53 | 3.31 | 5.52 | 6.04 | 7.11 |
| Length | 112.20 | 133.69 | 64.26 | 92.25 | 170.78 | 182.46 |
| Dat. score of other cand. | 30.23 | | 32.57 | | 44.51 | |
| Percent of Normal | 64.10 | 84.05 | 55.97 | 76.26 | 77.18 | 93.82 |
| Segmental duplication | 3.19 | 7.77 | 3.48 | 9.53 | 5.16 | 9.34 |
| Sign | 7.84 | 11.01 | 8.19 | 12.42 | 22.59 | 27.77 |
| Surrounded by Normals | 1.87 | 4.67 | 5.03 | 6.66 | 3.34 | 5.21 |

  1. "w. DS" denotes the model that includes the Database score (DS); "w/o DS" denotes the model from which it was excluded.
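
For readers unfamiliar with Gini-based importance, the following is a minimal sketch of one common definition: the sample-weighted decrease in Gini impurity is summed over every split that uses a variable, then averaged over the trees of the forest. It is an illustration only and not necessarily the exact computation behind the values in Table 5. The function names, the list-of-splits representation of a tree, and the toy labels are all hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Sample-weighted impurity decrease produced by a single split."""
    return (len(parent) * gini_impurity(parent)
            - len(left) * gini_impurity(left)
            - len(right) * gini_impurity(right))


def variable_importance(trees):
    """Sum the Gini decreases per variable, then average over trees.

    Hypothetical representation: each tree is a list of splits, and each
    split is a tuple (variable_name, parent_labels, left_labels, right_labels).
    """
    totals = {}
    for tree in trees:
        for var, parent, left, right in tree:
            totals[var] = totals.get(var, 0.0) + gini_decrease(parent, left, right)
    return {var: total / len(trees) for var, total in totals.items()}


# Toy data (invented for illustration): a split on "Length" separates the
# two classes perfectly, while a split on "Sign" separates nothing.
perfect = ("Length", ["CNV"] * 4 + ["CNA"] * 4, ["CNV"] * 4, ["CNA"] * 4)
useless = ("Sign", ["CNV", "CNV", "CNA", "CNA"], ["CNV", "CNA"], ["CNV", "CNA"])

importance = variable_importance([[perfect], [useless]])
for var, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{var}: {score:.2f}")
```

Under this definition a variable that produces cleaner splits, or is used in more splits, accumulates a larger score, which is why higher values in the table indicate more important variables.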