Table 5 The effect of class distribution on the performance of the classifier.

From: Machine learning approaches to supporting the identification of photoreceptor-enriched genes based on expression data

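| Class distribution (PR-enriched : non-PR-enriched) | Ac (%) | PR-enriched Pr (%) | PR-enriched Se (%) | Non-PR-enriched Sp (%) | Non-PR-enriched Pr (%) |
|---|---|---|---|---|---|
| 1:1 | 91.2 | 99.5 | 82.8 | 99.6 | 85.2 |
| 2:1 | 89.8 | 95.7 | 85.8 | 94.9 | 83.6 |
| 3:1 | 89.1 | 94.3 | 88.9 | 89.6 | 80.7 |
| 4:1 | 84.1 | 88.8 | 91.2 | 58.3 | 64.6 |
| Original (261:63) | 81.2 | 86.2 | 91.2 | 39.7 | 52.1 |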
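
The column abbreviations are not defined in this excerpt. Assuming Ac, Pr, Se and Sp denote accuracy, precision, sensitivity and specificity, with PR-enriched as the positive class, the sketch below shows how the five values in a row relate to a confusion matrix. The counts are not reported in the source: they are back-calculated here so that they agree with the "Original (261:63)" row, and the function name `table_metrics` is hypothetical.

```python
def table_metrics(tp, fn, tn, fp):
    """Return the five percentages of one table row.

    PR-enriched is treated as the positive class (an assumption).
    """
    total = tp + fn + tn + fp
    return {
        "Ac": 100 * (tp + tn) / total,      # overall accuracy
        "Pr_PR": 100 * tp / (tp + fp),      # precision for PR-enriched
        "Se": 100 * tp / (tp + fn),         # sensitivity (recall of PR-enriched)
        "Sp": 100 * tn / (tn + fp),         # specificity (recall of non-PR-enriched)
        "Pr_nonPR": 100 * tn / (tn + fn),   # precision for non-PR-enriched
    }

# Assumed counts, not taken from the source:
# 238 + 23 = 261 PR-enriched tags, 25 + 38 = 63 non-PR-enriched tags.
row = table_metrics(tp=238, fn=23, tn=25, fp=38)
print({name: round(value, 1) for name, value in row.items()})
```

If these assumptions hold, the low specificity in the original row means that most of the 63 non-PR-enriched tags are misclassified as PR-enriched, even though accuracy stays above 80%.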

  1. The class distributions were obtained by random oversampling. Ten-fold cross-validation was carried out to estimate the true classification error. Class distribution is expressed as the ratio of the number of PR-enriched tags to the number of non-PR-enriched tags. The original dataset contains 261 PR-enriched and 63 non-PR-enriched tags. In all resampled datasets, the number of PR-enriched tags remains 261.
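
As a rough illustration of the resampling described in the footnote, the following sketch randomly duplicates non-PR-enriched tags until a target ratio is reached, while the 261 PR-enriched tags are left untouched. It is a minimal sketch, not the authors' procedure: the function name `oversample_minority`, the placeholder tag identifiers, sampling with replacement, and the rounding of non-integer targets are all assumptions, since the source does not specify them.

```python
import random

def oversample_minority(majority, minority, ratio, seed=0):
    """Randomly duplicate minority items until majority : minority is about ratio : 1."""
    rng = random.Random(seed)
    # Target minority size; the rounding rule is an assumption.
    target = round(len(majority) / ratio)
    extra = max(0, target - len(minority))
    # Keep every original minority item and add random copies drawn with replacement.
    return list(minority) + rng.choices(minority, k=extra)

# Placeholder identifiers standing in for the real tags.
pr_tags = [f"PR_{i}" for i in range(261)]
non_pr_tags = [f"nonPR_{i}" for i in range(63)]

for ratio in (1, 2, 3, 4):
    resampled = oversample_minority(pr_tags, non_pr_tags, ratio)
    print(f"{ratio}:1 -> {len(pr_tags)} PR-enriched : {len(resampled)} non-PR-enriched")
```

The majority class is never subsampled here, which matches the footnote's statement that every resampled dataset keeps all 261 PR-enriched tags.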