Table 5 The effect of class distribution on the performance of the classifier.

From: Machine learning approaches to supporting the identification of photoreceptor-enriched genes based on expression data

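| Class distribution (PR-enriched : non-PR-enriched) | Ac (%) | PR-enriched Pr (%) | PR-enriched Se (%) | Non-PR-enriched Sp (%) | Non-PR-enriched Pr (%) |
|---|---|---|---|---|---|
| 1:1 | 91.2 | 99.5 | 82.8 | 99.6 | 85.2 |
| 2:1 | 89.8 | 95.7 | 85.8 | 94.9 | 83.6 |
| 3:1 | 89.1 | 94.3 | 88.9 | 89.6 | 80.7 |
| 4:1 | 84.1 | 88.8 | 91.2 | 58.3 | 64.6 |
| Original (261:63) | 81.2 | 86.2 | 91.2 | 39.7 | 52.1 |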
  1. Class distributions were obtained by random oversampling. A 10-fold cross-validation was carried out to estimate the true classification error. Class distribution is given as the number of PR-enriched tags to the number of non-PR-enriched tags. The original dataset contains 261 PR-enriched and 63 non-PR-enriched tags. In all resampled datasets, the number of PR-enriched tags is also 261.
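
The resampling and the reported measures can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the non-PR-enriched class is the one that is oversampled (the note states that the PR-enriched count stays at 261), that target sizes are rounded to the nearest integer, and that Ac, Se, Sp and Pr denote accuracy, sensitivity, specificity and precision with PR-enriched as the positive class. The function names, the tag labels and the confusion-matrix counts are hypothetical.

```python
import random


def oversample_minority(majority, minority, ratio, seed=0):
    """Randomly oversample `minority` (with replacement) until
    len(majority) : len(result) is approximately ratio : 1.
    The majority class is left untouched. Rounding is an assumption."""
    rng = random.Random(seed)
    target = round(len(majority) / ratio)
    extra = max(0, target - len(minority))
    return list(minority) + [rng.choice(minority) for _ in range(extra)]


def metrics(tp, fn, tn, fp):
    """Percent accuracy, sensitivity, specificity and per-class precision,
    treating PR-enriched as the positive class."""
    return {
        "Ac": 100 * (tp + tn) / (tp + fn + tn + fp),
        "Se": 100 * tp / (tp + fn),
        "Sp": 100 * tn / (tn + fp),
        "Pr_pos": 100 * tp / (tp + fp),  # precision for PR-enriched
        "Pr_neg": 100 * tn / (tn + fn),  # precision for non-PR-enriched
    }


# Placeholder tag labels matching the class sizes stated in the note.
pr_tags = [f"pr_{i}" for i in range(261)]
non_pr_tags = [f"non_pr_{i}" for i in range(63)]

for ratio in (1, 2, 3, 4):
    resampled = oversample_minority(pr_tags, non_pr_tags, ratio)
    print(f"{ratio}:1 -> {len(pr_tags)} PR-enriched, "
          f"{len(resampled)} non-PR-enriched")

# Illustrative counts only, not reported in the source; they were chosen
# so that the percentages round to the values in the "Original" row.
for name, value in metrics(tp=238, fn=23, tn=25, fp=38).items():
    print(f"{name}: {value:.1f}")
```

With only 63 non-PR-enriched tags, a classifier can reach high accuracy while misclassifying most of the minority class, which is why specificity is reported next to accuracy in the table.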