Fig. 1 | BMC Bioinformatics

Fig. 1

From: Hypoxia classifier for transcriptome datasets

Generating expression-based tree classifiers to identify hypoxic samples. A Overview of decision tree generation. 425 RNA-seq samples exposed to normoxia or hypoxia were processed to produce a gene ranking for each of them. A subset of the resulting data matrix, consisting of the 178 genes significantly up-regulated by hypoxia according to ref [24], was used as input to a feature selection algorithm. The 20 genes showing a mean decrease in accuracy (MDA) > 4 were then selected to generate 10,000 random trees, and the 276 trees showing an accuracy over 95% in cross-validation were selected as classifiers. Finally, a set of challenging datasets not used in the generation or training steps was used to test the performance of the 276 trees and to select the best overall tree and two additional substitutes. B MDA index of the 20 most important genes according to 1000 random forest iterations. C Frequency with which each gene is used as a predictor variable in the classification trees. D Split points for the rank percentile (100 being the most expressed gene and 0 the least) of the genes used in all the models with accuracy > 0.95. E Split points for the rank percentile of the genes used in the 10 best-performing models according to cross-validation accuracy.
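The rank percentile scale used for the split points in panels D and E can be illustrated with a minimal Python sketch. The caption only states that 100 corresponds to the most expressed gene and 0 to the least, so the linear scaling of ranks, the averaging of tied ranks, the gene names (`GENE_A` to `GENE_E`), the split point of 80, and the rule that a percentile above the split means hypoxia are all assumptions made for illustration, not details taken from the article.

```python
from typing import Dict


def rank_percentiles(expression: Dict[str, float]) -> Dict[str, float]:
    """Map each gene's expression in one sample to a rank percentile in [0, 100].

    Assumption: ranks are scaled linearly so that the least expressed gene
    gets 0 and the most expressed gets 100; tied genes share their mean rank.
    """
    genes = sorted(expression, key=expression.get)  # ascending expression
    n = len(genes)
    if n < 2:
        return {g: 100.0 for g in genes}
    ranks: Dict[str, float] = {}
    i = 0
    while i < n:
        j = i
        # Extend j over the run of genes tied with genes[i].
        while j + 1 < n and expression[genes[j + 1]] == expression[genes[i]]:
            j += 1
        mean_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[genes[k]] = mean_rank
        i = j + 1
    return {g: 100.0 * r / (n - 1) for g, r in ranks.items()}


def classify_by_split(percentiles: Dict[str, float], gene: str, split_point: float) -> str:
    """Single-node tree (hypothetical rule): hypoxia if the percentile exceeds the split."""
    return "hypoxia" if percentiles[gene] > split_point else "normoxia"


# Hypothetical expression values for one sample (placeholder gene names).
sample = {"GENE_A": 0.5, "GENE_B": 12.0, "GENE_C": 3.0, "GENE_D": 250.0, "GENE_E": 40.0}
pct = rank_percentiles(sample)
label = classify_by_split(pct, "GENE_D", 80.0)
```

Because the transformation depends only on the ordering of genes within a sample, a split point expressed on this scale can be applied to any dataset without normalizing expression values across samples.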
