Fig. 11 | BMC Bioinformatics

Fig. 11

From: Data-driven human transcriptomic modules determined by independent component analysis

Classification performance at various subsampling ratios. 100 independent simulation runs were performed, each using an independently selected held-out test set. For each run, 200 repeats were performed using different training sets, and we calculated the mean performance metrics across the repeats. At the lowest subsampling percentage (5%), a training set would consist of five C3 and ten C8 samples, both randomly chosen. The performance metrics, averaged over the runs, are: (a) positive predictive value (PPV, i.e. precision), (b) negative predictive value (NPV), (c) sensitivity (i.e. recall), (d) specificity, (e) accuracy, and (f) the agreement between the FC-based and gene-based models. Error bars indicate the standard deviation of each metric across the 100 runs. *For eleven of the simulation runs (i.e. test sets) at the subsampling percentage of 5%, the gene-space model predicted all negatives in at least one sampling, resulting in an undefined PPV. Note that the FC-based model consistently provided predictions for both classes across all runs; the average PPV for the FC-based model across those eleven runs was 0.714.
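
The metrics in panels (a) to (f), and the undefined-PPV case flagged by the asterisk, can be illustrated with a minimal sketch. This is not the authors' code: the function names `classification_metrics` and `agreement` are hypothetical, labels are assumed to be encoded as 1 (positive) and 0 (negative) without assuming which of C3 or C8 is the positive class, and agreement is assumed to mean the fraction of test samples on which the two models give the same prediction.

```python
import math


def classification_metrics(y_true, y_pred):
    """Compute PPV, NPV, sensitivity, specificity and accuracy.

    Assumes binary labels encoded as 1 (positive) and 0 (negative).
    A metric whose denominator is zero is returned as NaN (undefined).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined when the denominator is zero, e.g. PPV with no
        # positive predictions at all.
        return num / den if den > 0 else math.nan

    return {
        "ppv": ratio(tp, tp + fp),          # precision
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # recall
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def agreement(pred_a, pred_b):
    """Fraction of samples on which two models predict the same class
    (an assumed definition of the agreement in panel f)."""
    same = sum(1 for a, b in zip(pred_a, pred_b) if a == b)
    return same / len(pred_a)


# Toy test set: two positives, three negatives (made-up labels).
y_true = [1, 1, 0, 0, 0]
all_negative = [0, 0, 0, 0, 0]  # a model that never predicts the positive class
mixed = [1, 0, 0, 0, 1]         # a model that predicts both classes

m_neg = classification_metrics(y_true, all_negative)
m_mix = classification_metrics(y_true, mixed)
print(m_neg["ppv"])                    # nan: PPV is undefined
print(m_mix["ppv"])                    # 0.5
print(agreement(all_negative, mixed))  # 0.6
```

When a model predicts only negatives, TP + FP = 0, so PPV has a zero denominator. Returning NaN rather than 0 keeps such repeats distinguishable from repeats in which every positive prediction was wrong.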
