Fig. 1 | BMC Bioinformatics

Fig. 1

From: Leveraging TCGA gene expression data to build predictive models for cancer drug response

Scheme of data division throughout our study. Shown here is the workflow for training and validating the 5-FU model; the same method applies to the GCB model. (a) Gene expression data for five cancer types with 5-FU drug response data were downloaded from TCGA. (b) Three steps were performed on the training data, which were made up of all five cancer types: clustering via OptCluster, feature selection via random forest, and prediction via random forest with cross-validation to train the model. (c) Model validation was performed on half of the samples from the most populous cancer type (STAD), which were held out as an independent validation data set.
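
The data division in panels (a) to (c) can be illustrated with a minimal sketch. It covers only the hold-out step, not the clustering or random forest stages. The function name `split_holdout`, the random choice of which samples to hold out, the fixed seed, and the toy cohort (sample counts and the placeholder labels `TYPE_B` to `TYPE_E`) are all assumptions made for illustration; none of them come from the study.

```python
import random
from collections import Counter


def split_holdout(samples, seed=0):
    """Hold out half of the most populous cancer type for validation.

    `samples` is a list of (sample_id, cancer_type) pairs with unique IDs.
    Returns (training, validation, name_of_largest_type).
    Assumption: the held-out half is chosen at random; the caption does
    not say how the half was selected.
    """
    counts = Counter(cancer for _, cancer in samples)
    largest = counts.most_common(1)[0][0]

    # Shuffle the IDs of the largest cancer type and keep half for validation.
    largest_ids = [sid for sid, cancer in samples if cancer == largest]
    rng = random.Random(seed)
    rng.shuffle(largest_ids)
    held_out = set(largest_ids[: len(largest_ids) // 2])

    # Everything not held out (all five types) forms the training data.
    training = [(sid, c) for sid, c in samples if sid not in held_out]
    validation = [(sid, c) for sid, c in samples if sid in held_out]
    return training, validation, largest


# Toy cohort with made-up sizes; only the STAD label appears in the caption.
toy = [(f"STAD-{i}", "STAD") for i in range(10)]
for label in ("TYPE_B", "TYPE_C", "TYPE_D", "TYPE_E"):
    toy += [(f"{label}-{i}", label) for i in range(4)]

training, validation, largest = split_holdout(toy)
print(largest, len(training), len(validation))
```

In this sketch the training set still contains samples from all five cancer types, including the half of STAD that was not held out, which matches the description in panel (b).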
