Table 2 Number of clusters and mean accuracy for pan-cancer models

From: Leveraging TCGA gene expression data to build predictive models for cancer drug response

| Clustering method | Fluorouracil: # considered clusters (b) | Fluorouracil: # selected clusters (c) | Fluorouracil: average accuracy | Gemcitabine: # considered clusters (b) | Gemcitabine: # selected clusters (c) | Gemcitabine: average accuracy |
|---|---|---|---|---|---|---|
| Optcluster (Clara) with RF (a) | 204 | 32 | 84.1% | 192 | 50 | 82.3% |
| Optcluster (Clara) with SVM (a) | 204 | 32 | 81.0% | 192 | 50 | 71.5% |
| Optcluster (Clara) with logistic regression | 204 | 32 | 77.0% | 192 | 50 | 73.0% |
| Optcluster (Clara) with RF and demographics | 204 | 32 | 83.6% | 192 | 50 | 79.1% |
| Model Validation (Clara) | 204 | 32 | 52.9% | 192 | 50 | 82.1% |
| Model Validation (Clara with Tuning) | 204 | 32 | 52.9% | 192 | 50 | 85.7% |
  a. RF stands for random forest; SVM stands for support vector machine.
  b. The number of considered clusters is the number of clusters entered into the random forest for variable importance ranking.
  c. The number of selected clusters is the number of clusters selected by the random forest for classification.