
Table 2 Number of clusters and mean accuracy for pan-cancer models

From: Leveraging TCGA gene expression data to build predictive models for cancer drug response

 

| Clustering Method | Fluorouracil: # Considered Clusters [b] | Fluorouracil: # Selected Clusters [c] | Fluorouracil: Average Accuracy | Gemcitabine: # Considered Clusters [b] | Gemcitabine: # Selected Clusters [c] | Gemcitabine: Average Accuracy |
|---|---|---|---|---|---|---|
| Optcluster (Clara) with RF [a] | 204 | 32 | 84.1% | 192 | 50 | 82.3% |
| Optcluster (Clara) with SVM [a] | 204 | 32 | 81.0% | 192 | 50 | 71.5% |
| Optcluster (Clara) with logistic regression | 204 | 32 | 77.0% | 192 | 50 | 73.0% |
| Optcluster (Clara) with RF and demographics | 204 | 32 | 83.6% | 192 | 50 | 79.1% |
| Model Validation (Clara) | 204 | 32 | 52.9% | 192 | 50 | 82.1% |
| Model Validation (Clara with Tuning) | 204 | 32 | 52.9% | 192 | 50 | 85.7% |
  a. RF stands for random forest; SVM stands for support vector machine.
  b. The number of considered clusters is the number of clusters entered into the random forest for variable importance ranking.
  c. The number of selected clusters is the number of clusters selected by the random forest for classification.
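
To relate the considered and selected cluster counts, the short Python sketch below computes the share of considered clusters that the random forest kept for each drug. The counts are copied from Table 2; the dictionary layout and the helper name `retained_share` are assumptions made for illustration and are not part of the original analysis.

```python
# Cluster counts copied from Table 2.
# The data layout and the helper below are illustrative assumptions.
cluster_counts = {
    "Fluorouracil": {"considered": 204, "selected": 32},
    "Gemcitabine": {"considered": 192, "selected": 50},
}


def retained_share(considered: int, selected: int) -> float:
    """Return the fraction of considered clusters that were selected."""
    return selected / considered


# Fraction of considered clusters retained, keyed by drug.
shares = {drug: retained_share(**counts) for drug, counts in cluster_counts.items()}

for drug, share in shares.items():
    print(f"{drug}: {share:.1%} of considered clusters selected")
```

With the table's counts this gives roughly 15.7% for fluorouracil (32 of 204) and 26.0% for gemcitabine (50 of 192).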