Table 1 Overview of the data sets used in the comparison study

From: Block Forests: random forests for blocks of clinical and omics covariate data

| Name | Cancer type | Sample size | Uncensored observations |
|------|-------------|-------------|-------------------------|
| BLCA | Bladder Urothelial Carcinoma | 310 | 32% |
| BRCA | Breast Invasive Carcinoma | 863 | 9% |
| CESC | Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma | 206 | 15% |
| COAD | Colon Adenocarcinoma | 350 | 22% |
| ESCA | Esophageal Carcinoma | 121 | 21% |
| GBM | Glioblastoma Multiforme | 154 | 73% |
| HNSC | Head and Neck Squamous Cell Carcinoma | 411 | 35% |
| KIRC | Kidney Renal Clear Cell Carcinoma | 322 | 22% |
| KIRP | Kidney Renal Papillary Cell Carcinoma | 249 | 10% |
| LGG | Brain Lower Grade Glioma | 454 | 21% |
| LIHC | Liver Hepatocellular Carcinoma | 298 | 28% |
| LUAD | Lung Adenocarcinoma | 424 | 30% |
| LUSC | Lung Squamous Cell Carcinoma | 365 | 39% |
| OV | Ovarian Serous Cystadenocarcinoma | 261 | 54% |
| PAAD | Pancreatic Adenocarcinoma | 142 | 49% |
| PRAD | Prostate Adenocarcinoma | 425 | 2% |
| READ | Rectum Adenocarcinoma | 138 | 16% |
| SARC | Sarcoma | 183 | 16% |
| SKCM | Skin Cutaneous Melanoma | 264 | 25% |
| STAD | Stomach Adenocarcinoma | 284 | 27% |
| UCEC | Uterine Corpus Endometrial Carcinoma | 503 | 13% |
For each data set, the table gives the name, the cancer type, the sample size, and the percentage of observations for which the survival time was uncensored. The TCGA Project ID of each data set is “TCGA-[Name]”, where “[Name]” is the name of the data set given in the first column.
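
The naming rule in the table note can be illustrated with a minimal Python sketch. The function names `tcga_project_id` and `approx_uncensored` and the `DATASETS` dictionary are hypothetical helpers introduced here for illustration; they are not part of the original study. Only three rows of the table are copied in. Because the table reports rounded percentages, the derived counts of uncensored observations are approximations only.

```python
# A few rows copied from Table 1: name -> (sample size, % uncensored).
# DATASETS and both helpers below are illustrative, not from the paper.
DATASETS = {
    "BLCA": (310, 32),
    "GBM": (154, 73),
    "PRAD": (425, 2),
}


def tcga_project_id(name: str) -> str:
    """Build the TCGA Project ID from a data set name, following the table note."""
    return f"TCGA-{name}"


def approx_uncensored(sample_size: int, pct_uncensored: int) -> float:
    """Approximate number of uncensored observations.

    The percentages in the table are rounded, so this is an estimate only.
    """
    return sample_size * pct_uncensored / 100


for name, (n, pct) in DATASETS.items():
    estimate = approx_uncensored(n, pct)
    print(f"{tcga_project_id(name)}: n={n}, about {estimate:.1f} uncensored")
```

For example, `tcga_project_id("BLCA")` yields the string `TCGA-BLCA`, which matches the pattern stated in the note.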