Table 3 Classes of carcinomas used for random forest prediction of cancer types

From: Integrative analysis and machine learning on cancer genomics data using the Cancer Systems Biology Database (CancerSysDB)

| Class name | TCGA cohorts | Total samples | Training set | Test set |
|---|---|---|---|---|
| Adrenal gland | Adrenocortical carcinoma (ACC); Pheochromocytoma and paraganglioma (PCPG) | 271 | 179 | 92 |
| Bladder | Urothelial carcinoma (BLCA) | 411 | 272 | 139 |
| Brain | Lower grade glioma (LGG) | 515 | 340 | 175 |
| Breast | Breast invasive carcinoma (BRCA) | 1077 | 711 | 366 |
| Gastrointestinal | Esophageal carcinoma (ESCA); Stomach adenocarcinoma (STAD); Colon adenocarcinoma (COAD); Rectum adenocarcinoma (READ); Cholangiocarcinoma (CHOL) | 1237 | 817 | 420 |
| Head & Neck | Head and neck squamous cell carcinoma (HNSC); Uveal melanoma (UVM) | 590 | 390 | 200 |
| Hematologic | Acute myeloid leukemia (LAML); Diffuse large B-cell lymphoma (DLBC); Thymoma (THYM) | 321 | 212 | 109 |
| Kidney | Kidney chromophobe (KICH); Renal clear cell carcinoma (KIRC); Renal papillary cell carcinoma (KIRP) | 738 | 488 | 250 |
| Liver | Hepatocellular carcinoma (LIHC) | 321 | 212 | 109 |
| Ovary | Ovarian serous cystadenocarcinoma (OV) | 437 | 289 | 148 |
| Pancreas | Pancreatic adenocarcinoma (PAAD) | 184 | 122 | 62 |
| Prostate | Prostate adenocarcinoma (PRAD) | 498 | 329 | 169 |
| Skin | Cutaneous melanoma (SKCM) | 104 | 69 | 35 |
| Testis | Testicular germ cell tumors (TGCT) | 150 | 99 | 51 |
| Thoracic | Lung adenocarcinoma (LUAD); Lung squamous cell carcinoma (LUSC); Mesothelioma (MESO) | 1143 | 755 | 388 |
| Thyroid | Thyroid carcinoma (THCA) | 496 | 327 | 169 |
| Uterus | Uterine carcinosarcoma (UCS); Uterine corpus endometrial carcinoma (UCEC) | 598 | 395 | 203 |
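
The sample counts in Table 3 can be checked for internal consistency. The sketch below is only an illustration: the counts are transcribed from the table, while the names `TABLE_3`, `inconsistent_rows`, and `training_fractions` are hypothetical and do not come from the source. The table does not state how the split was drawn; the code only confirms that training and test counts add up to each class total and reports the fraction of each class placed in the training set.

```python
# Counts transcribed from Table 3: class name -> (total, training, test).
# The dictionary and function names are illustrative assumptions.
TABLE_3 = {
    "Adrenal gland": (271, 179, 92),
    "Bladder": (411, 272, 139),
    "Brain": (515, 340, 175),
    "Breast": (1077, 711, 366),
    "Gastrointestinal": (1237, 817, 420),
    "Head & Neck": (590, 390, 200),
    "Hematologic": (321, 212, 109),
    "Kidney": (738, 488, 250),
    "Liver": (321, 212, 109),
    "Ovary": (437, 289, 148),
    "Pancreas": (184, 122, 62),
    "Prostate": (498, 329, 169),
    "Skin": (104, 69, 35),
    "Testis": (150, 99, 51),
    "Thoracic": (1143, 755, 388),
    "Thyroid": (496, 327, 169),
    "Uterus": (598, 395, 203),
}


def inconsistent_rows(table):
    """Return class names whose training + test counts differ from the total."""
    return [
        name
        for name, (total, train, test) in table.items()
        if train + test != total
    ]


def training_fractions(table):
    """Return the share of each class assigned to the training set."""
    return {name: train / total for name, (total, train, _test) in table.items()}


if __name__ == "__main__":
    print("Inconsistent rows:", inconsistent_rows(TABLE_3))
    for name, fraction in training_fractions(TABLE_3).items():
        print(f"{name}: {fraction:.3f} of samples in training set")
```

Under these counts every class has close to two thirds of its samples in the training set, which suggests (but does not prove) that the split was made per class rather than over the pooled data.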