Table 1 Distribution of the cancer and normal samples in the dataset used to build the predictive model (TCGA cancer classifier)

From: MicroRNA based Pan-Cancer Diagnosis and Treatment Recommendation

| Organ/System | Cancer | Cancer acronym | Normal samples | Cancer samples |
|---|---|---|---|---|
| Thymus | Thymoma | THYM | 2 | 124 |
| Lung | Lung Squamous Cell Carcinoma | LUSC | 45 | 342 |
| Pancreas | Pancreatic Adenocarcinoma | PAAD | 4 | 179 |
| GI tract | Cholangiocarcinoma | CHOL | 9 | 36 |
| GI tract | Esophageal Carcinoma | ESCA | 13 | 185 |
| GI tract | Stomach Adenocarcinoma | STAD | 41 | 395 |
| Liver | Liver Hepatocellular Carcinoma | LIHC | 50 | 374 |
| Thyroid | Thyroid Carcinoma | THCA | 59 | 510 |
| Adipose | Adrenocortical carcinoma | ACC | 0 | 80 |
| Lymph | Diffuse Large B-cell Lymphoma | DLBC | 0 | 47 |
| Heart | Mesothelioma | MESO | 0 | 87 |
| Reproductive | Cervical Squamous Cell and Endocervical Adenocarcinoma | CESC | 3 | 309 |
| Reproductive | Ovarian Serous Cystadenocarcinoma | OV | 0 | 461 |
| Reproductive | Testicular Germ Cell Tumors | TGCT | 0 | 156 |
| Urinary | Uterine Carcinosarcoma | UCS | 0 | 56 |
| Kidney | Kidney Chromophobe | KICH | 25 | 66 |
| Kidney | Kidney Renal Papillary cell carcinoma | KIRP | 34 | 292 |
| Brain | Brain Lower Grade Glioma | LGG | 0 | 526 |
| Peripheral Nervous System | Pheochromocytoma and Paraganglioma | PCPG | 3 | 184 |
| Epidermis | Skin Cutaneous Melanoma | SKCM | 2 | 450 |
| Epidermis | Uveal Melanoma | UVM | 0 | 80 |
  1. Not all cancer types have normal samples. Although the TCGA dataset covers about 33 cancer types, many were removed for lack of data (fewer than 5 samples), leaving the 21 cancer types listed in this table for the classification.
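
The filtering described in the note can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the counts are transcribed from Table 1, the names `TABLE_1`, `MIN_SAMPLES`, and `keep_types` are hypothetical, and the sketch assumes the 5-sample threshold applies to the number of cancer samples per type, which the note does not state explicitly.

```python
# Per-type counts transcribed from Table 1: acronym -> (normal, cancer).
TABLE_1 = {
    "THYM": (2, 124), "LUSC": (45, 342), "PAAD": (4, 179),
    "CHOL": (9, 36), "ESCA": (13, 185), "STAD": (41, 395),
    "LIHC": (50, 374), "THCA": (59, 510), "ACC": (0, 80),
    "DLBC": (0, 47), "MESO": (0, 87), "CESC": (3, 309),
    "OV": (0, 461), "TGCT": (0, 156), "UCS": (0, 56),
    "KICH": (25, 66), "KIRP": (34, 292), "LGG": (0, 526),
    "PCPG": (3, 184), "SKCM": (2, 450), "UVM": (0, 80),
}

# Assumption: the "fewer than 5 samples" cutoff refers to cancer samples.
MIN_SAMPLES = 5


def keep_types(counts, min_samples=MIN_SAMPLES):
    """Return only the cancer types with at least `min_samples` cancer samples."""
    return {
        acronym: (normal, cancer)
        for acronym, (normal, cancer) in counts.items()
        if cancer >= min_samples
    }


kept = keep_types(TABLE_1)

# Cancer types that have no matched normal samples at all.
without_normals = sorted(a for a, (normal, _) in TABLE_1.items() if normal == 0)

# Overall class balance across the retained types.
total_normal = sum(normal for normal, _ in kept.values())
total_cancer = sum(cancer for _, cancer in kept.values())

print(f"{len(kept)} types kept; {total_normal} normal vs {total_cancer} cancer samples")
print("No normal samples:", ", ".join(without_normals))
```

Under that assumption every type in the table passes the filter, and the totals show how strongly the data are skewed toward cancer samples, which matters when a classifier must also separate cancer from normal tissue.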