Table 1 TCGA binary tasks

From: Standard machine learning approaches outperform deep representation learning on phenotype prediction from transcriptomics data

| Project | Disease | Label | Label type | Group | Samples |
|---|---|---|---|---|---|
| TCGA stage tasks | | | | | |
| COAD | colon adenocarcinoma | II- vs. III+ | binary | train | 505 |
| KIRC | kidney renal clear cell carcinoma | II- vs. III+ | binary | train | 544 |
| LIHC | liver hepatocellular carcinoma | I- vs. II+ | binary | train | 374 |
| LUAD | lung adenocarcinoma | I- vs. II+ | binary | train | 542 |
| SKCM | skin cutaneous melanoma | II- vs. III+ | binary | train | 249 |
| STAD | stomach adenocarcinoma | II- vs. III+ | binary | train | 416 |
| THCA | thyroid cancer | I- vs. II+ | binary | train | 513 |
| UCEC | uterine corpus endometrial carcinoma | I- vs. II+ | binary | train | 554 |
| LUSC | lung squamous cell carcinoma | I- vs. II+ | binary | validate | 504 |
| BRCA | breast invasive carcinoma | II- vs. III+ | binary | test | 1134 |
| TCGA grade tasks | | | | | |
| CESC | cervical squamous cell carcinoma | II- vs. III+ | binary | train | 306 |
| KIRC | kidney renal clear cell carcinoma | II- vs. III+ | binary | train | 544 |
| LGG | low grade glioma | II- vs. III+ | binary | train | 532 |
| LIHC | liver hepatocellular carcinoma | II- vs. III+ | binary | train | 374 |
| PAAD | pancreatic adenocarcinoma | II- vs. III+ | binary | train | 179 |
| STAD | stomach adenocarcinoma | II- vs. III+ | binary | train | 416 |
| UCEC | uterine corpus endometrial carcinoma | II- vs. III+ | binary | train | 554 |
| HNSC | head-neck squamous cell carcinoma | II- vs. III+ | binary | test | 504 |

  1. The 18 binary tasks derived from TCGA that were used to train supervised models and validate the unsupervised embeddings. The tasks fall into two categories: TCGA tumor stage tasks (10) and TCGA tumor grade tasks (8). The project names correspond to those in Fig. 2.
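
The grouping described in the caption can be checked with a short sketch. The record layout, the `TASKS` name, and the `count_by` helper are illustrative assumptions and not part of the original work. The values are transcribed from Table 1.

```python
from collections import Counter

# Hypothetical record layout: (category, project, label, group, samples).
# Values are transcribed from Table 1; the tuple format is an assumption.
TASKS = [
    ("stage", "COAD", "II- vs. III+", "train", 505),
    ("stage", "KIRC", "II- vs. III+", "train", 544),
    ("stage", "LIHC", "I- vs. II+", "train", 374),
    ("stage", "LUAD", "I- vs. II+", "train", 542),
    ("stage", "SKCM", "II- vs. III+", "train", 249),
    ("stage", "STAD", "II- vs. III+", "train", 416),
    ("stage", "THCA", "I- vs. II+", "train", 513),
    ("stage", "UCEC", "I- vs. II+", "train", 554),
    ("stage", "LUSC", "I- vs. II+", "validate", 504),
    ("stage", "BRCA", "II- vs. III+", "test", 1134),
    ("grade", "CESC", "II- vs. III+", "train", 306),
    ("grade", "KIRC", "II- vs. III+", "train", 544),
    ("grade", "LGG", "II- vs. III+", "train", 532),
    ("grade", "LIHC", "II- vs. III+", "train", 374),
    ("grade", "PAAD", "II- vs. III+", "train", 179),
    ("grade", "STAD", "II- vs. III+", "train", 416),
    ("grade", "UCEC", "II- vs. III+", "train", 554),
    ("grade", "HNSC", "II- vs. III+", "test", 504),
]


def count_by(tasks, index):
    """Count tasks by the value found at the given tuple position."""
    return Counter(task[index] for task in tasks)


by_category = count_by(TASKS, 0)  # stage vs. grade
by_group = count_by(TASKS, 3)     # train / validate / test

print(dict(by_category))
print(dict(by_group))
```

Sample counts are deliberately not summed across tasks, because the same project (for example KIRC, LIHC, STAD, UCEC) appears in both the stage and grade categories.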