Fig. 5 | BMC Bioinformatics

Fig. 5

From: CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training


Preprocessing workflow for extracting the training and testing samples of primary and metastatic tumors. After the TCGA and MET500 datasets were downloaded, the data were preprocessed by filtering genes and merging related cancer types. The 4,858 genes common to TCGA and MET500 were retained, and samples were then divided into training and testing sets according to the number of primary cancer types in TCGA. Cancer types in italic font are merged tumor groups, as described in the Methods section. Three training datasets for primary cancers were created, comprising 9 (blue-labeled), 14 (blue- and red-labeled), and all 19 cancer types, respectively.
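
The two steps named in the caption, retaining the genes shared by both datasets and dividing samples by cancer type, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, gene symbols, sample IDs, and cancer-type labels are hypothetical toy values. The caption does not state the exact split rule, so the sketch assumes that samples whose type is outside the chosen training types are held out.

```python
from typing import Dict, Iterable, List, Set, Tuple


def shared_genes(genes_a: Iterable[str], genes_b: Iterable[str]) -> List[str]:
    """Return the genes present in both datasets, sorted for a stable order."""
    return sorted(set(genes_a) & set(genes_b))


def restrict_to_genes(
    samples: Dict[str, Dict[str, float]], genes: List[str]
) -> Dict[str, Dict[str, float]]:
    """Keep only the shared genes in every sample's expression profile."""
    return {sid: {g: expr[g] for g in genes} for sid, expr in samples.items()}


def split_by_cancer_type(
    labels: Dict[str, str], training_types: Set[str]
) -> Tuple[List[str], List[str]]:
    """Assumed split rule: samples of the training types go to training,
    all other samples are held out."""
    train = [sid for sid, t in labels.items() if t in training_types]
    held_out = [sid for sid, t in labels.items() if t not in training_types]
    return train, held_out


# Toy stand-ins for the TCGA and MET500 gene lists (hypothetical values).
tcga_genes = ["TP53", "BRCA1", "EGFR", "KRAS"]
met500_genes = ["EGFR", "TP53", "MYC"]
common = shared_genes(tcga_genes, met500_genes)

# Toy expression profiles and cancer-type labels (hypothetical values).
tcga_samples = {
    "s1": {"TP53": 1.0, "BRCA1": 0.2, "EGFR": 3.1, "KRAS": 0.5},
    "s2": {"TP53": 0.4, "BRCA1": 1.7, "EGFR": 0.9, "KRAS": 2.2},
}
filtered = restrict_to_genes(tcga_samples, common)

labels = {"s1": "BRCA", "s2": "LUAD", "s3": "BRCA", "s4": "SKCM"}
train_ids, held_out_ids = split_by_cancer_type(labels, {"BRCA", "LUAD"})
```

Calling `split_by_cancer_type` with a larger set of training types would mimic the progressively larger training datasets in the caption, under the same assumed split rule.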
