Fig. 4 | BMC Bioinformatics

Fig. 4

From: PROMO: an interactive tool for analyzing clinically-labeled multi-omic cancer datasets

Identification and characterization of cancer subtypes. Unsupervised analysis followed by enrichment analysis is performed on both samples and features to identify clinically significant sample groups and to characterize them biologically based on the functions of co-expressed gene groups. a The RNA-Seq expression matrix of TCGA’s breast cancer cohort after clustering both samples (columns) and genes (rows) into four clusters using the K-means algorithm. Clustering is based on the 2000 most variable genes. White lines separate clusters in each dimension. The bars below the matrix show selected sample labels (here: the cluster assignment and PAM50). The matrix and bars were created using PROMO’s multi-label matrix drawing. b Gene clusters were characterized using PROMO’s gene ontology enrichment tool. The figure shows the five most significant GO terms for each gene cluster. c–e Sample clusters were characterized using the samples’ clinical labels: c PROMO’s multi-label analysis tool automatically tests clinical labels of different types (numeric, ordinal, categorical or survival) for enrichment in the sample clusters. FDR correction is performed over all clinical labels of the same type, but separately for each type. d Sample clusters can also be characterized for a single label by showing its value distribution in each cluster and by calculating enrichment. e Survival functions for each cluster. The p-values give the significance of the separation of each cluster from the rest, computed using the log-rank test.
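
The per-type FDR correction described for panel c can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the caption does not name, and it is not PROMO's own code. The function names, label names, and p-values below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used as the FDR procedure; the caption only says "FDR".
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the original p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdr_by_label_type(results):
    """Correct p-values within each label type, never across types.

    results: list of (label_name, label_type, p_value) tuples.
    Returns a dict mapping label_name to its adjusted p-value.
    """
    groups = defaultdict(list)
    for name, label_type, p in results:
        groups[label_type].append((name, p))

    qvalues = {}
    for items in groups.values():
        adjusted = bh_adjust([p for _, p in items])
        for (name, _), q in zip(items, adjusted):
            qvalues[name] = q
    return qvalues


# Hypothetical enrichment results: (label name, label type, raw p-value).
results = [
    ("age", "numeric", 0.01),
    ("tumor_purity", "numeric", 0.04),
    ("stage", "ordinal", 0.03),
    ("pam50", "categorical", 0.001),
    ("histology", "categorical", 0.2),
]
qvalues = fdr_by_label_type(results)
```

Because each type is corrected on its own, the number of tests in the correction is the number of labels of that type, so a label that is the only one of its type keeps its raw p-value.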
