Fig. 1 | BMC Bioinformatics

Fig. 1

From: CoSTA: unsupervised convolutional neural network learning for spatial transcriptomics analysis

CoSTA model approach and motivation. A Overall CoSTA pipeline. Inputs are gene matrices from spatial transcriptomic experiments. The ConvNet stage forwards images through 3 convolutional layers and then flattens the output into a spatial representation vector. UMAP reduces the dimensionality of the spatial representations from the ConvNet stage before these gene representations are used to cluster genes with GMM. Each gene is then assigned cluster probabilities based on its distances to the cluster centroids; these probabilities are transformed into an auxiliary target distribution, and the model is trained toward this target by minimizing bi-tempered logistic loss and/or center loss. Gradients are backpropagated through a fully connected layer to the ConvNet. The process is repeated until the model converges, at which point the output from the ConvNet is used as the final spatial representation (red arrow). B Biologically inspired example in which overlap does not capture all aspects of spatial pattern similarity. Rectangles represent an epithelial cell layer, while ovals represent stromal cells. By overlap comparison, Gene 1 is equally similar to Gene 2 and Gene 3 (40% overlap with each). However, the biologically relevant expression along the epithelial layer is shared only between Gene 1 and Gene 2. Detecting this shape similarity requires learning a spatial representation. C Performance of CoSTA on synthetic datasets. Left panel: 5 real expression patterns in mouse olfactory bulb data replicate 11. We generated 2,000 simulated gene expression matrices for each pattern with different levels of noise. Right panel: learning curves of CoSTA classifying simulated genes belonging to these 5 patterns at different noise levels. Normalized Mutual Information (NMI) values quantify the similarity between the clustering labels assigned by CoSTA and the true class labels across all 5 patterns.
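
The NMI metric in panel C can be illustrated with a minimal sketch in plain Python. The caption does not say which normalization was used, so the arithmetic mean of the two label entropies is an assumption here, and the function names are illustrative rather than taken from the CoSTA code.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, pred_labels):
    """NMI = I(T; P) / mean(H(T), H(P)).

    Assumption: arithmetic-mean normalization; the caption does not
    specify which variant of NMI was used.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)

    # Mutual information summed over the observed (true, predicted) pairs.
    mi = 0.0
    for (t, p), c in joint.items():
        mi += (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))

    denom = 0.5 * (entropy(true_labels) + entropy(pred_labels))
    if denom == 0.0:
        return 1.0  # both labelings put everything in a single cluster
    return mi / denom


# Toy labels (not data from the paper): cluster IDs are permuted,
# but the grouping itself is identical.
true_classes = [0, 0, 1, 1, 2, 2]
cluster_ids = [1, 1, 0, 0, 2, 2]
score = normalized_mutual_information(true_classes, cluster_ids)
```

Because NMI depends only on how items are grouped, not on the cluster IDs themselves, the toy labels above score approximately 1, while a clustering that is statistically independent of the true classes scores 0.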
