Fig. 10 | BMC Bioinformatics

Fig. 10

From: Evaluating imputation methods for single-cell RNA-seq data

Summary of the performance of the imputation methods. a The performance of the different methods on each dataset. b The overall performance of the different methods on real and simulated datasets. In both a and b, red and blue grid cells correspond to better and worse performance, respectively. The columns show six metrics covering three evaluation tasks: median absolute imputation error ‘median’, mean absolute imputation error ‘mean’, ARI score ‘ARI’, silhouette score based on the ground truth ‘sil(g)’, silhouette score based on SC3 clustering results ‘sil(s)’, and silhouette score based on marker genes ‘marker’. Scores in each column were normalized by subtracting the baseline score (data before imputation) and then dividing by the difference between the maximum and the minimum score. The sign of the score was reversed for ‘median’ and ‘mean’, as lower imputation errors indicate better performance. The methods were categorized as ‘Statistical’ or ‘Non-statistical’ according to their underlying principles. In b, each grid cell shows the average score across all (real or simulated) datasets. The methods were ranked by the ‘overall’ score, a weighted sum of the metrics with weights of \(\frac{1}{6}, \frac{1}{6}, \frac{1}{9}, \frac{1}{9}, \frac{1}{9}, \frac{1}{3}\) for real datasets and \(\frac{1}{4}, \frac{1}{4}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}\) for simulated datasets, so that each evaluation task contributes equally.
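The normalization and weighting described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code. Several details are assumptions: the weights are matched to the metrics in the order the caption lists them, the five simulated-data weights are taken to omit ‘marker’ (inferred from the weights summing to 1 across two tasks), the maximum and minimum are taken over the methods' scores only, and the function names `normalize_column` and `overall_score` are hypothetical.

```python
import numpy as np

# Weights from the caption, assumed to follow the listed metric order.
REAL_WEIGHTS = {"median": 1/6, "mean": 1/6, "ARI": 1/9,
                "sil(g)": 1/9, "sil(s)": 1/9, "marker": 1/3}
# Assumption: simulated datasets have no 'marker' metric.
SIM_WEIGHTS = {"median": 1/4, "mean": 1/4, "ARI": 1/6,
               "sil(g)": 1/6, "sil(s)": 1/6}
# Metrics for which a lower raw value means better performance.
LOWER_IS_BETTER = {"median", "mean"}


def normalize_column(scores, baseline, lower_is_better=False):
    """Subtract the baseline, divide by (max - min), flip sign for errors."""
    scores = np.asarray(scores, dtype=float)
    spread = scores.max() - scores.min()  # assumption: range over methods only
    if spread == 0:
        return np.zeros_like(scores)  # all methods tie; avoid division by zero
    normalized = (scores - baseline) / spread
    return -normalized if lower_is_better else normalized


def overall_score(normalized, weights):
    """Weighted sum of normalized metric columns (one array per metric)."""
    return sum(weights[m] * np.asarray(normalized[m], dtype=float)
               for m in weights)
```

With these weights, the two imputation-error metrics, the three clustering metrics, and (for real datasets) the marker metric each contribute one third of the overall score; for simulated datasets the two remaining groups each contribute one half.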
