Table 4 Count matrix sparsity for the three simulated datasets

From: Beware to ignore the rare: how imputing zero-values can improve the quality of 16S rRNA gene studies results

                    Dataset 1                  Dataset 2                  Dataset 3
                    True sparsity: 63.03%      True sparsity: 56.61%      True sparsity: 91.26%
                    Raw data sparsity: 72.56%  Raw data sparsity: 67.91%  Raw data sparsity: 94.34%
Imputation          Mean (%)    SD (%)         Mean (%)    SD (%)         Mean (%)    SD (%)
None                72.56       0.00           67.91       0.00           94.34       0.00
scImpute            69.50       0.01           55.84       0.00           87.07       0.01
DrImpute            62.67       0.04           42.32       0.00           80.03       1.42
LLSimpute           21.94       3.25           23.19       1.75           23.10       1.71
zCompositions_SQ    0.00        0.00           0.00        0.00           0.00        0.00
zCompositions_CZM   0.00        0.00           0.00        0.00           0.00        0.00

  1. Sparsity values of the preprocessed datasets were aggregated by the zero-imputation method included in the pipeline; the mean and standard deviation were calculated across the different normalization approaches. Ground-truth and raw-data sparsity for each dataset are reported in the table header. “None” identifies pipelines with no zero-imputation step, i.e. normalization-only pipelines.
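
The aggregation described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that sparsity means the percentage of exactly-zero entries in a count matrix and that the SD is the sample standard deviation, neither of which the table states. The function names, the `(imputation, normalization)` key layout, and the toy matrix are hypothetical.

```python
from statistics import mean, stdev


def sparsity(counts):
    """Percentage of entries in a count matrix (list of rows) that are zero.

    Assumption: "sparsity" is the share of exactly-zero cells.
    """
    cells = [value for row in counts for value in row]
    return 100.0 * sum(1 for value in cells if value == 0) / len(cells)


def summarize_by_imputation(sparsity_by_pipeline):
    """Group sparsity values by imputation method and return (mean, SD).

    `sparsity_by_pipeline` maps (imputation, normalization) -> sparsity in %.
    Assumption: SD is the sample standard deviation (n - 1 denominator);
    a method with a single pipeline gets an SD of 0.0.
    """
    grouped = {}
    for (imputation, _normalization), value in sparsity_by_pipeline.items():
        grouped.setdefault(imputation, []).append(value)
    return {
        method: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for method, values in grouped.items()
    }


# Toy count matrix (hypothetical values, not data from the study).
toy_counts = [[0, 3, 0], [5, 0, 0]]
toy_sparsity = sparsity(toy_counts)

# Hypothetical pipelines: normalization alone leaves zeros untouched, so
# every "None" pipeline reports the same sparsity and the SD is zero.
summary = summarize_by_imputation({
    ("None", "norm_a"): toy_sparsity,
    ("None", "norm_b"): toy_sparsity,
})
```

Under these assumptions, the zero SD in the “None” row follows directly: every normalization-only pipeline has the same sparsity as the raw data, so the values being averaged are identical.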