Table 4 Clustering results on other most frequent keys

From: Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata

Key         Key frequency   Cluster number   Key number per cluster
                                             Max.   Min.   Avg.
Gender      188,277         4                17     2      11
Cell type   137,192         5                14     1      6
Genotype    100,876         5                28     20     22
Time        100,462         14               241    3      29
Sex         67,529          4                16     4      8
  1. Key frequency denotes the number of key-value pairs that include that particular key
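
The key frequency defined in the note above can be illustrated with a minimal sketch. It assumes the metadata is available as a list of (key, value) tuples; the function name `key_frequency` and the sample records are hypothetical and are not taken from the paper's dataset or code.

```python
from collections import Counter


def key_frequency(pairs):
    """Count how many key-value pairs include each key."""
    return Counter(key for key, _value in pairs)


# Hypothetical metadata records, invented for illustration only.
sample_pairs = [
    ("Gender", "male"),
    ("Gender", "female"),
    ("Sex", "F"),
    ("Cell type", "HeLa"),
    ("Gender", "male"),
]

frequencies = key_frequency(sample_pairs)

# List keys from most to least frequent, as in the table's ordering.
for key, count in frequencies.most_common():
    print(key, count)
```

Each pair contributes exactly one count to its key, so a key that appears with many different values (as "Gender" does here) still accumulates a single frequency.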