From: Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata
Keys | Key frequency | Number of clusters | Max. keys per cluster | Min. keys per cluster | Avg. keys per cluster |
---|---|---|---|---|---|
Gender | 188,277 | 4 | 17 | 2 | 11 |
Cell type | 137,192 | 5 | 14 | 1 | 6 |
Genotype | 100,876 | 5 | 28 | 20 | 22 |
Time | 100,462 | 14 | 241 | 3 | 29 |
Sex | 67,529 | 4 | 16 | 4 | 8 |