Table 4 Clustering results on other most frequent keys

From: Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata

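| Keys | Key frequency | Cluster number | Key number per cluster (Max.) | Key number per cluster (Min.) | Key number per cluster (Avg.) |
|---|---|---|---|---|---|
| Gender | 188,277 | 4 | 17 | 2 | 11 |
| Cell type | 137,192 | 5 | 14 | 1 | 6 |
| Genotype | 100,876 | 5 | 28 | 20 | 22 |
| Time | 100,462 | 14 | 241 | 3 | 29 |
| Sex | 67,529 | 4 | 16 | 4 | 8 |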

  1. Key frequency denotes the number of key-value pairs that include that particular key