Table 2 Comparison on F-Score (FS), Entropy (E) and Rand Index (RI)

From: Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata

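| Key (ref. cluster number) | Weight α | Weight β | Weight γ | Our algorithm FS | Our algorithm E | Our algorithm RI | K-medoid FS | K-medoid E | K-medoid RI | DBSCAN FS | DBSCAN E | DBSCAN RI | APCluster FS | APCluster E | APCluster RI | StdHier FS | StdHier E | StdHier RI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age (2) | .44 | .01 | .55 | .94 | .34 | .87 | .86 | .51 | .67 | .87 | .43 | .69 | .68 | .59 | .54 | .81 | .60 | .63 |
| Cell line (4) | .65 | .11 | .24 | .46 | .78 | .56 | .60 | .78 | .54 | .49 | .78 | .40 | .59 | .70 | .64 | .52 | .82 | .43 |
| Disease (4) | .15 | .18 | .67 | .58 | .55 | .65 | .64 | .58 | .61 | .63 | .69 | .36 | .67 | .63 | .52 | .61 | .58 | .63 |
| Strain (4) | .85 | .00 | .15 | .58 | .69 | .62 | .43 | .68 | .61 | .50 | .76 | .35 | .42 | .68 | .46 | .48 | .78 | .35 |
| Tissue (9) | .80 | .00 | .20 | .43 | .73 | .37 | .41 | .69 | .56 | .49 | .77 | .27 | .35 | .74 | .58 | .40 | .68 | .45 |
| Treatment (4) | .57 | .00 | .43 | .78 | .41 | .74 | .69 | .58 | .67 | .76 | .69 | .47 | .68 | .69 | .50 | .81 | .58 | .66 |
| Average | | | | .63 | .58 | .64 | .61 | .64 | .61 | .62 | .69 | .42 | .57 | .67 | .54 | .60 | .67 | .52 |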

  1. A higher F-Score, a higher Rand Index, or a lower entropy indicates better clustering quality; the best values are formatted in bold
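
The three measures in the table compare a clustering against reference labels. The sketch below shows one common way each could be computed; it is not taken from the paper, whose exact definitions may differ. It assumes the pairwise Rand Index, a cluster-size-weighted class entropy normalized by the log of the number of reference classes, and a class-size-weighted best-match F-Score. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(truth, pred):
    """Fraction of item pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    if not pairs:  # fewer than two items: nothing to disagree on
        return 1.0
    agree = sum((truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs)
    return agree / len(pairs)


def clustering_entropy(truth, pred):
    """Size-weighted class entropy per cluster, scaled to [0, 1] (assumed form)."""
    n_classes = len(set(truth))
    if n_classes < 2:
        return 0.0
    total = 0.0
    for cluster in set(pred):
        members = [t for t, p in zip(truth, pred) if p == cluster]
        counts = Counter(members)
        h = -sum((c / len(members)) * log(c / len(members)) for c in counts.values())
        total += (len(members) / len(truth)) * h
    return total / log(n_classes)


def f_score(truth, pred):
    """Class-size-weighted best F1 between each reference class and any cluster."""
    joint = Counter(zip(truth, pred))
    class_sizes = Counter(truth)
    cluster_sizes = Counter(pred)
    score = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            overlap = joint[(cls, clu)]
            if overlap:
                precision = overlap / n_clu
                recall = overlap / n_cls
                best = max(best, 2 * precision * recall / (precision + recall))
        score += (n_cls / len(truth)) * best
    return score


# Toy example (hypothetical labels): four items, two reference classes.
truth = ["a", "a", "b", "b"]
pred = [0, 0, 0, 1]
print(rand_index(truth, pred), clustering_entropy(truth, pred), f_score(truth, pred))
```

Under these definitions a clustering that matches the reference exactly gives a Rand Index and F-Score of 1 and an entropy of 0, which is consistent with the direction of "better" stated in the table note.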