Table 4 Summary of Overall Experiment Results on MEDLINE Document Sets

From: A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method

 

| Measure | STC (word strings) | STC (concept strings) | K-means | Original Bisecting K-means [25] (Largest) | Original Bisecting K-means [25] (LOS) | CLUTO Bisecting K-means | CSUGAR |
|---|---|---|---|---|---|---|---|
| MI | μ: 0.429, σ: 0.238 | μ: 0.359, σ: 0.149 | μ: 0.128, σ: 0.148 | μ: 0.395, σ: 0.193 | μ: 0.161, σ: 0.139 | μ: 0.096, σ: 0.112 | μ: 0.053, σ: 0.031 |
| Purity | μ: 0.601, σ: 0.214 | μ: 0.731, σ: 0.098 | μ: 0.932, σ: 0.080 | μ: 0.666, σ: 0.154 | μ: 0.918, σ: 0.064 | μ: 0.944, σ: 0.056 | μ: 0.947, σ: 0.030 |
| F-measure | μ: 0.499, σ: 0.285 | μ: 0.512, σ: 0.198 | μ: 0.828, σ: 0.206 | μ: 0.532, σ: 0.236 | μ: 0.780, σ: 0.180 | μ: 0.880, σ: 0.139 | μ: 0.926, σ: 0.062 |

  1. LOS: the cluster with the least overall similarity is selected for bisection. Largest: the largest cluster is selected for bisection. MI: the smaller the value, the better the clustering quality. Purity and F-measure: the larger the value, the better the clustering quality.
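
The two selection rules named in the note can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the table. It assumes that documents are plain numeric vectors and that "overall similarity" means the mean pairwise cosine similarity within a cluster, self-pairs included; the source does not state the exact definition. The names `cosine`, `overall_similarity` and `select_cluster` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def overall_similarity(cluster):
    """Mean cosine similarity over all ordered pairs in the cluster.

    Assumption: self-pairs are included; the table note does not define this.
    """
    n = len(cluster)
    return sum(cosine(u, v) for u in cluster for v in cluster) / (n * n)


def select_cluster(clusters, rule):
    """Return the index of the cluster to bisect next under the given rule."""
    # Only clusters with at least two documents can be split.
    candidates = [i for i, c in enumerate(clusters) if len(c) >= 2]
    if not candidates:
        raise ValueError("no cluster has at least two documents")
    if rule == "Largest":
        return max(candidates, key=lambda i: len(clusters[i]))
    if rule == "LOS":
        return min(candidates, key=lambda i: overall_similarity(clusters[i]))
    raise ValueError(f"unknown rule: {rule!r}")


# Toy data: cluster 0 is larger and tight, cluster 1 is smaller and loose.
clusters = [
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.1]],
    [[1.0, 0.0], [0.0, 1.0]],
]
print(select_cluster(clusters, "Largest"))
print(select_cluster(clusters, "LOS"))
```

On this toy input the two rules disagree: Largest picks the three-document cluster, while LOS picks the two-document cluster whose members are orthogonal.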