Table 2 K-means clustering accuracy and running time of Salmonella sequence dataset

From: A heuristic approach to determine an appropriate number of topics in topic modeling

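| T | 5 | 10 | 20 | 30 | 40 | 50 |
|---|---|---|---|---|---|---|
| Purity** (k = 10) | 0.95 | 0.93 | 0.96 | 0.96 | 0.93 | 0.93 |
| Time (ms) | 33,914 | 34,584 | 34,824 | 35,478 | 35,636 | 35,816 |
| T | 60 | 70 | 80 | 90 | 100 | |
| Purity (k = 10) | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | |
| Time (ms) | 36,143 | 36,365 | 36,517 | 36,636 | 36,969 | |
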
**Purity of each cluster is calculated as the ratio of correctly classified strains to the strains in the cluster (119 strains in total). The ratios in the table are the average purities of the k clusters obtained for each topic model.
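
The purity measure described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "correctly classified" means a strain carries the majority label of its cluster, and the function name `average_cluster_purity` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def average_cluster_purity(cluster_ids, true_labels):
    """Unweighted mean, over clusters, of (majority-label count / cluster size).

    Assumption: a strain counts as correctly classified when its true label
    is the most common label in the cluster it was assigned to.
    """
    # Group the true labels by assigned cluster.
    members = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, true_labels):
        members[cluster_id].append(label)

    # Purity of one cluster = share of its members with the majority label.
    purities = []
    for labels in members.values():
        majority_count = Counter(labels).most_common(1)[0][1]
        purities.append(majority_count / len(labels))

    # Average the per-cluster purities, as the footnote describes.
    return sum(purities) / len(purities)


# Toy data for illustration only (not taken from the Salmonella dataset).
toy_clusters = [0, 0, 0, 1, 1, 2, 2, 2, 2]
toy_labels = ["a", "a", "b", "b", "b", "c", "c", "c", "a"]
toy_purity = average_cluster_purity(toy_clusters, toy_labels)
```

In the toy data the three clusters have purities 2/3, 1 and 3/4, so `toy_purity` is their plain average. Each cluster contributes equally regardless of its size, which matches an average over k clusters rather than a size-weighted purity.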