Table 2 Comparison of results for varying numbers of inferred topics

From: Inferring functional modules of protein families with probabilistic topic models

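| k | Mean # OGs associated with the modules (average) | Mean # OGs associated with the modules (s.d.) | Mean module size (average) | Mean module size (s.d.) | Mean coverage for k modules (average) | Mean coverage for k modules (s.d.) | Stable modules with ≥5 OGs and coverage ≥50% |
|---|---|---|---|---|---|---|---|
| 100 | 635 | 28.6 | 13.82 | 0.27 | 64% | 1% | 33 (33%) |
| 200 | 1560 | 24.9 | 19.47 | 0.11 | 58% | 1% | 66 (33%) |
| 300 | 2009 | 31.1 | 21.76 | 0.9 | 56.7% | 0.6% | 68 (22.7%) |
| 400 | 2223 | 33.9 | 22.9 | 0.42 | 53% | 1% | 102 (25.5%) |
| 500 | 2378 | 7 | 22.34 | 0.19 | 49% | 0% | 97 (19.4%) |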

  1. We performed five experiments, one for each setting of k. The numbers reported for each experiment are mean values averaged over three runs. For each setting of k, we used a greedy approach to track module identities across the runs, and we refer to the modules identified in all three runs as the stable modules. The last column gives the number of stable modules with sufficient evidence for functional coherence based on the STRING analysis (in parentheses, that number as a percentage of k).
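
The footnote does not spell out how the greedy tracking works, so the following is only a minimal sketch of one way it could be done, not the authors' procedure. It assumes that each module is a set of OG identifiers, that similarity between modules is measured by the Jaccard index, and that a match requires a similarity of at least 0.5. The function names (`jaccard`, `greedy_match`, `stable_modules`), the threshold, and the toy data are all hypothetical. The additional filtering by module size and STRING-based coverage used for the last column is not covered here.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_match(ref_modules, other_modules, min_similarity=0.5):
    """Greedily pair each reference module with at most one module of another run.

    All pairs are ranked by similarity, and the best remaining pair is accepted
    as long as neither module is already paired. Returns {ref_index: other_index}.
    The similarity measure and threshold are assumptions, not from the source.
    """
    scored = [
        (jaccard(ref_modules[i], other_modules[j]), i, j)
        for i, j in product(range(len(ref_modules)), range(len(other_modules)))
    ]
    # Highest similarity first; ties are broken by index for determinism.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    matches, used_other = {}, set()
    for sim, i, j in scored:
        if sim < min_similarity:
            break  # every remaining pair is below the threshold
        if i not in matches and j not in used_other:
            matches[i] = j
            used_other.add(j)
    return matches


def stable_modules(runs, min_similarity=0.5):
    """Indices of modules in runs[0] that are matched in every other run."""
    stable = set(range(len(runs[0])))
    for other in runs[1:]:
        stable &= set(greedy_match(runs[0], other, min_similarity))
    return sorted(stable)


# Toy data: three runs with k = 3 modules each (OG identifiers are made up).
run1 = [{"OG1", "OG2", "OG3"}, {"OG4", "OG5"}, {"OG6", "OG7"}]
run2 = [{"OG4", "OG5"}, {"OG1", "OG2", "OG3", "OG8"}, {"OG9"}]
run3 = [{"OG1", "OG2"}, {"OG10"}, {"OG4", "OG5", "OG6"}]

stable = stable_modules([run1, run2, run3])
fraction_of_k = len(stable) / len(run1)  # analogous to the value in parentheses
print(stable, fraction_of_k)
```

In this toy example the first two modules of `run1` are recovered in both other runs, so two of the k = 3 modules count as stable. Using the first run as the reference is itself a design choice; a different reference run or similarity threshold could change which modules are reported as stable.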