Table 2 Comparison of results for varying numbers of inferred topics

From: Inferring functional modules of protein families with probabilistic topic models

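| k | Mean # OGs associated with the modules (average) | Mean # OGs associated with the modules (s.d.) | Mean module size (average) | Mean module size (s.d.) | Mean coverage for k modules (average) | Mean coverage for k modules (s.d.) | Stable modules with ≥5 OGs and coverage ≥50% |
|---|---|---|---|---|---|---|---|
| 100 | 635 | 28.6 | 13.82 | 0.27 | 64% | 1% | 33 (33%) |
| 200 | 1560 | 24.9 | 19.47 | 0.11 | 58% | 1% | 66 (33%) |
| 300 | 2009 | 31.1 | 21.76 | 0.9 | 56.7% | 0.6% | 68 (22.7%) |
| 400 | 2223 | 33.9 | 22.9 | 0.42 | 53% | 1% | 102 (25.5%) |
| 500 | 2378 | 7 | 22.34 | 0.19 | 49% | 0% | 97 (19.4%) |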
We performed five experiments to test different settings of k. The numbers reported for each experiment are means averaged over three runs. We used a greedy approach to track module identities across the runs for each setting of k, and refer to the modules identified in all three runs as the stable modules. The last column gives the number of stable modules with sufficient evidence for functional coherence based on the STRING analysis; the value in parentheses is the fraction of k that these modules represent.
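
The footnote does not spell out how the greedy tracking works, so the following is only a minimal sketch of one way such a procedure could look, not the authors' implementation. It assumes that a module is a set of OG identifiers, that similarity is measured by Jaccard overlap, that a hypothetical threshold `min_sim` decides whether two modules count as the same, and that the first run serves as the reference. The names `jaccard`, `greedy_match`, `stable_modules` and `stable_fraction` are illustrative.

```python
from typing import Dict, FrozenSet, List

Module = FrozenSet[str]  # a module represented as a set of OG identifiers


def jaccard(a: Module, b: Module) -> float:
    """Jaccard overlap between two modules (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_match(run_a: List[Module], run_b: List[Module],
                 min_sim: float = 0.5) -> Dict[int, int]:
    """Greedily pair modules of run_a with modules of run_b.

    All pairs are ranked by similarity. The best remaining pair is accepted
    if neither module is already paired and the similarity reaches min_sim
    (a hypothetical threshold, not taken from the source).
    """
    pairs = sorted(
        ((jaccard(a, b), i, j)
         for i, a in enumerate(run_a)
         for j, b in enumerate(run_b)),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    matched: Dict[int, int] = {}
    used_b = set()
    for sim, i, j in pairs:
        if sim < min_sim:
            break  # every remaining pair is less similar than the threshold
        if i in matched or j in used_b:
            continue
        matched[i] = j
        used_b.add(j)
    return matched


def stable_modules(runs: List[List[Module]], min_sim: float = 0.5) -> List[int]:
    """Indices of modules in runs[0] that are matched in every other run."""
    stable = set(range(len(runs[0])))
    for other in runs[1:]:
        stable &= set(greedy_match(runs[0], other, min_sim))
    return sorted(stable)


def stable_fraction(n_stable: int, k: int) -> float:
    """Fraction of stable modules relative to the number of topics k."""
    return n_stable / k


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    run1 = [frozenset({"OG1", "OG2", "OG3"}), frozenset({"OG4", "OG5"})]
    run2 = [frozenset({"OG4", "OG5"}), frozenset({"OG1", "OG2"})]
    run3 = [frozenset({"OG1", "OG2", "OG3"}), frozenset({"OG9"})]
    print(stable_modules([run1, run2, run3]))
    # Parenthesised value in the table row for k = 300: 68 stable modules.
    print(round(100 * stable_fraction(68, 300), 1))
```

The `stable_fraction` helper corresponds to the values in parentheses in the last column, for example 68 of 300 modules for k = 300. Anchoring on the first run keeps the sketch short; a pairwise or consensus scheme would be an equally plausible reading of "greedy approach".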