
Table 2 Different criteria for filtering clusters for function prediction

From: Mining phenotypes for gene function prediction

 

| | Filter 1 | Filter 1 & Filter 2 | Filter 1 & Filter 3 | Filter 1 & Filter 4 | Filter 1 & Filter 5 |
|---|---|---|---|---|---|
| # of groups | 196 | 74 | 53 | 185 | 11 |
| # of terms | 345 | 159 | 102 | 338 | 16 |
| # of genes | 3213 | 711 | 409 | 2895 | 320 |
| Precision | 67.91% | 62.52% | 60.52% | 67.73% | 64.70% |
| Recall | 22.98% | 26.16% | 19.78% | 23.80% | 11.21% |

In order to push the values for precision and recall towards the precision ceiling, we sought filter criteria for selecting appropriate gene groups a priori. To this end, we defined the following filter criteria for our 1,000 'phenoclusters':

  1. Filter 1: Removes groups with fewer than 3 genes, or with no GO term associated with at least 50% of the genes.
  2. Filter 2: Removes groups with a GO-similarity score < 0.4.
  3. Filter 3: Removes groups with a PPi-connectedness < 33%.
  4. Filter 4: Removes all non-single-species clusters.
  5. Filter 5: Removes all single-species clusters.
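
The filter criteria above can be read as predicates that are combined with Filter 1, as in the table columns. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Cluster` record, its field names, the helper `apply_filters`, and the demo clusters are all hypothetical, and Filter 1 is assumed to remove a group when either of its two conditions holds.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one 'phenocluster' (field names are assumptions)."""
    name: str
    genes: frozenset          # gene identifiers in the group
    top_go_coverage: float    # fraction of genes sharing the best-covered GO term (0..1)
    go_similarity: float      # GO-similarity score of the group
    ppi_connectedness: float  # PPi-connectedness as a fraction (0..1)
    species: frozenset        # species represented in the group


Filter = Callable[[Cluster], bool]  # returns True when the cluster is kept


def keep_filter_1(c: Cluster) -> bool:
    # Assumption: a group is removed if it has fewer than 3 genes OR
    # no GO term covers at least 50% of its genes.
    return len(c.genes) >= 3 and c.top_go_coverage >= 0.5


def keep_filter_2(c: Cluster) -> bool:
    # Removes groups with a GO-similarity score < 0.4.
    return c.go_similarity >= 0.4


def keep_filter_3(c: Cluster) -> bool:
    # Removes groups with a PPi-connectedness < 33%.
    return c.ppi_connectedness >= 0.33


def keep_filter_4(c: Cluster) -> bool:
    # Removes all non-single-species clusters.
    return len(c.species) == 1


def keep_filter_5(c: Cluster) -> bool:
    # Removes all single-species clusters.
    return len(c.species) > 1


def apply_filters(clusters: Iterable[Cluster], *filters: Filter) -> List[Cluster]:
    """Keep only the clusters that pass every given filter."""
    return [c for c in clusters if all(f(c) for f in filters)]


# Made-up demo data, purely for illustration.
demo = [
    Cluster("c1", frozenset({"g1", "g2", "g3"}), 0.67, 0.55, 0.40,
            frozenset({"human"})),
    Cluster("c2", frozenset({"g4", "g5"}), 1.00, 0.90, 1.00,
            frozenset({"mouse"})),  # fewer than 3 genes
    Cluster("c3", frozenset({"g6", "g7", "g8", "g9"}), 0.50, 0.30, 0.25,
            frozenset({"human", "mouse"})),
]

f1 = apply_filters(demo, keep_filter_1)
f1_f2 = apply_filters(demo, keep_filter_1, keep_filter_2)
f1_f4 = apply_filters(demo, keep_filter_1, keep_filter_4)
f1_f5 = apply_filters(demo, keep_filter_1, keep_filter_5)
```

Because Filter 4 and Filter 5 are complementary, the groups kept by "Filter 1 & Filter 4" and by "Filter 1 & Filter 5" partition the groups kept by Filter 1 alone, which matches the group counts in the table (185 + 11 = 196).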