GO for gene documents

BMC Bioinformatics

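Table 7 Performance by Number of Positives for Training. For each GO ontology (MF, molecular function; CC, cellular component; BP, biological process), the table gives the number of GO codes in each training-size bin and the corresponding F-score. The results run somewhat counter to expectation: with BP, for example, performance tends to drop as the number of positive example documents in the training set increases.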

Training size	MF # codes	MF F-score	CC # codes	CC F-score	BP # codes	BP F-score
5	2	0.25	34	0.4067	128	0.2695
6–10	3	0.0833	25	0.3650	65	0.3875
11–15	9	0.4373	7	0.4528	22	0.3716
16–20	37	0.5645	4	0.4550	15	0.3306
21–25	39	0.544	3	0.4762	9	0.2588
26–30	31	0.5566	4	0.3687	4	0.3007
31–35	6	0.4663	3	0.5651	8	0.3566
36–40	7	0.5275	0	0	6	0.3579
41–45	10	0.4124	1	0.2009	5	0.3484
46–50	11	0.4276	1	0.2861	2	0.2553
51–75	18	0.3912	2	0.3430	12	0.3060
76–100	12	0.3936	1	0.2681	6	0.2726
101–125	5	0.4273	2	0.4089	0	0
126–150	4	0.4767	2	0.3226	0	0
151–last	20	0.3511	4	0.4586	1	0.2822
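
The bins in this table cover very different numbers of GO codes, and bins with no codes appear to report an F-score of 0 only as a placeholder. One way to compare groups of bins is therefore to weight each bin's F-score by its code count. The following is a minimal sketch under that assumption, not the authors' method: the names `BP_ROWS` and `weighted_mean_fscore` are hypothetical, the weighting scheme is introduced here purely for illustration, and the values are transcribed from the BP columns of the table.

```python
# Rows transcribed from the BP columns of Table 7:
# (training-size bin, number of GO codes, F-score).
BP_ROWS = [
    ("5", 128, 0.2695),
    ("6–10", 65, 0.3875),
    ("11–15", 22, 0.3716),
    ("16–20", 15, 0.3306),
    ("21–25", 9, 0.2588),
    ("26–30", 4, 0.3007),
    ("31–35", 8, 0.3566),
    ("36–40", 6, 0.3579),
    ("41–45", 5, 0.3484),
    ("46–50", 2, 0.2553),
    ("51–75", 12, 0.3060),
    ("76–100", 6, 0.2726),
    ("101–125", 0, 0.0),
    ("126–150", 0, 0.0),
    ("151–last", 1, 0.2822),
]


def weighted_mean_fscore(rows):
    """Return the F-score averaged over bins, weighted by code count.

    Bins with zero codes carry zero weight, so their placeholder
    F-score of 0 does not pull the average down. Returns None when
    no bin contains any codes.
    """
    total_codes = sum(n_codes for _, n_codes, _ in rows)
    if total_codes == 0:
        return None
    weighted_sum = sum(n_codes * fscore for _, n_codes, fscore in rows)
    return weighted_sum / total_codes


if __name__ == "__main__":
    # Hypothetical split for illustration: bins up to 50 positives vs. the rest.
    small_bins = BP_ROWS[:10]
    large_bins = BP_ROWS[10:]
    print("BP, bins up to 50 positives:", weighted_mean_fscore(small_bins))
    print("BP, bins above 50 positives:", weighted_mean_fscore(large_bins))
```

The split point between "small" and "large" bins is arbitrary; any other grouping of rows can be passed to the same function.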

ISSN: 1471-2105