GOTA: GO term annotation of biomedical literature

BMC Bioinformatics

Table 1 Performance on a test set of 15,000 publications

Method^a	Info^b	IT^c		CAFA^c	BC^c	TREC^c
		iP₁	iR₁₀	hF_max	hR₁₀	MRR₁₀	R₁₀
GOTA	PM	0.43	0.64	0.43	0.69	0.40	0.46
GOTA	T+A	0.42	0.64	0.43	0.68	0.39	0.45
GOTA	T	0.41	0.63	0.42	0.68	0.39	0.44
RandFR	N/A	0.20	0.33	0.20	0.33	0.18	0.15
RandIC	N/A	0.21	0.27	0.18	0.31	0.03	0.08
GOTA Φ_P	PM	0.37	0.64	0.41	0.67	0.38	0.44
GOTA Φ_P	T+A	0.35	0.62	0.40	0.66	0.36	0.41
GOTA Φ_P	T	0.35	0.62	0.40	0.66	0.36	0.41
GOTA Φ_T	PM	0.28	0.41	0.30	0.49	0.16	0.17
GOTA Φ_T	T+A	0.24	0.37	0.27	0.46	0.11	0.12
GOTA Φ_T	T	0.22	0.35	0.26	0.44	0.09	0.10

^aMethod used for the classification. RandFR and RandIC are baseline predictors based on the distribution of GO terms in the training set.
^bInformation used in prediction: PM = title, abstract, references and publication year (PubMed); T+A = title and abstract; T = title; N/A = no information.
^cMetric definitions are given in the “Evaluation metrics” section. In the top section of the table, the best result for each metric is highlighted in italics.

ISSN: 1471-2105