A detailed error analysis of 13 kernel methods for protein-protein interaction extraction

BMC Bioinformatics

Table 2 The distribution of pairs in each corpus by classification success level in the cross-learning setting

	AIMed					BioInfer					HPRD50					IEPA					LLL
Kernels correct	Total	T	F	T, %	F, %	Total	T	F	T, %	F, %	Total	T	F	T, %	F, %	Total	T	F	T, %	F, %	Total	T	F	T, %	F, %
0	41	0	41	0.0%	0.8%	319	319	0	12.6%	0.0%	1	0	1	0.0%	0.4%	9	9	0	2.7%	0.0%	3	3	0	1.8%	0.0%
1	73	6	67	0.6%	1.4%	362	362	0	14.3%	0.0%	4	2	2	1.2%	0.7%	19	17	2	5.1%	0.4%	5	4	1	2.4%	0.6%
2	199	26	173	2.6%	3.6%	322	312	10	12.3%	0.1%	7	3	4	1.8%	1.5%	33	32	1	9.6%	0.2%	10	9	1	5.5%	0.6%
3	315	39	276	3.9%	5.7%	303	280	23	11.0%	0.3%	23	10	13	6.1%	4.8%	38	36	2	10.7%	0.4%	19	19	0	11.6%	0.0%
4	489	71	418	7.1%	8.6%	321	260	61	10.3%	0.9%	27	15	12	9.2%	4.4%	48	45	3	13.4%	0.6%	25	25	0	15.2%	0.0%
5	606	84	522	8.4%	10.8%	355	239	116	9.4%	1.6%	27	15	12	9.2%	4.4%	44	32	12	9.6%	2.5%	25	20	5	12.2%	3.0%
6	547	94	453	9.4%	9.4%	400	208	192	8.2%	2.7%	41	22	19	13.5%	7.0%	51	34	17	10.1%	3.5%	26	18	8	11.0%	4.8%
7	725	136	589	13.6%	12.2%	432	190	242	7.5%	3.4%	43	18	25	11.0%	9.3%	63	32	31	9.6%	6.4%	20	7	13	4.3%	7.8%
8	721	132	589	13.2%	12.2%	586	146	440	5.8%	6.2%	52	17	35	10.4%	13.0%	69	35	34	10.4%	7.1%	34	18	16	11.0%	9.6%
9	767	110	657	11.0%	13.6%	737	95	642	3.7%	9.0%	61	18	43	11.0%	15.9%	107	36	71	10.7%	14.7%	34	19	15	11.6%	9.0%
10	574	118	456	11.8%	9.4%	1060	79	981	3.1%	13.8%	50	14	36	8.6%	13.3%	110	13	97	3.9%	20.1%	56	8	48	4.9%	28.9%
11	414	69	345	6.9%	7.1%	1906	29	1877	1.1%	26.3%	52	16	36	9.8%	13.3%	131	6	125	1.8%	25.9%	50	12	38	7.3%	22.9%
12	363	115	248	11.5%	5.1%	2563	15	2548	0.6%	35.7%	45	13	32	8.0%	11.9%	95	8	87	2.4%	18.0%	23	2	21	1.2%	12.7%

The distribution of pairs (Total, positive T, and negative F) according to the number of kernels that classify them correctly, given in the first column. Results are shown for each corpus separately; aggregated results are shown in Figure 2. All kernels except PT are considered (PT is extremely slow and provides below-average results).
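As an illustration of how such a distribution can be tabulated, the following is a minimal sketch, not the authors' code. It assumes each pair has a gold label and one prediction per kernel. The function names, the toy labels, and the three toy kernels are all hypothetical. The percentage columns are read here as shares of all positive (or all negative) pairs in the corpus, which agrees with the AIMed column: its T counts sum to 1000, and 6 of 1000 is the 0.6% shown for level 1.

```python
from collections import Counter


def success_level_distribution(gold, predictions):
    """Count pairs by the number of kernels that classify them correctly.

    gold: list of bool, True for a positive (interacting) pair.
    predictions: one list of bool per kernel, each aligned with gold.
    Returns {level: (total, positives, negatives)} for every level
    from 0 to the number of kernels.
    """
    pos, neg = Counter(), Counter()
    for i, label in enumerate(gold):
        # Success level = number of kernels whose prediction matches gold.
        level = sum(1 for kernel_pred in predictions if kernel_pred[i] == label)
        (pos if label else neg)[level] += 1
    return {k: (pos[k] + neg[k], pos[k], neg[k])
            for k in range(len(predictions) + 1)}


def as_percentages(dist):
    """Express T and F counts as shares of all positive / negative pairs."""
    n_pos = sum(t for _, t, _ in dist.values())
    n_neg = sum(f for _, _, f in dist.values())
    return {k: (100.0 * t / n_pos if n_pos else 0.0,
                100.0 * f / n_neg if n_neg else 0.0)
            for k, (_, t, f) in dist.items()}


# Toy data (hypothetical): four pairs, three kernels.
gold = [True, True, False, False]
predictions = [
    [True, False, False, True],   # kernel 1
    [True, False, False, False],  # kernel 2
    [True, True, True, False],    # kernel 3
]
dist = success_level_distribution(gold, predictions)
pct = as_percentages(dist)
```

With twelve kernels, as in the table, the levels would run from 0 to 12, one table row per level.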

ISSN: 1471-2105