BMC Bioinformatics

Table 5. To determine which of two methods better predicts the individual class values assigned by a test judge, we apply the signed rank test. We also count the query-document pairs for which each method assigns the higher predicted probability to the judge's class value, along with ties. An asterisk marks the better result when the difference has a p-value less than 0.05 by the signed rank test. The optimal parameters are those from the single-parameter optimizations of Table 1.

From: Improving a gold standard: treating human relevance judgments of MEDLINE document pairs

Judge	M₄ vs M₅
	M₄	M₅	Ties (=)
0	1992	3008*	0
1	2546	2454*	0
2	2864*	2136	0
3	2598	2402*	0
4	2148	2851*	1
5	2247	2753*	0
6	2527	2473*	0
7	3392*	1608	0
8	3798*	1202	0
9	2676	2324*	0
10	2802*	2198	0
11	2084	2916*	0
12	2938*	2062	0
Total	34612	30387	1
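
The comparison described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that "signed rank test" means the two-sided Wilcoxon signed rank test, and it uses a normal approximation with a tie correction, which may differ from the procedure used in the article. The function names `count_wins` and `signed_rank_test` and the sample probabilities are hypothetical.

```python
import math


def count_wins(p_a, p_b):
    """Count pairs where method A gives the higher probability, where B does, and ties."""
    wins_a = sum(1 for a, b in zip(p_a, p_b) if a > b)
    wins_b = sum(1 for a, b in zip(p_a, p_b) if a < b)
    ties = len(p_a) - wins_a - wins_b
    return wins_a, wins_b, ties


def signed_rank_test(p_a, p_b):
    """Two-sided Wilcoxon signed rank test (normal approximation, assumed here).

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    tied absolute differences share their average rank.
    """
    diffs = [a - b for a, b in zip(p_a, p_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0

    # Rank the absolute differences, averaging ranks within tied groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the null distribution of W+.
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


# Made-up probabilities that two methods assign to one judge's class values.
p_m4 = [0.9, 0.8, 0.7, 0.6, 0.5]
p_m5 = [0.1, 0.9, 0.2, 0.3, 0.5]
print(count_wins(p_m4, p_m5))
print(signed_rank_test(p_m4, p_m5))
```

The two functions answer different questions. The counts only record which method is ahead on each pair, while the signed rank statistic also weights each pair by the size of the difference. Under that reading, the method with fewer wins can still be the one marked with an asterisk, as in the rows for judges 1, 3, 6 and 9.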


ISSN: 1471-2105
