Table 3. Log of probability measures for test-set-optimized single parameters. The best performance in each row (the highest, i.e. least negative, value) is marked with an asterisk.

From: Improving a gold standard: treating human relevance judgments of MEDLINE document pairs

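| Judge | M1 | M4 | M5 |
| --- | --- | --- | --- |
| 0 | -8902 | -8425 | -7805* |
| 1 | -7087 | -6925 | -6833* |
| 2 | -6872 | -6641* | -6674 |
| 3 | -6760 | -6675 | -6462* |
| 4 | -7694* | -8068 | -7703 |
| 5 | -7121 | -7259 | -6942* |
| 6 | -7032 | -7015 | -6913* |
| 7 | -7094 | -6482* | -6836 |
| 8 | -7352 | -6487* | -7199 |
| 9 | -7113 | -6992 | -6874* |
| 10 | -8041 | -7576 | -7442* |
| 11 | -7275 | -7450 | -6909* |
| 12 | -8160 | -7694* | -7784 |
| Ave | -7423 | -7207 | -7106 |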
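The "Ave" row and the asterisk marks can be re-derived from the per-judge values. The following is a minimal sketch in Python, assuming the values are transcribed as shown in the table; the names `LOG_PROB`, `MEASURES`, `column_averages`, and `best_measure_per_judge` are hypothetical and do not come from the article.

```python
# Values transcribed from Table 3: one (M1, M4, M5) tuple per judge 0..12.
# All names in this sketch are illustrative, not taken from the article.
LOG_PROB = [
    (-8902, -8425, -7805),
    (-7087, -6925, -6833),
    (-6872, -6641, -6674),
    (-6760, -6675, -6462),
    (-7694, -8068, -7703),
    (-7121, -7259, -6942),
    (-7032, -7015, -6913),
    (-7094, -6482, -6836),
    (-7352, -6487, -7199),
    (-7113, -6992, -6874),
    (-8041, -7576, -7442),
    (-7275, -7450, -6909),
    (-8160, -7694, -7784),
]
MEASURES = ("M1", "M4", "M5")


def column_averages(rows):
    """Mean of each column, rounded to the nearest integer."""
    n = len(rows)
    return [round(sum(col) / n) for col in zip(*rows)]


def best_measure_per_judge(rows):
    """Name of the measure with the highest (least negative) value in each row."""
    return [MEASURES[max(range(len(row)), key=row.__getitem__)] for row in rows]


averages = column_averages(LOG_PROB)
best = best_measure_per_judge(LOG_PROB)
wins = {m: best.count(m) for m in MEASURES}

print(averages)  # [-7423, -7207, -7106]
print(wins)      # {'M1': 1, 'M4': 4, 'M5': 8}
```

Under these assumptions the recomputed averages match the "Ave" row, and the per-row maxima fall on the asterisked entries: M5 is best for eight judges, M4 for four, and M1 for one.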