From: Improving a gold standard: treating human relevance judgments of MEDLINE document pairs
M 4 vs M 5

Judge | M 4 | M 5 | = |
---|---|---|---|
0 | 1992 | 3008* | 0 |
1 | 2546 | 2454* | 0 |
2 | 2864* | 2136 | 0 |
3 | 2598 | 2402* | 0 |
4 | 2148 | 2851* | 1 |
5 | 2247 | 2753* | 0 |
6 | 2527 | 2473* | 0 |
7 | 3392* | 1608 | 0 |
8 | 3798* | 1202 | 0 |
9 | 2676 | 2324* | 0 |
10 | 2802* | 2198 | 0 |
11 | 2084 | 2916* | 0 |
12 | 2938* | 2062 | 0 |
Total | 34612 | 30387 | 1 |
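
The Total row can be checked by summing each column over the 13 judges. The sketch below is a minimal arithmetic check, not part of the original article: the counts are copied from the table with the asterisk markers dropped, and the names `rows`, `column_totals`, and `row_sums` are hypothetical helpers introduced only for illustration.

```python
# Per-judge counts copied from the table as (M 4, M 5, =).
# The asterisks shown in the table are omitted; only the numbers are used.
rows = {
    0: (1992, 3008, 0),
    1: (2546, 2454, 0),
    2: (2864, 2136, 0),
    3: (2598, 2402, 0),
    4: (2148, 2851, 1),
    5: (2247, 2753, 0),
    6: (2527, 2473, 0),
    7: (3392, 1608, 0),
    8: (3798, 1202, 0),
    9: (2676, 2324, 0),
    10: (2802, 2198, 0),
    11: (2084, 2916, 0),
    12: (2938, 2062, 0),
}


def column_totals(rows):
    """Sum each of the three count columns across all judges."""
    return tuple(sum(column) for column in zip(*rows.values()))


def row_sums(rows):
    """Sum the three counts for each judge."""
    return {judge: sum(counts) for judge, counts in rows.items()}


print(column_totals(rows))          # (34612, 30387, 1)
print(set(row_sums(rows).values()))  # {5000}
```

The column sums match the Total row, and every judge's three counts add up to 5000, so the 13 rows account for 65000 comparisons in all.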