
Table 4 Correlation with human judgement obtained for various combinations of vector extraction variant, trimming threshold and corpus.

From: Calculating semantic relatedness for biomedical use in a knowledge-poor environment

| Parameters (Extr.; c; Corpus) | residents-s Max (N) | residents-s Avg | residents-r Max (N) | residents-r Avg | 29p Max (N) | 29p Avg | 29c Max (N) | 29c Avg | 101c Max (N) | 101c Avg |
|---|---|---|---|---|---|---|---|---|---|---|
| T-GSP; c = 0.9; F | 0.54 (38) | 0.53 | 0.51 (27) | 0.5 | 0.68 (57) | 0.63 | 0.79 (31) | 0.74 | 0.58 (39) | 0.55 |
| T-GSP; c = 0.8; F | 0.54 (38) | 0.53 | 0.51 (26) | 0.5 | 0.68 (57) | 0.62 | 0.77 (47) | 0.73 | 0.58 (39) | 0.55 |
| T-GSP; c = 0.7; F | 0.54 (58) | 0.52 | 0.52 (38) | 0.51 | 0.68 (56) | 0.63 | 0.79 (32) | 0.74 | 0.58 (40) | 0.55 |
| T-GSP; c = 0.6; F | 0.52 (58) | 0.51 | 0.5 (37) | 0.49 | 0.66 (60) | 0.63 | 0.78 (24) | 0.74 | 0.59 (38) | 0.56 |
| T-GSP; c = 0.5; F | 0.53 (38) | 0.52 | 0.5 (37) | 0.49 | 0.69 (47) | 0.64 | 0.79 (33) | 0.74 | 0.55 (29) | 0.53 |
| T-GSP; c = 0.4; F | 0.52 (38) | 0.51 | 0.5 (26) | 0.49 | 0.65 (57) | 0.6 | 0.75 (25) | 0.7 | 0.56 (27) | 0.52 |
| T-GSP; c = 0.3; F | 0.49 (41) | 0.47 | 0.47 (55) | 0.45 | 0.65 (57) | 0.6 | 0.75 (29) | 0.7 | 0.53 (31) | 0.5 |
| T-GSP; c = 0.2; F | 0.47 (52) | 0.44 | 0.44 (58) | 0.4 | 0.64 (17) | 0.58 | 0.77 (32) | 0.71 | 0.51 (23) | 0.48 |
| T-GSP; c = 0.1; F | 0.42 (37) | 0.38 | 0.38 (57) | 0.34 | 0.61 (25) | 0.47 | 0.75 (26) | 0.6 | 0.42 (34) | 0.36 |
| T-GSP; -; F | 0.54 | 0.49 | 0.52 | 0.46 | 0.69 | 0.6 | 0.79 | 0.71 | 0.59 | 0.51 |
| No T-GSP; c = 0.4; F | 0.56 (34) | 0.55 | 0.52 (26) | 0.51 | 0.73 (47) | 0.69 | 0.82 (32) | 0.77 | 0.6 (42) | 0.56 |
| No T-GSP; c = 0.3; F | 0.56 (38) | 0.55 | 0.53 (38) | 0.51 | 0.73 (47) | 0.69 | 0.82 (32) | 0.78 | 0.59 (42) | 0.56 |
| No T-GSP; c = 0.35; F | 0.56 (34) | 0.55 | 0.53 (26) | 0.51 | 0.73 (47) | 0.69 | 0.81 (31) | 0.78 | 0.6 (42) | 0.56 |
| No T-GSP; c = 0.2; F | 0.57 (58) | 0.55 | 0.53 (38) | 0.52 | 0.74 (60) | 0.7 | 0.84 (32) | 0.79 | 0.59 (39) | 0.56 |
| No T-GSP; c = 0.25; F | 0.56 (58) | 0.55 | 0.53 (38) | 0.52 | 0.74 (54) | 0.69 | 0.82 (33) | 0.78 | 0.59 (42) | 0.56 |
| No T-GSP; c = 0.1; F | 0.57 (58) | 0.55 | 0.54 (38) | 0.52 | 0.75 (59) | 0.71 | 0.85 (31) | 0.8 | 0.59 (36) | 0.55 |
| No T-GSP; c = 0.15; F | 0.57 (58) | 0.55 | 0.53 (38) | 0.52 | 0.75 (54) | 0.7 | 0.84 (32) | 0.79 | 0.59 (43) | 0.56 |
| No T-GSP; c = 0.05; F | 0.58 (58) | 0.55 | 0.54 (58) | 0.52 | 0.76 (59) | 0.7 | 0.85 (33) | 0.8 | 0.59 (39) | 0.55 |
| No T-GSP; -; F | 0.58 | 0.55 | 0.54 | 0.52 | 0.76 | 0.7 | 0.85 | 0.79 | 0.6 | 0.56 |
| -; -; F | 0.58 | 0.52 | 0.54 | 0.49 | 0.76 | 0.64 | 0.85 | 0.75 | 0.6 | 0.53 |
| T-GSP; c = 0.9; A | 0.37 (53) | 0.35 | 0.4 (59) | 0.38 | 0.66 (26) | 0.63 | 0.82 (55) | 0.77 | 0.51 (42) | 0.48 |
| T-GSP; c = 0.8; A | 0.37 (53) | 0.35 | 0.4 (59) | 0.38 | 0.66 (26) | 0.63 | 0.82 (55) | 0.77 | 0.51 (42) | 0.48 |
| T-GSP; c = 0.7; A | 0.37 (53) | 0.35 | 0.4 (59) | 0.38 | 0.66 (24) | 0.63 | 0.82 (53) | 0.78 | 0.51 (42) | 0.48 |
| T-GSP; c = 0.6; A | 0.36 (55) | 0.35 | 0.39 (59) | 0.37 | 0.66 (25) | 0.63 | 0.82 (24) | 0.77 | 0.52 (41) | 0.49 |
| T-GSP; c = 0.5; A | 0.35 (56) | 0.34 | 0.39 (59) | 0.37 | 0.66 (24) | 0.62 | 0.8 (24) | 0.75 | 0.51 (41) | 0.48 |
| T-GSP; c = 0.4; A | 0.35 (52) | 0.34 | 0.38 (56) | 0.36 | 0.67 (52) | 0.63 | 0.8 (57) | 0.75 | 0.46 (48) | 0.43 |
| T-GSP; c = 0.3; A | 0.32 (57) | 0.31 | 0.37 (58) | 0.35 | 0.68 (52) | 0.61 | 0.82 (56) | 0.72 | 0.45 (51) | 0.42 |
| T-GSP; c = 0.2; A | 0.32 (57) | 0.31 | 0.35 (57) | 0.33 | 0.6 (50) | 0.46 | 0.74 (26) | 0.62 | 0.46 (41) | 0.39 |
| T-GSP; c = 0.1; A | 0.28 (57) | 0.25 | 0.28 (60) | 0.24 | 0.62 (49) | 0.47 | 0.7 (48) | 0.57 | 0.38 (52) | 0.26 |
| T-GSP; -; A | 0.37 | 0.33 | 0.4 | 0.35 | 0.68 | 0.59 | 0.82 | 0.72 | 0.52 | 0.43 |
| No T-GSP; c = 0.4; A | 0.4 (48) | 0.38 | 0.42 (52) | 0.4 | 0.72 (50) | 0.65 | 0.83 (50) | 0.78 | 0.56 (36) | 0.52 |
| No T-GSP; c = 0.3; A | 0.4 (52) | 0.38 | 0.42 (52) | 0.4 | 0.71 (50) | 0.64 | 0.83 (56) | 0.77 | 0.55 (36) | 0.51 |
| No T-GSP; c = 0.35; A | 0.4 (52) | 0.38 | 0.42 (53) | 0.4 | 0.72 (50) | 0.64 | 0.83 (50) | 0.77 | 0.55 (36) | 0.51 |
| No T-GSP; c = 0.2; A | 0.39 (53) | 0.37 | 0.42 (53) | 0.39 | 0.69 (50) | 0.6 | 0.84 (32) | 0.75 | 0.56 (42) | 0.51 |
| No T-GSP; c = 0.25; A | 0.39 (52) | 0.37 | 0.42 (53) | 0.39 | 0.71 (49) | 0.64 | 0.84 (31) | 0.77 | 0.55 (42) | 0.51 |
| No T-GSP; c = 0.1; A | 0.38 (57) | 0.36 | 0.41 (60) | 0.38 | 0.68 (60) | 0.57 | 0.82 (58) | 0.71 | 0.55 (53) | 0.49 |
| No T-GSP; c = 0.15; A | 0.38 (53) | 0.36 | 0.41 (53) | 0.38 | 0.69 (60) | 0.59 | 0.83 (32) | 0.74 | 0.55 (43) | 0.49 |
| No T-GSP; c = 0.05; A | 0.37 (34) | 0.34 | 0.41 (34) | 0.37 | 0.69 (60) | 0.5 | 0.83 (58) | 0.66 | 0.52 (59) | 0.43 |
| No T-GSP; -; A | 0.4 | 0.37 | 0.42 | 0.39 | 0.72 | 0.6 | 0.84 | 0.74 | 0.56 | 0.49 |
| -; -; A | 0.4 | 0.35 | 0.42 | 0.37 | 0.72 | 0.6 | 0.84 | 0.73 | 0.56 | 0.46 |
  1. The table presents both the average and the best result obtained for each combination of parameters. For the best results, the corresponding value of N (the number of aggregated documents) is given in parentheses. Parameter c is the trimming threshold; F denotes 'Full corpus' and A denotes 'Abstracts only'. '-' denotes an aggregation (max/avg) over all values of that parameter.
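
As a rough illustration of how one "Max (N)" and "Avg" pair might be derived for a single parameter combination, the sketch below assumes the correlation with human judgement has already been measured for several values of N. The function name `summarize` and the sample numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def summarize(corr_by_n):
    """Return (best correlation, N at which it occurs, average correlation).

    corr_by_n maps N (number of aggregated documents) to a correlation value.
    If several N values tie for the best correlation, the first one in the
    mapping's iteration order is reported.
    """
    best_n = max(corr_by_n, key=corr_by_n.get)
    return corr_by_n[best_n], best_n, mean(corr_by_n.values())


# Hypothetical correlations for one (extraction variant, c, corpus) setting.
example = {10: 0.50, 20: 0.52, 38: 0.54, 60: 0.53}

best, best_n, avg = summarize(example)
# Same layout as a table cell pair: "Max (N)" followed by "Avg".
print(f"{best:.2f} ({best_n}) {avg:.2f}")  # prints: 0.54 (38) 0.52
```

Rows marked with '-' would apply the same max/avg step across the values of the aggregated parameter rather than across N alone; how exactly the paper combines them is not stated here, so that part is left out of the sketch.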