Table 2 Statistics by the most frequent dependency and overlapped POS labels, sentence length (i.e. number of words in the sentence) and relative dependency distances i − j from a dependent w_i to its head w_j

From: From POS tagging to dependency parsing for biomedical event extraction

Dependency labels           | POS tags       | Length     | Distance
GENIA          | CRAFT      |                |            |
Type         % | Type     % | Type  % G  % C | Type     % | Type  % G  % C
---------------+------------+----------------+------------+---------------
advmod     2.3 | ADV    4.0 | CC    3.6  3.2 | GENIA      | <−5   4.1  3.9
amod       9.6 | AMOD   1.9 | CD    1.6  4.0 | 1-10   3.5 | −5    1.2  1.2
appos      1.2 | CONJ   3.6 | DT    7.6  6.6 | 11-20 31.0 | −4    2.1  2.1
aux        1.4 | COORD  3.2 | IN   12.9 11.3 | 21-30 35.7 | −3    4.4  3.2
auxpass    1.5 | DEP    1.0 | JJ   10.1  7.6 | 31-40 19.4 | −2   10.6  8.5
cc         3.5 | LOC    1.7 | NN   29.3 24.2 | 41-50  7.1 | −1   24.1 21.7
conj       3.9 | NMOD  33.7 | NNS   6.9  6.6 | >50    3.3 | 1    19.0 26.5
dep        2.1 | OBJ    2.8 | RB    2.5  2.4 |            | 2     9.4  9.8
det        7.2 | P     18.4 | TO    1.6  0.6 | CRAFT      | 3     6.3  5.9
dobj       3.1 | PMOD  10.6 | VB    1.1  1.1 | 1-10  17.8 | 4     4.0  3.4
mark       1.1 | PRD    0.9 | VBD   2.1  2.2 | 11-20 23.1 | 5     2.4  2.3
nn        11.6 | PRN    1.9 | VBG   1.0  1.1 | 21-30 25.2 | >5   12.3 11.6
nsubj      4.1 | ROOT   3.9 | VBN   3.1  3.8 | 31-40 17.5 | -       -    -
nsubjpass  1.4 | SBJ    4.9 | VBP   1.4  1.1 | 41-50  9.3 | -       -    -
num        1.2 | SUB    0.9 | VBZ   1.9  1.4 | >50    7.1 | -       -    -
pobj      12.2 | TMP    0.9 | -       -    - | -        - | -       -    -
prep      12.3 | VC     2.4 | -       -    - | -        - | -       -    -
punct     10.4 | -        - | -       -    - | -        - | -       -    -
root       3.8 | -        - | -       -    - | -        - | -       -    -
Note: % G and % C denote the occurrence proportions in GENIA and CRAFT, respectively.
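
To make the Distance columns concrete, the following is a minimal sketch of how such a distribution could be computed. It assumes the relative distance is i − j for a dependent at position i with its head at position j (so a negative value means the dependent precedes its head), that each sentence is given as a list of 1-based head indices with 0 marking the root, and that root tokens are skipped. The function names and the input layout are hypothetical, and the source does not state whether root arcs were counted, so the percentages need not match the table exactly.

```python
from collections import Counter


def distance_bin(d: int) -> str:
    """Map a relative distance to the table's bins: <-5, -5..-1, 1..5, >5.

    ASCII hyphens are used here in place of the table's minus signs.
    """
    if d < -5:
        return "<-5"
    if d > 5:
        return ">5"
    return str(d)


def distance_distribution(sentences):
    """Return {bin label: percentage} over all non-root dependency arcs.

    sentences: iterable of head lists, where heads[k] is the 1-based
    position of the head of word k + 1, and 0 marks the root.
    """
    counts = Counter()
    for heads in sentences:
        for i, j in enumerate(heads, start=1):
            if j == 0:
                continue  # assumption: the root has no head word, so skip it
            counts[distance_bin(i - j)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy sentence of three words whose second word is the root.
toy = [[2, 0, 2]]
print(distance_distribution(toy))
```

In the toy sentence, the first word depends on the second (distance 1 − 2 = −1) and the third word also depends on the second (distance 3 − 2 = 1), so each bin receives half of the arcs.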