Combining joint models for biomedical event extraction

BMC Bioinformatics

Table 1 Overall results

Results on the test sets of all tasks to which we submitted, for three models. We report recall, precision, and F₁ under the standard BioNLP approximate recursive metric. For the GE and ID datasets, the Stanford model used all four decoders with the reranker; for EPI, it used only the 1N decoder with the reranker. In all three domains, the stacked UMass←Stanford model (FAUST) used all four decoders from the Stanford model as inputs. The "FAUST (without novel)" variant is created by removing from the stacked output all events that occur in neither the UMass nor the Stanford model output (i.e., events that are novel to the stacked output).
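As a rough illustration of the "without novel" filtering described in the caption, the sketch below keeps only those stacked events that also appear in the UMass or Stanford output, and then computes precision, recall, and F₁. It rests on several assumptions that are not from the paper: events are represented as hashable tuples, the function names `drop_novel_events` and `precision_recall_f1` are hypothetical, and scoring uses exact set matching rather than the BioNLP approximate recursive metric.

```python
from typing import Hashable, Iterable, List, Tuple


def drop_novel_events(
    stacked: Iterable[Hashable],
    umass: Iterable[Hashable],
    stanford: Iterable[Hashable],
) -> List[Hashable]:
    """Keep stacked events that occur in at least one base model's output.

    Assumption: each event is a hashable value (e.g. a tuple), so that
    membership can be checked by equality.
    """
    seen_in_base_models = set(umass) | set(stanford)
    return [event for event in stacked if event in seen_in_base_models]


def precision_recall_f1(
    predicted: Iterable[Hashable], gold: Iterable[Hashable]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over sets of events.

    Assumption: exact matching is a simplification; the official metric
    is the BioNLP approximate recursive one, which is more lenient.
    """
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

Because the filter only ever removes events, it can lower recall but never raise it; comparing the filtered and unfiltered rows is therefore one way to gauge how much the events novel to the stacked output contribute.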

ISSN: 1471-2105