Adaptable, high recall, event extraction system with minimal configuration
© Miwa and Ananiadou; licensee BioMed Central Ltd. 2015
Published: 23 June 2015
Biomedical event extraction has been a major focus of biomedical natural language processing (BioNLP) research since the first BioNLP shared task was held in 2009. Accordingly, a large number of event extraction systems have been developed. Most such systems, however, have been developed for specific tasks and/or incorporate task-specific settings, making their application to new corpora and tasks problematic without modification of the systems themselves. There is thus a need for event extraction systems that can achieve competitive levels of accuracy when applied to corpora in new domains, without the need for exhaustive tuning or modification.
We have enhanced our state-of-the-art event extraction system, EventMine, to alleviate the need for task-specific tuning. Task-specific details are specified in a configuration file, while extensive task-specific parameter tuning is avoided through the integration of a weighting method, a covariate shift method, and their combination. The task-specific configuration and weighting method have been employed within the context of two different sub-tasks of BioNLP shared task 2013, i.e. Cancer Genetics (CG) and Pathway Curation (PC), removing the need to modify the system specifically for each task. With minimal task-specific configuration and tuning, EventMine achieved 1st place in the PC task and 2nd place in the CG task, with the highest recall in both tasks. The system has been further enhanced following the shared task by incorporating the covariate shift method and entity generalisations based on the task definitions, leading to further performance improvements.
We have shown that it is possible to apply a state-of-the-art event extraction system to new tasks with high levels of performance, without having to modify the system internally. Both covariate shift and weighting methods are useful in facilitating the production of high recall systems. These methods and their combination can adapt a model to the target data with no deep tuning and little manual configuration.
The use of these resources and applications has revealed that event extraction systems need to discover a wider range of event types and structures than those addressed in the earlier BioNLP shared tasks, so that the extracted events cover as many as possible of the phenomena described in biomedical articles. The BioNLP Shared Task 2013 (BioNLP-ST 2013) introduced several tasks to address this problem. In particular, the Cancer Genetics (CG) and Pathway Curation (PC) tasks defined a range of new entity and event types in the context of biomedical problems that had not previously been dealt with in the shared tasks. Given that both the CG and PC tasks cover a greater number of bio-entity, role and event types than previous tasks, e.g., GENIA, event recognition becomes increasingly difficult, since systems need to extract correct event types and structures from a larger number of possible types and structures. Although both the representation and format of events are shared among many of the BioNLP subtasks, most event extraction systems participating in the BioNLP shared tasks have focussed only on a limited number of specific subtasks. This is largely due to the difficulty of applying a specific event extraction system to different tasks without carrying out considerable modifications to the system. It can thus be very costly and time-consuming to tune an event extraction system to deal with new event types. Accordingly, we have developed a novel method which allows an event extraction system to be applied to new tasks with competitive levels of performance, but without the onerous effort of development and tuning.
In this paper, we describe the integration of our novel method within our state-of-the-art event extraction system, EventMine, allowing it to be adapted to new tasks with only minimal effort, without internal modification and without sacrificing performance. EventMine has been enhanced with configurability, which allows it to be applied to new tasks with minimum manual effort, and adaptability, which allows it to retain competitive levels of performance. Adaptation of the system to new tasks requires only that changes are made to a configuration file that is used to specify the task-specific information. In order to allow the system to adapt itself to new task specifications and achieve consistent performance, without the need for exhaustive tuning of the (hyper-)parameters of machine learning algorithms in EventMine, we have integrated a weighting method, a covariate shift method [10, 11], and their combination into EventMine. The enhanced, adaptable version of EventMine has subsequently been applied to new tasks, i.e., the PC and CG tasks of BioNLP-ST 2013. The state-of-the-art performance achieved by EventMine on these tasks (1st and 2nd ranking, respectively) demonstrates that the system can successfully be adapted to multiple new tasks through the specification of only minimal configuration information, and without the need for deep tuning.
In this section we firstly introduce EventMine, and then describe how it has been extended to facilitate more straightforward adaptation to new tasks. We conclude by describing how it has been applied for the BioNLP-ST subtasks of CG and PC.
EventMine is an SVM-based pipeline event extraction system that has been applied to several biomedical event extraction tasks, and has achieved top-ranked performance on several corpora [8, 12] in comparison to other systems. EventMine consists of four modules: a trigger/entity detector, an argument detector, a multi-argument detector and a hedge detector. The trigger/entity detector enumerates the triggers/entities in the training data, finds words that match the head words (in their surface forms, base forms using parsers, or stems) of the triggers/entities, and classifies each word into specific entity types (e.g., DNA_domain_or_region), event types (e.g., Regulation) or a negative type that denotes that the word does not participate in any events. For example, the word "of" in Figure 1 has a negative type. The argument detector enumerates all possible event-role pairs among triggers and arguments that match the semantic type combinations of the pairs in the training data, and classifies each pair into specific event role types (e.g., Binding:Theme-Gene_or_gene_product) or negative role types (e.g., Binding:NONE-Gene_or_gene_product). In Figure 1, there is no relation that holds between Overexpression and TGF-beta, so they have a negative role type. Here, an event role type consists of a trigger type and an argument type with its role type. Similarly, the multi-argument detector enumerates all possible combinations of pairs that match the semantic type structures of the events in the training data, and classifies each combination into an event structure type (e.g., Positive_regulation:Cause-Gene_or_gene_product:Theme-Phosphorylation) or a negative type. Here, an event structure type consists of a trigger type and argument types with their role types. The hedge detector attaches hedge attributes to the detected events by classifying the events into specific hedge types (Speculation and Negation) or a negative type.
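The enumeration step of the argument detector can be illustrated with a minimal sketch. It is not EventMine's implementation: the function names, the tuple layout and the example words are assumptions made for illustration, and the classifier that would label each candidate is omitted.

```python
from itertools import product

def allowed_type_pairs(training_pairs):
    """Collect (trigger type, argument type) combinations seen in training."""
    return {(trig_type, arg_type) for trig_type, arg_type, _role in training_pairs}

def enumerate_candidates(triggers, participants, allowed):
    """Return every trigger/participant pair whose type combination is allowed."""
    candidates = []
    for (t_word, t_type), (p_word, p_type) in product(triggers, participants):
        if t_word != p_word and (t_type, p_type) in allowed:
            candidates.append((t_word, t_type, p_word, p_type))
    return candidates

# Hypothetical training observations: (trigger type, argument type, role).
training = [
    ("Binding", "Gene_or_gene_product", "Theme"),
    ("Positive_regulation", "Phosphorylation", "Theme"),
]
allowed = allowed_type_pairs(training)

# Hypothetical output of the trigger/entity detector: (word, type).
triggers = [("binds", "Binding"), ("induces", "Positive_regulation")]
participants = [
    ("TGF-beta", "Gene_or_gene_product"),
    ("phosphorylation", "Phosphorylation"),
]
candidates = enumerate_candidates(triggers, participants, allowed)
```

Each surviving candidate would then be classified into an event role type such as Binding:Theme-Gene_or_gene_product or into the corresponding negative role type.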
All the classifications are performed by using one-vs-rest support vector machines (SVMs). The detectors use the types or type combinations mentioned above as their classification labels. Labels whose scores lie above the separating hyperplane of the SVM, together with the label with the highest score, are selected as the predicted labels. Classification is thus treated as a multi-class, multi-label classification problem with the requirement that at least one label (including a negative type) is selected during the prediction process. Classification makes use of both lexical and syntactic features. These features consist of character n-grams, word n-grams, shortest paths among event participants within parse trees, and word n-grams and shortest paths between event participants and triggers/entities outside of the events within parse trees. We replace all gold (given) entity names with their types to avoid the models being tuned to specific entities. To reduce the cost of the feature space, we compress it to $2^{20}$ dimensions by hashing. We assign greater weights to the positive instances to alleviate class imbalance, we normalise the feature vectors for each feature type (e.g., the word n-gram feature vector is normalised to unit length) as well as the entire vectors, and we set the C parameter of the SVM to 1.
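The label selection rule described above can be sketched as follows. The function name and the score dictionary are illustrative assumptions; a threshold of 0 stands for the separating hyperplane of each one-vs-rest SVM.

```python
def select_labels(scores, threshold=0.0):
    """Pick labels scoring above the hyperplane, or the top label if none do.

    `scores` maps each label (including the negative type) to its
    one-vs-rest SVM decision value.
    """
    above = [label for label, score in scores.items() if score > threshold]
    if above:
        # Several labels may be positive: multi-label output, best first.
        return sorted(above, key=lambda label: -scores[label])
    # No label is positive: fall back to the single highest-scoring label,
    # so that at least one label is always predicted.
    return [max(scores, key=scores.get)]
```

For instance, `select_labels({"Regulation": 0.4, "Binding": 0.1, "NONE": -0.7})` keeps both positive labels, whereas `select_labels({"Regulation": -0.4, "NONE": -0.1})` falls back to the negative type alone.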
EventMine generates training instances based only on predictions by the preceding modules in the pipeline, thus ensuring that training is not carried out on instances that cannot be detected by the preceding modules. If the generated instance corresponds to gold instances, then the semantic types assigned to the gold instances are assigned to the generated instance. Otherwise, a negative type is assigned to the generated instance. This mode of instance generation allows us to obtain similar distributions of training and test instances, as it is impossible to detect them if the participants are missed by the preceding modules.
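A minimal sketch of this instance generation scheme is given below. The data structures and the `NONE` label name are assumptions for illustration; EventMine's internal representation is not described here.

```python
def label_generated_instances(predicted_pairs, gold_roles, negative="NONE"):
    """Assign gold role types to generated pairs; unmatched pairs become negatives."""
    return [(pair, gold_roles.get(pair, negative)) for pair in predicted_pairs]

# Pairs built from the predictions of the preceding module (hypothetical data).
predicted_pairs = [("Overexpression", "TGF-beta"), ("binds", "TGF-beta")]

# Gold annotations keyed by (trigger, argument). The second gold pair was not
# proposed upstream, so it never becomes a training instance.
gold_roles = {("binds", "TGF-beta"): "Theme", ("induces", "binds"): "Cause"}

instances = label_generated_instances(predicted_pairs, gold_roles)
```

Because gold pairs that the preceding modules missed are simply absent, the training instances resemble what the module will encounter at prediction time.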
Extension of EventMine
This section describes how EventMine has been enhanced to allow it to be applied to new tasks with minimum manual effort, whilst retaining good levels of performance. We firstly explain the incorporation of a configuration file that allows EventMine to be applied to new tasks without internal modification of the system. Subsequently, we introduce three methods which, based on the information provided in the configuration file, allow EventMine to adapt itself to carry out new extraction tasks without the need for parameter tuning.
The TEES-2.1 system has a similar motivation to ours regarding ease of adaptation to a range of different tasks, and it has been applied to several event extraction tasks in the BioNLP-ST 2013. Both TEES-2.1 and EventMine are pipeline-based systems and extract labels required for classification from the training data. However, they vary in terms of their approaches to implementing adaptability. TEES-2.1 does not require user-provided configuration information and applies an automated, but time-consuming, hyper-parameter tuning method that uses the development data set. In contrast, EventMine takes user-provided configuration information and employs three methods that remove the requirement for parameter tuning, as explained in the following section.
Configurability of EventMine
The configuration file also includes two types of generalisations: one for labels and features ("Label and feature generalisations" in Figure 2) and one for instance generation ("Instance generation generalisations" in Figure 2). These generalisations are used in both training and prediction phases since they should be performed in similar situations.
Label and feature generalisations reduce the number of event role types and event structure types that are used as classification labels, and the number of features used by all detectors. The event role types and event structure types are combinations of types of triggers and participants with their roles, as explained in the EventMine section. The generalisations help to reduce the computational and space costs of both training and prediction since these are dependent on the number of the classification labels. The generalisations are indispensable for the two tasks in the BioNLP-ST 2013, since the tasks cover a greater number of bio-entity, role and event types than previous shared tasks, meaning that there are thousands of potential event structure labels. Considering all of these possible labels without carrying out generalisations would create an intractable problem for EventMine. Although the effects of the generalisations on event extraction performance cannot be evaluated on the tasks since it is infeasible to run EventMine without them, the generalisations have both advantages and disadvantages: the generalisations may alleviate the data sparseness problem during training, but they may also induce over-generalised features when they are applied to the tasks with enough training instances. The generalisations are applied to event role and event structure labels, since the types in these labels include types that are predicted by other detectors. For example, an event role label Positive_regulation:Theme-Phosphorylation contains Positive_regulation and Phosphorylation, which are predicted by the trigger/entity detector. Label and feature generalisations are possible in the following three cases: firstly, trigger/entity types are predicted by the trigger/entity detector, so their prediction is not required in the argument and multi-argument detectors. 
Secondly, the role types are predicted by the argument detector, so their prediction is not required in the multi-argument detector. Thirdly, the numbered role types, e.g., Theme, Theme2, are predicted in the multi-argument detector, so their prediction is not required in the argument detector. The numbered role types are required in events since the numbers indicate the correspondence between roles. For example, if an event has two Themes and the second Theme has a corresponding Instrument, their roles will be Theme2 and Instrument2 to differentiate from the first Theme and Instrument. It is difficult to predict argument numbers without knowing the other arguments involved in the event, so the numbers are predicted in the multi-argument detector. These generalisations are also applied to the generation of the features used by all detectors. For example, generalisations of gold entities can be used as the basis of generating features used by the argument/trigger detector.
Instance generation generalisations are used to expand the set of possible event role types and event structure types for which instances are created in prediction. The instance generation generalisations may introduce noisy instances, but they may also generate instances of event structures that otherwise would not have been considered, due to lack of evidence in the training data. For example, even if there are no Positive_regulation:Theme-Gene_expression instances in the training data, such instances are still created in prediction when there are Regulation:Theme-Gene_expression instances in the training data and there is a rule in the configuration file specifying that the Positive_regulation and Regulation event types should share their event structures. The rules for the instance generation generalisations are applied whenever instances are created for prediction. The instance generation generalisations are specified separately from the label and feature generalisations, since reusing the latter for instance generation may introduce illegal or unrealistic event structures. For example, if we specified in CG that Phosphorylation and DNA_methylation share their event structures in instance generation, and thus transferred the event structures of Phosphorylation to DNA_methylation, DNA_methylation events with Molecule as Theme would be illegally created (DNA_methylation takes only Gene_or_gene_product as Theme in the task definition).
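The effect of an instance generation rule can be sketched as follows. The representation of structures as (event type, structure) pairs and of rules as sets of event types is an assumption for illustration, not the format of the EventMine configuration file.

```python
def expand_structures(training_structures, shared_groups):
    """Copy observed event structures across event types that share a group."""
    expanded = set(training_structures)
    for event_type, structure in training_structures:
        for group in shared_groups:
            if event_type in group:
                expanded.update((other, structure) for other in group)
    return expanded

# Structures observed in the training data (hypothetical).
training_structures = {("Regulation", "Theme-Gene_expression")}

# Rule: these event types share their event structures.
shared_groups = [{"Positive_regulation", "Regulation"}]

expanded = expand_structures(training_structures, shared_groups)
```

After expansion, Positive_regulation:Theme-Gene_expression candidates can be generated at prediction time even though no such instance was observed in training. Event types that appear in no group, such as DNA_methylation, keep only the structures observed for them.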
In addition to the task-specific settings, the configuration file is designed to specify other options, e.g., parsers, domain adaptation methods and dictionaries. Although we acknowledge that the achievement of high levels of performance on a specific task is largely dependent on determining the most appropriate combination of various methods and resources such as the above, our aim here is to demonstrate the configurability and adaptability of EventMine, rather than trying to achieve the highest possible performance for the tasks considered. Accordingly, the settings for the configuration options introduced above are the same as those used in our previous application of EventMine to the EPIgenetics and post-translational modifications (EPI) task, unless otherwise noted. Specifically, we employ both a deep syntactic parser, Enju, and a dependency parser, GDep. We utilise liblinear-java with the L2-regularised L2-loss linear SVM setting for the SVM implementation, MurmurHash2 for hashing, and Snowball for stemming. We use no external resources (e.g., dictionaries) or tools (e.g., a coreference resolver), except that we use external corpora to create stacked models for the PC task, as explained later.
Adaptability of EventMine
Although the above-described configuration file allows EventMine to be straightforwardly configured for new tasks, this does not in itself guarantee that the performance of the system on such new tasks will be of an acceptable quality. In other words, we need to ensure that EventMine is adaptable to new tasks. In all four modules of the event extraction pipeline, EventMine needs to solve classification problems. Some of the issues relating to the one-vs-rest classification method employed are dependent on the settings of the hyper-parameters, which should be tuned to allow the classifiers to work to their full potential. However, it is costly and time-consuming to search for the best setting from the many possible hyper-parameter combinations. There is no general, efficient method to automatically tune the parameters within a pipeline setting, and it is also unrealistic to assume that the hyper-parameters can be effectively tuned for new tasks without exhaustive searching and knowledge of how the system works.
Here, $p_{\mathrm{train}}(x_i)$ and $p_{\mathrm{target}}(x_i)$ are the outputs of the logistic regression classifier, and $n_{\mathrm{train}}$ and $n_{\mathrm{target}}$ are the numbers of training and target instances. This puts weights $l(x_i)$ on all the training instances according to their likelihood of appearing in the target data set. $l(x_i)$ represents the test-to-training ratio $p(x|\theta)/p(x|\lambda)$, where $\lambda$ denotes the training distribution and $\theta$ denotes the test distribution, and it is known that the loss on the test distribution can be minimised by weighting the loss on the training distribution with this ratio. In contrast to the originally proposed method, our novel method introduces $C_{cs}$, which keeps the balance between the regularisation and loss terms, since $l(x_i)$ can suffer from overfitting to the training and target instances and from the imbalance between the numbers of training and target instances. If we set all the weights $l(x_i)$ to 1, this is equal to the objective function of the SVM. This objective function tries to bring the distribution of the training instances close to that of the target data set. This means that it encourages the classifiers to learn more about instances that seem to appear in the target data set.
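The weighting scheme can be sketched in a few lines, under stated assumptions. The weight is computed here with the standard density-ratio estimate obtained from a classifier that separates training from target instances, taking $p_{\mathrm{train}}(x_i) = 1 - p_{\mathrm{target}}(x_i)$; the weighted squared-hinge objective matches the L2-regularised L2-loss SVM setting, and `c_cs` is simply passed in as a constant because its definition is not given in this text. All function and parameter names are hypothetical.

```python
def importance_weights(p_target, n_train, n_target, eps=1e-6):
    """Estimate l(x_i) from a train-vs-target classifier's outputs.

    p_target[i] is the estimated probability that training instance i
    belongs to the target set; p_train is taken as 1 - p_target (assumption).
    """
    ratio = n_train / n_target
    weights = []
    for p in p_target:
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid division by zero
        weights.append(ratio * p / (1.0 - p))
    return weights

def weighted_objective(w, xs, ys, weights, c=1.0, c_cs=1.0):
    """L2-regularised, instance-weighted squared-hinge objective (assumed form)."""
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for x, y, l in zip(xs, ys, weights):
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        loss += l * max(0.0, 1.0 - margin) ** 2
    return reg + c * c_cs * loss
```

With equally sized sets, a training instance that the classifier considers four times more likely to be a target instance than a training instance (`p_target = 0.8`) receives a weight of about 4, while an instance with `p_target = 0.5` keeps a weight of 1. Setting every weight and `c_cs` to 1 recovers the unweighted SVM objective.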
If we set all the weights l(+) to 1, this is equal to Equation 1.
The weighting and covariate shift methods are incorporated into EventMine by applying them to all the classifications in the four modules of the system. The hyper-parameter C is kept at 1 for all the experiments, as mentioned in the previous section. These methods may not necessarily produce the same levels of performance that could be achieved by parameter tuning through exhaustive search. However, given the costly nature of such parameter tuning, as described above, our method makes the problem of adapting EventMine to new tasks much more feasible, whilst still allowing good levels of performance to be achieved. Since it is difficult to carry out exhaustive parameter tuning, it is not possible for us to compare the results of our novel methods with those that could be achieved through such tuning. Instead, we show in our experiments that incorporation of the new methods within EventMine can improve the performance of the system on both the PC and CG tasks of the BioNLP-ST 2013, to a level that is competitive with other systems that participated in these tasks.
Configuration of EventMine for BioNLP-ST 2013 tasks
In the following sections, we describe EventMine's configuration for the CG and PC tasks, based on the notions of configurability and adaptability.
Configuration for the CG task
The Cancer Genetics (CG) task aims to extract information from bio-processes related to the development and progression of cancer. The annotations in the training data were based on the Multi-Level Event Extraction (MLEE) corpus.
The configuration for our shared task submission used several label and feature generalisations, which are shown in Figure 2. For the event role types, generalisations for the trigger types, role types and argument types were applied as follows. In terms of the trigger types, we generalised the three regulation types, i.e., Positive_regulation, Regulation and Negative_regulation, into a single REGULATION type, and post-translational modification (PTM) types (e.g., Acetylation, Phosphorylation) into a single PTM type. In terms of role types, numbered role types were generalised to non-numbered role types (e.g., Participant2→Participant). In terms of argument types, event types were generalised to a single EVENT type and entity types were generalised to a single ENTITY type. These generalisations, except for the entity generalisations, are the combination of the generalisations used in the GENIA, EPI, and Infectious Diseases (ID) annotated corpora of the BioNLP-ST 2011. For the event structure types, the same generalisations are applied, except for numbered role types, which are retained, since these are important in differentiating different types of event structures. Unlike other types, the numbered role types in events are not predicted by any module other than the multi-argument detector, as we explained in the Configurability of EventMine section.
Further experiments carried out after the shared task involved a more fine-grained classification of entities into three general types taken from the hierarchy of entity types defined for the CG task, i.e., anatomical, pathological, and molecular, instead of using a single ENTITY type, as in our shared task submission. In terms of instance generation generalisations, we applied them only to the regulation event types, to avoid introducing unexpected event structures.
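The label generalisations listed above can be sketched as a small mapping over event role labels of the form Trigger:Role-Argument. The type sets below are partial examples rather than the full CG type inventory, and the function names are hypothetical.

```python
import re

REGULATION_TYPES = {"Positive_regulation", "Regulation", "Negative_regulation"}
PTM_TYPES = {"Acetylation", "Phosphorylation"}  # examples only, not the full list
EVENT_TYPES = REGULATION_TYPES | PTM_TYPES | {"Binding"}  # partial, for illustration

def generalise_trigger(trigger_type):
    """Collapse regulation and PTM trigger types into single generalised types."""
    if trigger_type in REGULATION_TYPES:
        return "REGULATION"
    if trigger_type in PTM_TYPES:
        return "PTM"
    return trigger_type

def generalise_role(role, keep_numbers=False):
    """Strip the trailing number from a numbered role unless it must be kept."""
    return role if keep_numbers else re.sub(r"\d+$", "", role)

def generalise_event_role(label, event_types=EVENT_TYPES, keep_numbers=False):
    """Generalise a 'Trigger:Role-Argument' event role label."""
    trigger, rest = label.split(":", 1)
    role, argument = rest.split("-", 1)
    arg = "EVENT" if argument in event_types else "ENTITY"
    return f"{generalise_trigger(trigger)}:{generalise_role(role, keep_numbers)}-{arg}"
```

For example, Positive_regulation:Theme-Phosphorylation becomes REGULATION:Theme-EVENT, and Binding:Theme2-Gene_or_gene_product becomes Binding:Theme-ENTITY. Passing `keep_numbers=True` mirrors the event structure setting, in which numbered roles are retained.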
Configuration for the PC task
The Pathway Curation (PC) task aims to support the curation of bio-molecular pathway models, with the training texts selected to cover both signalling and metabolic pathways.
For our shared task submission for this task, we incorporated a stacking method, training our models, using the same configuration as described above, on seven other available corpora: GENIA, EPI, ID, DNA methylation, Exhaustive PTM, mTOR and CG. The stacking method uses the prediction scores of all the models as additional features in the detectors. Although some of these corpora may not be directly related to the PC task and the models trained on them can produce noisy features, we have used all the corpora, since stacking has been shown to improve performance [12, 20]. In common with our work on the CG task, we have also carried out further experiments after the shared task. The stacking method was not employed in this latter set of experiments, since our aim was to focus on the three methods introduced in the section on Adaptability of EventMine.
We employ the same types of generalisations as in the CG task described in the previous section, except for entity types. For our shared task submission, entity types were generalised to a single ENTITY type, as in our submission for the CG task. For our experiments that followed the shared task, the entity generalisation differed from the one performed for the CG task, in accordance with the different entity type definitions of this task. The only entity generalisation we performed in the context of the PC task was to collapse the Gene_or_gene_product and Complex types into a single PROTEIN type. The other two entity types used in the corpus, i.e., Simple_chemical and Cellular_component, retained their original labels. The generalisation is based on the reference resources of the entity type definitions.
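The idea of stacking can be sketched as appending the scores of auxiliary models to an instance's feature set. The scoring functions below are hypothetical stand-ins for models trained on other corpora, and the feature naming scheme is an assumption; real stacked models would typically contribute one score per label rather than a single value.

```python
def add_stacked_features(features, corpus_models, instance):
    """Return a copy of `features` extended with each auxiliary model's score."""
    stacked = dict(features)
    for corpus_name, score_fn in corpus_models.items():
        stacked[f"stack:{corpus_name}"] = score_fn(instance)
    return stacked

# Hypothetical stand-ins for models trained on two of the other corpora.
corpus_models = {
    "GENIA": lambda inst: 0.8 if "phosphorylation" in inst else -0.3,
    "EPI": lambda inst: 0.5 if "methylation" in inst else -0.6,
}

features = {"word=phosphorylation": 1.0}
stacked = add_stacked_features(features, corpus_models, "phosphorylation of STAT3")
```

The target-task classifier then learns how much to trust each auxiliary model, which is why scores from loosely related corpora can be included even if they are noisy.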
Results and Discussion
We have evaluated EventMine using the various configurations introduced in the previous sections. We firstly evaluated the system using the settings employed for our shared task submission, which incorporated the use of the configuration file and the weighting method, but not the covariate shift method or the task-specific entity generalisations. We compare our official results with those achieved by the best system, apart from EventMine, that participated in the PC and CG tasks, i.e., TEES-2.1. This evaluation has also been presented elsewhere. Subsequently, we evaluated the differences in performance that were obtained through the integration of the weighting method, the covariate shift method and their combination, together with the refined entity generalisation settings.
Evaluations on instance generation generalisations and stacking
Effect of instance generation generalisations and the stacking method on the PC development data set.
Official scores for the shared task
Official best and second best scores on the CG and PC tasks.
Recall / Precision / F-scores for event categories on the CG and PC tasks
Although EventMine did not achieve the best overall results in the CG task, we still consider that the performance level achieved is promising, given that we did not incorporate any external resources, and we did not carry out any tuning of parameters (e.g., C in SVM). A detailed comparison with TEES-2.1 shows that TEES-2.1 outperformed EventMine in the recognition of anatomical and pathological event categories, which constitute event types that have not been addressed in previous shared tasks. This indicates that EventMine missed some of the novel structures introduced in these new event types. However, EventMine performed better than TEES-2.1 in the recognition of some other types of events involved in the task. The performance range of EventMine in recognising the various event types covered by the CG task is similar to the scores achieved by the system when applied to the MLEE corpus (52.34-53.43% F-score), although we cannot directly compare the results since the corpora are not exactly the same and the test sets are different. The scores are around 60% to 70% F-score for non-nested events (e.g., SIMPLE), 40% for nested events (e.g., REGULATION) and 30% for modifications (e.g., MOD). This large range of scores may be caused by a cumulative combination of errors in predicting triggers, participants and modifications, since a similar spread of accuracy has been observed for previous tasks (e.g., the GENIA, EPI, and ID results). Also, previous tasks like GENIA provided more training instances per type than the CG task, but the ranges of scores are broadly similar. These results indicate that further improvements to the performance of the system may require more than a simple increase in the number of training instances. EventMine performed particularly well on the PC task, which is an encouraging result in demonstrating the adaptability of the enhanced system, since it was a completely novel task for the system.
The recall achieved by our system was considerably higher than that obtained by TEES-2.1.
Evaluations on the weighting and covariate shift methods
Our next set of experiments evaluated the weighting method, the covariate shift method, and their combination, as explained in the section on Adaptability of EventMine (see Equations 1-3). As a baseline for comparison, we used the version of EventMine that did not incorporate these methods.
As explained in the description of the task-specific configurations of the system above, this evaluation differed from the other evaluations in two ways. Firstly, stacking was not employed for the PC task. Secondly, we used refined entity type generalisations, i.e., detailed entity type generalisations based on the individual task definitions were employed in the task settings.
Effect of the weighting and covariate shift methods on the development data sets.
Effect of the weighting (W) and covariate shift (CS) methods on the test data sets, comparing F-scores (%) without either method (-W -CS) and with both methods (+W +CS).
In this paper, we have described the development of an adaptable event extraction system, which accepts task-specific information in the form of a configuration file, and employs methods that alleviate the need to carry out extensive tuning of the system to allow it to be applied to new data sets. The new system has been created by enhancing an existing state-of-the-art event extraction system, EventMine. The configuration file is used to specify the definitions of types (e.g., entity and event types) and generalisations over these types that are used to adapt the system to new tasks. The provision of this configuration information alleviates the need to carry out task-specific modification of the system. Furthermore, to avoid the costly process of extensive parameter tuning to make the system suitable for application to new data sets, three adaptation methods are employed, i.e., a weighting method, a covariate shift method and their combination. The weighting method aims to alleviate class imbalance, while the covariate shift method aims to automatically adjust for the differences in the distributions of instances in the training and target data sets. The enhanced system was applied to the CG and PC tasks with minimal task-specific configuration. In the context of the BioNLP-ST 2013, only the weighting method was employed to facilitate the adaptation of the system to the specific tasks. This version of the system achieved the second best performance in the CG task and the best performance in the PC task. Following the shared task, we were able to further improve the results through the incorporation of the covariate shift method, combined with task-based generalisation of entity types. The latter preserves some semantic information about entities that was lost in the more extreme generalisation of the entities used in the shared task version of the system.
The positive results obtained through the integration of our novel methods demonstrate that the enhanced version of EventMine can be effectively adapted to new tasks, without the need to make changes to the system itself. The weighting method, covariate shift method, and their combination have all been demonstrated to be useful in facilitating automatic tuning of the system to the new tasks. The success of applying the covariate shift method to the shared task data underlines its potential importance in future event extraction research, as a means to resolve the differences between the training and target data sets, which is a vital step to support the development of accurate and practical applications. Based on these results, our future work will involve investigating the feasibility of applying the covariate shift method to larger data sets, e.g., PubMed, as a basis for the development of more practical applications. In this scenario, however, and in contrast to the training and test data sets of the shared tasks, there remain several issues to be resolved. These issues include the much larger differences in distributions between the training and target data sets, the vast number of target documents and the lack of standard evaluation criteria. Moreover, unlike in the shared tasks, the named entities are not given, and their detection exhibits similar problems to the above. Whilst these problems are all challenging, we believe that it is vital for them to be addressed, in order to facilitate a significant improvement in the adaptability of event extraction systems to new tasks, and to allow their use in new practical applications.
We thank Paul Thompson for his useful comments.
The publication costs of this article were funded by the University of Manchester. This work was supported by the Medical Research Council [grant number MR/L01078X/1], the Arts and Humanities Research Council (AHRC) [grant number AH/L00982X/1] and the JSPS Grant-in-Aid for Young Scientists (B) [grant number 25730129].
This article has been published as part of BMC Bioinformatics Volume 16 Supplement 10, 2015: BioNLP Shared Task 2013: Part 1. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/16/S10.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.