From: Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research

Global and local context kernels to represent a gene-disease association. a) The sentence extracted from a MEDLINE abstract (PMID:22337703) expresses the association between the disease MDD (Major Depressive Disorder) and the genes EHD3 and FREM3. We focus on the association between EHD3 and MDD to illustrate the features considered in each kernel. b and c) The local context kernel (K_LC) uses orthographic and shallow linguistic features (POS, lemma, stem) of the tokens located to the left and right (window size of 2) of the candidate entities (EHD3 and MDD). d) The global context kernel (K_GC) is based on the assumption that an association between two entities (in this case EHD3 and MDD) is more likely to be expressed within one of three patterns (fore-between, between, between-after). In this example, the association between EHD3 and MDD is expressed in the between pattern. e) In the global context kernel (K_GC), we consider both trigrams and sparse bigrams in each pattern.
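
The feature windows described in panels b to e can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function names, the toy sentence, the 2-token limit on the fore and after contexts, and the reading of a "sparse bigram" as the first and third token of each trigram are all assumptions made for the example. Only raw tokens are used here; the POS, lemma, stem, and orthographic features of K_LC are omitted.

```python
from typing import Dict, List, Tuple


def local_context(tokens: List[str], idx: int, window: int = 2) -> Dict[str, List[str]]:
    """Tokens to the left and right of the entity at position idx (K_LC window)."""
    return {
        "left": tokens[max(0, idx - window):idx],
        "right": tokens[idx + 1:idx + 1 + window],
    }


def global_patterns(tokens: List[str], i: int, j: int, window: int = 2) -> Dict[str, List[str]]:
    """Token sequences for the three K_GC patterns around entities at i < j.

    Assumption: the fore and after contexts are limited to `window` tokens.
    """
    fore = tokens[max(0, i - window):i]
    between = tokens[i + 1:j]
    after = tokens[j + 1:j + 1 + window]
    return {
        "fore-between": fore + between,
        "between": between,
        "between-after": between + after,
    }


def trigrams(seq: List[str]) -> List[Tuple[str, str, str]]:
    """All contiguous three-token sequences."""
    return [(seq[k], seq[k + 1], seq[k + 2]) for k in range(len(seq) - 2)]


def sparse_bigrams(seq: List[str]) -> List[Tuple[str, str]]:
    """Assumed definition: first and third token of each trigram (one-token gap)."""
    return [(seq[k], seq[k + 2]) for k in range(len(seq) - 2)]


# Hypothetical toy sentence, not the sentence from the cited abstract.
tokens = ["Variants", "in", "EHD3", "are", "associated", "with", "MDD", "in", "this", "cohort"]
gene_idx, disease_idx = 2, 6

gene_ctx = local_context(tokens, gene_idx)
disease_ctx = local_context(tokens, disease_idx)
patterns = global_patterns(tokens, gene_idx, disease_idx)
between_trigrams = trigrams(patterns["between"])
between_sparse = sparse_bigrams(patterns["between"])
```

In this toy sentence the relation is carried by the tokens between the two entities ("are associated with"), which corresponds to the between pattern named in panel d.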

