Table 5 Features Used for Ranking

From: A graph-search framework for associating gene identifiers with documents

TestFile hasTerm →

EntityType hasSpan →

String hasPossibleProtein⁻¹ →

GeneSyn synonym⁻¹ →

Labels hasPossibleProtein →

String hasTerm →

TestFile possibleProtNER →

GeneId hasGene⁻¹ →

TERM inFile →

TrainFile hasGene →

Labels annotates →

Labels hasLikelyProtein →

TrainFile annotates⁻¹ →

String hasLikelyProtein⁻¹ →

TestFile likelyProtNER →

TestFile annotates⁻¹ →

String likelyProt2syn →

String possibleProt2syn →

GeneSyn hasTerm →

TrainFile hasTerm →

GeneId synonym →

String hasSpan⁻¹ →

Labels hasSpanType →

EntityType hasSpanType⁻¹ →

  1. The final set of features used by the learning system, in the format "t ℓ →", where t is a node type and ℓ is an edge label. The edge labels likelyProtNER and possibleProtNER are shortcuts representing the baseline NER methods. The edge labels likelyProt2syn and possibleProt2syn are links between extracted protein names and softTFIDF-similar protein synonyms.
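
As an illustration of the notation only, the rows of the table can be read as (node type, edge label) pairs. The sketch below is not the authors' code: the names Feature and parse_feature are hypothetical, and it assumes that a trailing ⁻¹ (or an extracted "− 1") marks an inverse edge label.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    node_type: str   # t: a node type, e.g. "TestFile"
    edge_label: str  # the edge label, e.g. "hasTerm"
    inverse: bool    # assumption: True when the label carries the -1 marker


def parse_feature(row: str) -> Feature:
    """Parse one table row such as 'String hasSpan⁻¹ →' (hypothetical helper)."""
    text = row.replace("→", "").strip()
    node_type, label = text.split(maxsplit=1)
    # Normalise spaced-out letters and the Unicode minus sign left by extraction.
    compact = label.replace(" ", "").replace("\u2212", "-")
    inverse = False
    for suffix in ("⁻¹", "^-1", "-1"):
        if compact.endswith(suffix):
            compact = compact[: -len(suffix)]
            inverse = True
            break
    return Feature(node_type, compact, inverse)


if __name__ == "__main__":
    for row in ("TestFile hasTerm →", "String hasSpan⁻¹ →", "GeneId synonym →"):
        print(parse_feature(row))
```

Keeping the inverse marker as a separate flag lets rows such as "Labels hasSpanType →" and "EntityType hasSpanType⁻¹ →" share one edge label while remaining distinct features.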