Table 2. Given a reference r and a gene-disease pair <g, d>, CRFref estimates and integrates three measures: the degrees of conclusiveness, richness, and focus of r with respect to <g, d>.

From: Identification of highly related references about gene-disease association

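| Factors | Definition | Type |
|---|---|---|
| (1) Length(r) | $\begin{cases} 1 & \text{if length of } r > \text{AvgLen} \\ \dfrac{\text{length of } r}{\text{AvgLen}} & \text{otherwise} \end{cases}$ [a] | Degree of conclusiveness |
| (2) GeneTF(g,r) | $\begin{cases} 1 & \text{if } \text{TF}(g, r) > 5 \\ \dfrac{\text{TF}(g, r)}{5} & \text{otherwise} \end{cases}$ [b] | Degree of conclusiveness |
| (3) DiseaseTF(d,r) | $\begin{cases} 1 & \text{if } \text{TF}(d, r) > 5 \\ \dfrac{\text{TF}(d, r)}{5} & \text{otherwise} \end{cases}$ | Degree of conclusiveness |
| (4) Gene@Title(g,r) | 1 if g appears in the title of r; 0 otherwise | Degree of conclusiveness |
| (5) Disease@Title(d,r) | 1 if d appears in the title of r; 0 otherwise | Degree of conclusiveness |
| (6) Gene@Ending(g,r) | $\dfrac{\text{LastPos}(g, r)}{\text{length of } r}$ [c] | Degree of conclusiveness |
| (7) Disease@Ending(d,r) | $\dfrac{\text{LastPos}(d, r)}{\text{length of } r}$ | Degree of conclusiveness |
| (8) NotGeneNum(g,r) | $\begin{cases} 1 & \text{if } \lvert G' \rvert > 5 \\ \dfrac{\lvert G' \rvert}{5} & \text{otherwise} \end{cases}$, where $G' = \{\, g' \mid g' \in G - \{g\} \text{ and } g' \text{ appears in } r \,\}$ [d] | Degree of richness |
| (9) NotDiseaseNum(d,r) | $\begin{cases} 1 & \text{if } \lvert D' \rvert > 5 \\ \dfrac{\lvert D' \rvert}{5} & \text{otherwise} \end{cases}$, where $D' = \{\, d' \mid d' \in D - \{d\} \text{ and } d' \text{ appears in } r \,\}$ [e] | Degree of richness |
| (10) NotGene@Title(g,r) | 1 if there is a gene $g' \in G - \{g\}$ that appears in the title of r; 0 otherwise | Degree of focus |
| (11) NotDisease@Title(d,r) | 1 if there is a disease $d' \in D - \{d\}$ that appears in the title of r; 0 otherwise | Degree of focus |
| (12) NotGene@Ending(g,r) | $\max_{g' \in G - \{g\},\ g' \text{ appears in } r} \dfrac{\text{LastPos}(g', r)}{\text{length of } r}$ | Degree of focus |
| (13) NotDisease@Ending(d,r) | $\max_{d' \in D - \{d\},\ d' \text{ appears in } r} \dfrac{\text{LastPos}(d', r)}{\text{length of } r}$ | Degree of focus |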

  1. [a] AvgLen is the average length of references.
  2. [b] TF(x,r): Term frequency of x in r.
  3. [c] LastPos(x,r): The last position of x in r.
  4. [d] G: Set of gene names in the HUGO Gene Nomenclature Committee (HGNC) nomenclature.
  5. [e] D: Set of terms in MeSH classes C04 to C26, with ‘disease’ and ‘syndrome’ removed.
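
The factor definitions in Table 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that a reference is already tokenized, that each gene or disease name is a single token, that positions are 1-based token indices, that LastPos is 0 for a term that does not appear, and that the maximum over an empty set is 0. The names `Reference`, `capped_ratio`, `last_pos`, and `factors` are hypothetical, and the toy tokens stand in for real HGNC or MeSH entries. How CRFref integrates the factors into the three degrees is not given in the table, so that step is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Reference:
    title: list[str]  # tokens of the title
    body: list[str]   # tokens over which length and positions are measured


def capped_ratio(value: float, cap: float) -> float:
    """1 if value > cap, otherwise value / cap (shape of factors 1-3, 8, 9)."""
    return 1.0 if value > cap else value / cap


def last_pos(term: str, ref: Reference) -> int:
    """1-based position of the last occurrence of term; 0 if absent (assumption)."""
    for i in range(len(ref.body) - 1, -1, -1):
        if ref.body[i] == term:
            return i + 1
    return 0


def factors(target: str, vocabulary: set[str], ref: Reference,
            avg_len: float) -> dict[str, float]:
    """Factors for one target entity (a gene g with G, or a disease d with D)."""
    n = len(ref.body)
    if n == 0:
        raise ValueError("reference body must not be empty")
    others = vocabulary - {target}                       # G - {g} or D - {d}
    others_in_body = {t for t in others if t in ref.body}  # G' or D'
    return {
        "Length": capped_ratio(n, avg_len),                      # factor 1
        "TF": capped_ratio(ref.body.count(target), 5),           # factors 2, 3
        "AtTitle": 1.0 if target in ref.title else 0.0,          # factors 4, 5
        "AtEnding": last_pos(target, ref) / n,                   # factors 6, 7
        "NotNum": capped_ratio(len(others_in_body), 5),          # factors 8, 9
        "NotAtTitle": 1.0 if any(t in ref.title for t in others) else 0.0,  # 10, 11
        "NotAtEnding": max((last_pos(t, ref) / n for t in others_in_body),
                           default=0.0),                         # factors 12, 13
    }


# Toy usage with made-up tokens; avg_len is an arbitrary illustrative value.
ref = Reference(
    title=["geneA", "in", "diseaseX"],
    body="geneA is linked to diseaseX while geneB is not ; geneA drives diseaseX".split(),
)
genes = {"geneA", "geneB", "geneC"}  # toy stand-in for G
scores = factors("geneA", genes, ref, avg_len=26.0)
print(scores)
```

Calling `factors` once with a gene and the gene vocabulary, and once with a disease and the disease vocabulary, yields the gene-side and disease-side factors; `Length` is the same in both calls because it depends only on r.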