
Table 1 Example of terms that occur most frequently in an article (selected from article with PMID: 10506131).

From: Building a protein name dictionary from full text: a machine learning term extraction approach

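| Term | Frequency | Category |
| --- | --- | --- |
| GnRH | 79 | a |
| side chain | 60 | a |
| the GnRH receptor | 30 | a |
| d7.49 318 | 38 | a |
| the | 503 | b |
| of | 334 | b |
| and | 212 | b |
| expression | 49 | b |
| *d7.49* | 38 | a |
| *the GnRH* | 30 | a |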

  1. Terms in italics are not considered for further analysis because their frequency in the document is the same as the frequency of a longer term that exactly contains them (for example, "d7.49" is contained in "d7.49 318", and both occur 38 times).
  2. Equal frequencies indicate that the shorter term never occurs alone in the document.
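
The exclusion rule in the footnotes can be illustrated with a small Python sketch. It is not the authors' implementation: the function names `is_contained` and `drop_nested_terms` are hypothetical, and the sketch assumes that "exactly contains" means the shorter term's whitespace-separated tokens appear contiguously inside the longer term. The frequencies are the ones listed in Table 1.

```python
def is_contained(short, long):
    """Return True if the tokens of `short` appear contiguously in `long`.

    Assumption: containment is tested on whitespace-separated tokens,
    so "a" is not treated as contained in "and".
    """
    s, l = short.split(), long.split()
    if len(s) >= len(l):
        return False  # only strictly longer terms can contain a term
    return any(l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1))


def drop_nested_terms(freq):
    """Keep only terms that occur on their own at least once.

    A term is dropped when some longer term contains it and both have
    the same frequency, i.e. the shorter term never occurs alone.
    """
    kept = {}
    for term, count in freq.items():
        nested = any(
            count == other_count and is_contained(term, other)
            for other, other_count in freq.items()
        )
        if not nested:
            kept[term] = count
    return kept


# Term frequencies copied from Table 1.
table1 = {
    "GnRH": 79,
    "side chain": 60,
    "the GnRH receptor": 30,
    "d7.49 318": 38,
    "the": 503,
    "of": 334,
    "and": 212,
    "expression": 49,
    "d7.49": 38,
    "the GnRH": 30,
}

filtered = drop_nested_terms(table1)
removed = sorted(set(table1) - set(filtered))
print(removed)
```

Under these assumptions only "d7.49" and "the GnRH" are removed. "GnRH" is kept even though longer terms contain it, because its frequency (79) is higher than theirs, so it must also occur on its own.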