Table 12 Observation of keyword associations

From: Modeling and mining term association for improving biomedical information retrieval performance

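| # | unigram | | | bigram | | | trigram | | |
|---|---|---|---|---|---|---|---|---|---|
| | $T_1$ | .. | $T_n$ | $T_{n+1}$ | .. | $T_{C_n^1 + C_n^2}$ | $T_{C_n^1 + C_n^2 + 1}$ | .. | $T_{C_n^1 + C_n^2 + C_n^3}$ |
| | $k_1$ | .. | $k_n$ | $k_1 k_2$ | .. | $k_{n-1} k_n$ | $k_1 k_2 k_3$ | .. | $k_{n-2} k_{n-1} k_n$ |
| 1 | 1 | .. | 1 | 0 | .. | 1 | 0 | .. | 1 |
| 2 | 1 | .. | 1 | 1 | .. | 1 | 1 | .. | 1 |
| .. | | . | . | | . | . | | . | . |
| N | 0 | .. | 0 | 0 | .. | 1 | 0 | .. | 1 |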

  1. The keyword associations are observed as follows: (1) 1-keyword subsequences are unigrams, 2-keyword subsequences are bigrams, 3-keyword subsequences are trigrams, and so on; (2) all unigrams, bigrams, trigrams and n-grams are defined as terms; (3) a passage scores 1 for a term if the term appears in it, and 0 otherwise; (4) each passage is therefore represented as a 1-0 vector over the terms.
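
The construction described in this note can be sketched in Python. This is only an illustrative sketch, not the authors' implementation. It assumes that a term "appears" in a passage when every keyword of the term occurs among the passage's whitespace-separated tokens; the source does not specify the matching rule. The function names `build_terms` and `passage_vector` and the example keywords are hypothetical.

```python
from itertools import combinations
from math import comb


def build_terms(keywords, max_order=3):
    """Enumerate all 1-, 2- and 3-keyword subsequences (unigrams, bigrams, trigrams).

    The terms come out in the column order of the table: T_1 .. T_n are the
    unigrams, followed by the C(n, 2) bigrams and then the C(n, 3) trigrams.
    """
    terms = []
    for order in range(1, max_order + 1):
        terms.extend(combinations(keywords, order))
    return terms


def passage_vector(passage, terms):
    """Return the 1-0 vector of a passage over the given terms.

    Assumption: a term appears in the passage if all of its keywords occur
    as tokens. Tokenization is a naive lowercase whitespace split.
    """
    tokens = set(passage.lower().split())
    return [1 if all(k.lower() in tokens for k in term) else 0 for term in terms]


# Hypothetical query keywords k_1 .. k_n with n = 4.
keywords = ["gene", "protein", "cancer", "mutation"]
terms = build_terms(keywords)

# The number of columns is C(n, 1) + C(n, 2) + C(n, 3).
n = len(keywords)
assert len(terms) == comb(n, 1) + comb(n, 2) + comb(n, 3)

# One row of the table per passage (hypothetical passages).
passages = [
    "the gene encodes a protein",
    "a mutation in this gene is linked to cancer",
]
matrix = [passage_vector(p, terms) for p in passages]
for row_number, row in enumerate(matrix, start=1):
    print(row_number, row)
```

Each printed row corresponds to one numbered row of the table, with one 1-0 entry per term column. A stricter matching rule (for example, requiring the keywords to occur within a fixed window) would change only `passage_vector`.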