Table 3 Coverage of LMO terminology in selected document sets.

From: Terminologies for text-mining; an experiment in the lipoprotein metabolism domain

 

Document set | LMO terminology predicted by TFIDF (1000) | LMO terminology predicted by TFIDF (all) | LMO terminology literally contained
--- | --- | --- | ---
300 review abstracts for “lipoprotein metabolism” | 8.75% | 15.35% | 20.98%
3,066 abstracts for “lipoprotein metabolism” | 14.99% | 38.25% | 53.00%
50,000 abstracts containing “lipoprotein” | | | 71.22%

The table shows the upper limit on the number of terms that text-mining can find: even a large text base of 50,000 documents literally contains only 71% of LMO terms, and TFIDF predicts at most 38% of LMO terms.
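
The two kinds of coverage in the table can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer, the use of unigrams and bigrams as candidate terms, the `log(N/df)` weighting, the summing of scores across documents, and all function names (`literal_coverage`, `tfidf_ranking`, `predicted_coverage`) are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(term):
    """Bring a terminology entry into the same form as the candidate terms."""
    return " ".join(tokenize(term))


def candidates(text, max_len=2):
    """All word n-grams up to max_len words (assumed candidate-term definition)."""
    toks = tokenize(text)
    return [
        " ".join(toks[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(toks) - n + 1)
    ]


def literal_coverage(terms, docs):
    """Fraction of terms that occur literally (whole words) in at least one document."""
    if not terms:
        return 0.0
    padded = [" " + normalize(d) + " " for d in docs]
    found = [t for t in terms if any(" " + normalize(t) + " " in d for d in padded)]
    return len(found) / len(terms)


def tfidf_ranking(docs, max_len=2):
    """Rank candidate terms by TF-IDF summed over all documents (assumed aggregation)."""
    per_doc = [Counter(candidates(d, max_len)) for d in docs]
    df = Counter()
    for counts in per_doc:
        df.update(counts.keys())
    n_docs = len(docs)
    scores = Counter()
    for counts in per_doc:
        for term, tf in counts.items():
            scores[term] += tf * math.log(n_docs / df[term])
    return [term for term, _ in scores.most_common()]


def predicted_coverage(terms, docs, k=None):
    """Fraction of terms among the top-k ranked candidates (k=None keeps all)."""
    if not terms:
        return 0.0
    top = set(tfidf_ranking(docs)[:k])
    return sum(1 for t in terms if normalize(t) in top) / len(terms)
```

Under these assumptions, `predicted_coverage(terms, docs, k=1000)` would play the role of the "1000" column, `predicted_coverage(terms, docs)` the "all" column, and `literal_coverage(terms, docs)` the last column. In this sketch a term can only be predicted if it is literally present, which mirrors the footnote's point that literal containment bounds what text-mining can recover.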