
Table 5 New bigrams in the BI journal corpus in recent years.

From: SYMBIOmatics: Synergies in Medical Informatics and Bioinformatics – exploring current scientific literature for emerging topics

| New in 2004–2005 | Df | New in 2002–2003 | Df | New in 2000–2001 | Df |
|---|---|---|---|---|---|
| protein background | 16 | false discovery | 41 | microarray datum | 268 |
| method conclusion | 12 | discovery rate | 40 | microarray experiment | 183 |
| annotation method | 11 | datum background | 40 | expression profile | 168 |
| dataset result | 11 | microarray study | 36 | microarray data | 161 |
| array cgh | 10 | text mining | 35 | gene ontology | 135 |
| protein localization | 10 | association study | 28 | support vector | 133 |
| organism database | 10 | r package | 26 | vector machine | 130 |
| ontology database | 10 | normalization method | 25 | protein interaction | 99 |
| biocreative task | 9 | multiple testing | 23 | nucleotide polymorphism | 76 |
| entity recognition | 9 | ontology term | 22 | cdna microarray | 73 |
| splicing event | 8 | go term | 21 | microarray technology | 73 |
| name recognition | 8 | gene list | 20 | microarray gene | 65 |
| lowess normalization | 8 | human protein | 20 | differential expression | 59 |
| anatomy ontology | 7 | biomedical text | 19 | open source | 54 |
| novo sequencing | 7 | complex disease | 19 | biological network | 50 |
| task 2 | 6 | microarray result | 18 | microarray analysis | 48 |
| task 1a | 6 | homo sapiens | 18 | widely used | 48 |
| venn diagram | 4 | named entity | 17 | gene selection | 46 |
| database identifier | 4 | synonymous codon | 16 | interaction datum | 37 |
|  |  | gene clustering | 16 | system biology | 34 |
|  |  | mammalian genome | 16 | interacting protein | 33 |
|  |  | bioinformatics analysis | 15 | alternative splicing | 32 |
|  |  | haplotype block | 14 | oligonucleotide microarray | 29 |
|  |  | go annotation | 13 | related gene | 27 |
|  |  | two dataset | 13 | web application | 27 |
|  |  | expression result | 13 | biological sample | 26 |
|  |  | marker gene | 12 | expression value | 23 |
|  |  | dimensionality reduction | 12 | primer design | 22 |

  1. The table shows bigrams from the BI journal corpus that were new during the period 2004–2005 (col. 1 and 2), the period 2002–2003 (col. 3 and 4) and the period 2000–2001 (col. 5 and 6). All bigrams were selected and ranked by their document frequency (Df; see main text), which had to be above 3. During 2000–2001 a large number of bigrams referring to microarray experiments emerged. "task 1a" and "task 2" are exclusively linked to BioCreAtive. "false discovery" refers to the false discovery rate (FDR) in DNA microarray analysis.
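
The selection described in the note (bigrams new to a period, ranked by document frequency, with a frequency above 3) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, "new" is assumed here to mean "absent from every document of the earlier periods", and tokens are assumed to be already lowercased and lemmatized (as forms such as "datum" in the table suggest).

```python
from collections import Counter


def bigrams(tokens):
    """Return the set of adjacent token pairs occurring in one document."""
    return {f"{a} {b}" for a, b in zip(tokens, tokens[1:])}


def document_frequency(docs):
    """Count, for each bigram, the number of documents containing it at least once."""
    df = Counter()
    for tokens in docs:
        # A set per document ensures repeated use within a document counts once.
        df.update(bigrams(tokens))
    return df


def new_bigrams(earlier_docs, period_docs, min_df=4):
    """Rank bigrams that are unseen in earlier_docs by their Df in period_docs.

    min_df=4 mirrors the "above 3" threshold in the table note; the definition
    of "new" used here is an assumption made for illustration.
    """
    seen = document_frequency(earlier_docs)
    df = document_frequency(period_docs)
    fresh = [(bg, n) for bg, n in df.items() if bg not in seen and n >= min_df]
    # Highest document frequency first; ties broken alphabetically.
    return sorted(fresh, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy corpus, invented purely to show the call; not data from the study.
    earlier = [["gene", "expression", "profile"]]
    period = [
        ["false", "discovery", "rate"],
        ["false", "discovery", "rate", "control"],
    ]
    print(new_bigrams(earlier, period, min_df=2))
```

Counting documents rather than occurrences keeps a single paper that repeats a phrase many times from dominating the ranking, which matches the Df columns being document counts.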