
Table 3 Emerging bigrams in the BI journal corpus.

From: SYMBIOmatics: Synergies in Medical Informatics and Bioinformatics – exploring current scientific literature for emerging topics

  

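| Bigram | Doc. freq. | Rank (emerging) | Rank (2000–2005) | Rank (1990–1999) | Bigram | Doc. freq. | Rank (emerging) | Rank (2000–2005) | Rank (1990–1999) |
|---|---|---|---|---|---|---|---|---|---|
| gene expression | 711 | | 1 | 1 | microarray experiment | 184 | 2 | 22 | |
| amino acid | 490 | | 2 | | not only | 181 | | 23 | 15 |
| protein sequence | 438 | | 3 | 2 | microarray data | 169 | 3 | 25 | |
| expression datum | 339 | | 4 | | expression profile | 168 | 4 | 26 | |
| sequence alignment | 321 | | 5 | | gene ontology | 135 | 5 | 37 | |
| supplementary information | 321 | | 6 | | support vector | 133 | 6 | 38 | |
| sequence alignment | 321 | | | 3 | vector machine | 130 | 7 | 41 | |
| dna sequence | 313 | | 7 | 4 | protein interaction | 99 | 8 | 62 | |
| protein structure | 313 | | 8 | 5 | whole genome | 80 | 9 | 74 | |
| freely available | 306 | | 9 | | nucleotide polymorphism | 76 | 10 | 80 | |
| binding site | 295 | | 10 | 6 | cdna microarray | 73 | 11 | 83 | |
| large number | 288 | | 11 | 7 | microarray technology | 73 | 12 | 84 | |
| microarray datum | 268 | 1 | 12 | | microarray gene | 66 | 13 | 85 | |
| neural network | 250 | | 13 | 8 | data mining | 60 | 14 | 87 | |
| secondary structure | 246 | | 14 | 9 | interaction network | 60 | 15 | 88 | |
| new method | 244 | | 15 | 10 | | | | | |
| data set | 236 | | 16 | 11 | | | | | |
| datum set | 224 | | 17 | 12 | | | | | |
| source code | 208 | | 18 | 13 | | | | | |
| markov model | 187 | | 21 | 14 | | | | | |
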
  1. The table lists bigrams extracted from the BI journal corpus (col. 1) together with their document frequency (col. 2) and three ranks. The first rank orders the emerging bigrams (see text; col. 3), the second ranks bigrams by their document frequency during 2000–2005 (col. 4), and the third ranks them by their document frequency during 1990–1999 (col. 5). The table shows that new topics have emerged at high frequency over the last five years.
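
The two period ranks described in the note can be illustrated with a minimal sketch. It assumes that document frequency means the number of documents containing a bigram at least once, and that ranks follow descending document frequency within each period. The toy corpus, the function names and the alphabetical tie-breaking are hypothetical and are not taken from the article. The criterion for the "emerging" rank is defined in the article text and is not reproduced here.

```python
from collections import Counter


def bigrams(tokens):
    """Return the set of adjacent word pairs occurring in one document."""
    return {" ".join(pair) for pair in zip(tokens, tokens[1:])}


def document_frequency(docs):
    """Count, for each bigram, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(bigrams(tokens))
    return df


def rank_by_frequency(df):
    """Map each bigram to its rank (1 = highest document frequency).

    Ties are broken alphabetically; this is an assumption of the sketch.
    """
    ordered = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return {bigram: rank for rank, (bigram, _) in enumerate(ordered, start=1)}


# Hypothetical toy corpus: (publication year, tokenised text).
corpus = [
    (1995, ["protein", "sequence", "alignment"]),
    (1998, ["protein", "sequence", "database"]),
    (2002, ["gene", "expression", "microarray", "data"]),
    (2004, ["gene", "expression", "profile"]),
    (2005, ["microarray", "data", "analysis"]),
]

# Document frequencies and ranks computed separately for each period.
df_early = document_frequency(t for year, t in corpus if 1990 <= year <= 1999)
df_late = document_frequency(t for year, t in corpus if 2000 <= year <= 2005)
rank_early = rank_by_frequency(df_early)
rank_late = rank_by_frequency(df_late)

for bigram in sorted(set(rank_early) | set(rank_late)):
    print(bigram, rank_late.get(bigram, ""), rank_early.get(bigram, ""))
```

A bigram that does not occur in a period has no rank for that period, which is one possible reading of the empty rank cells in the table.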