Table 5 Makeup of the focused subgraphs for selected queries.

From: Hubs of knowledge: using the functional link structure in Biozon to mine for biologically significant entities

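| Query Term | Query Type | Proteins | Nucleic Acid Sequences | Protein Families | Structures | Interactions | Pathways | Total | Connected |
|---|---|---|---|---|---|---|---|---|---|
| autoimmune | protein | 58 | 324 | 1 | 0 | 4 | 0 | 387 | 111 |
| | nucleic | 58 | 369 | 0 | 0 | 0 | 0 | 427 | 150 |
| cancer | protein | 830 | 37023 | 7 | 5 | 109 | 0 | 37974 | 1977 |
| | nucleic | 829 | 37300 | 0 | 1 | 1 | 0 | 38131 | 1920 |
| stromelysin | protein | 46 | 566 | 3 | 2 | 0 | 0 | 617 | 76 |
| ubiquitin | protein | 2372 | 28820 | 9 | 25 | 720 | 1 | 31947 | 6219 |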

  1. For each query, we list the number of instances of each Biozon data type in the corresponding focused subgraph. We experimented with two query types: protein sequence and nucleic acid sequence. It may seem surprising that a structure is included in the focused subgraph for the stromelysin-nucleic search, since protein structures are not directly related to nucleic acid sequences. Such nodes are included by definition because they contain the search term; however, they are ignored (along with all other nodes that have no neighbors) in the subsequent eigencalculations and are assigned a prominence value of 0. The 'Total' column gives the number of nodes in the focused subgraph, while the last column ('Connected') gives the number of those nodes with at least one neighbor in the subgraph. Only nodes that have neighbors are included in our computations.
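
The distinction between 'Total' and 'Connected' nodes, and the rule that isolated nodes receive a prominence of 0, can be illustrated with a small sketch. This is not the authors' implementation: the function names `split_connected` and `prominence`, the toy node labels, the treatment of the subgraph as undirected, and the use of plain power iteration on the shifted adjacency matrix (A + I) are all assumptions made for illustration only.

```python
from collections import defaultdict


def split_connected(nodes, edges):
    """Build an undirected adjacency map and list the nodes with neighbors.

    Edges with an endpoint outside `nodes` are dropped, because only
    neighbors inside the focused subgraph count.
    """
    node_set = set(nodes)
    adj = defaultdict(set)
    for u, v in edges:
        if u in node_set and v in node_set and u != v:
            adj[u].add(v)
            adj[v].add(u)
    connected = [n for n in nodes if adj[n]]
    return adj, connected


def prominence(nodes, edges, iterations=100):
    """Hypothetical prominence score: principal eigenvector by power iteration.

    Isolated nodes are excluded from the iteration and keep a score of 0.
    """
    adj, connected = split_connected(nodes, edges)
    scores = {n: 0.0 for n in nodes}  # isolated nodes stay at 0
    if not connected:
        return scores
    vec = {n: 1.0 for n in connected}
    for _ in range(iterations):
        # Adding vec[n] iterates on (A + I); this shift is an assumption
        # that avoids oscillation on bipartite graphs.
        nxt = {n: vec[n] + sum(vec[m] for m in adj[n]) for n in connected}
        norm = sum(x * x for x in nxt.values()) ** 0.5
        vec = {n: x / norm for n, x in nxt.items()}
    scores.update(vec)
    return scores


# Toy focused subgraph: two proteins linked to one nucleic acid sequence,
# plus a structure that matched the search term but has no neighbors.
nodes = ["protein_1", "protein_2", "nucleic_1", "structure_1"]
edges = [("protein_1", "nucleic_1"), ("protein_2", "nucleic_1")]

_, connected = split_connected(nodes, edges)
scores = prominence(nodes, edges)
print("Total:", len(nodes), "Connected:", len(connected))
print(scores)
```

In this toy case the structure counts toward 'Total' but not toward 'Connected', and it keeps a score of 0, mirroring how the footnote describes nodes without neighbors.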