
Table 5 Makeup of the focused subgraphs for selected queries.

From: Hubs of knowledge: using the functional link structure in Biozon to mine for biologically significant entities

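| Query Term | Query Type | Proteins | Nucleic Acid Sequences | Protein Families | Structures | Interactions | Pathways | Total | Connected |
|---|---|---|---|---|---|---|---|---|---|
| autoimmune | protein | 58 | 324 | 1 | 0 | 4 | 0 | 387 | 111 |
| | nucleic | 58 | 369 | 0 | 0 | 0 | 0 | 427 | 150 |
| cancer | protein | 830 | 37023 | 7 | 5 | 109 | 0 | 37974 | 1977 |
| | nucleic | 829 | 37300 | 0 | 1 | 1 | 0 | 38131 | 1920 |
| stromelysin | protein | 46 | 566 | 3 | 2 | 0 | 0 | 617 | 76 |
| ubiquitin | protein | 2372 | 28820 | 9 | 25 | 720 | 1 | 31947 | 6219 |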
  1. For each query we list the number of instances of each Biozon data type included in the corresponding focused subgraph. We experimented with two query types: protein sequence and nucleic acid sequence. It may be surprising that a structure is included in the focused subgraph for the stromelysin-nucleic search, since protein structures are not directly related to nucleic acid sequences. Such nodes are included by definition because they contain the search term; however, they are ignored in the subsequent eigencalculations (along with all other nodes that have no neighbors) and are assigned a prominence value of 0. The 'Total' column gives the number of nodes in the focused subgraph, and the 'Connected' column gives the number of those nodes that have at least one neighbor in the subgraph. Only nodes that have neighbors are included in our computations.
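The distinction between the 'Total' and 'Connected' columns can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_by_connectivity`, the toy node identifiers, and the choice to treat relations as undirected when counting neighbors are all assumptions made for illustration. The sketch separates nodes that have at least one neighbor inside the focused subgraph from isolated nodes, and gives the isolated nodes a prominence of 0, as the note describes. Scores for connected nodes would come from the eigencalculation, which is not shown here.

```python
from collections import defaultdict


def split_by_connectivity(nodes, edges):
    """Return (connected, isolated) node sets for a focused subgraph.

    Assumption: edges are treated as undirected when deciding whether
    a node has a neighbor; the source does not specify this detail.
    """
    node_set = set(nodes)
    neighbors = defaultdict(set)
    for u, v in edges:
        # Count only edges whose endpoints both lie in the subgraph,
        # and ignore self-loops.
        if u in node_set and v in node_set and u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    connected = {n for n in node_set if neighbors[n]}
    return connected, node_set - connected


# Hypothetical toy subgraph: two proteins linked to one nucleic acid
# sequence, plus a structure that matched the query term but has no
# neighbor inside the subgraph.
nodes = ["P1", "P2", "N1", "S1"]
edges = [("P1", "N1"), ("P2", "N1")]

connected, isolated = split_by_connectivity(nodes, edges)

# Isolated nodes are excluded from the eigencalculation and get 0.
prominence = {n: 0.0 for n in isolated}

print("Total:", len(nodes))          # Total: 4
print("Connected:", len(connected))  # Connected: 3
```

In this toy case the structure `S1` plays the role of the structure mentioned in the note: it is counted in 'Total' because it matches the query, but not in 'Connected'.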