Figure 4 | BMC Bioinformatics

Figure 4

From: GenoLink: a graph-based querying and browsing system for investigating the function of genes and proteins

GenoLink default data model and example of a data graph. (a) Excerpt of the UML diagram describing the main classes and associations of the default data model provided with GenoLink. Classes are indicated by boxes (white arrows indicate inheritance) and association names are indicated in italics. For clarity, class and association attributes are not shown (an example is given in the right part of the figure, for the Polypeptide class). The complete diagram is distributed with the GenoLink software documentation. (b) An example of a data graph based on this data model. It represents a portion of the genome of the bacterium Helicobacter pylori strain 26695 (NCBI RefSeq entry no. NC000915); IRO, ILO, ICF, IIG, HPA, CD and HPIW label edges that are instances of the associations IsRepliconOf, IsLocatedOn, IsCodingFor, IsInGeneOrtholog, HasPolypeptideAnnotation, ContainsDomain and HasPhysicalInteractionWith, respectively. The entire data graph for this genome contains 3197 vertices (1 Organism, 1 Replicon, 1576 ProteinGenes, 43 RNAGenes, 1576 Polypeptides) and 4664 edges (1 IsRepliconOf, 1619 IsLocatedOn, 1576 IsCodingFor and 1468 HasPhysicalInteractionWith). The dashed box displays the attributes of the Polypeptide ureB. COG, EC and IPR data are from the COG database [21], the Enzyme Commission database [24], and the InterPro database [23], respectively. Protein-protein interactions are public data available from Hybrigenics [30] and distributed with GenoLink.
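
The typed graph described in the caption can be sketched in a few lines of Python. This is an illustration only, not GenoLink's actual implementation or API: the `Vertex` and `Edge` classes, the `build_toy_graph` function, the edge directions, and the gene label are assumptions made for this example. The class names, association names, and counts are taken from the caption. The sketch also checks that the per-class and per-association counts add up to the stated totals.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    """A typed vertex: an instance of a class of the data model (hypothetical type)."""
    cls: str
    name: str


@dataclass(frozen=True)
class Edge:
    """A typed edge: an instance of an association of the data model (hypothetical type)."""
    assoc: str
    source: Vertex
    target: Vertex


# Per-class and per-association counts as reported in the caption.
VERTEX_COUNTS = {
    "Organism": 1,
    "Replicon": 1,
    "ProteinGene": 1576,
    "RNAGene": 43,
    "Polypeptide": 1576,
}
EDGE_COUNTS = {
    "IsRepliconOf": 1,
    "IsLocatedOn": 1619,
    "IsCodingFor": 1576,
    "HasPhysicalInteractionWith": 1468,
}

total_vertices = sum(VERTEX_COUNTS.values())
total_edges = sum(EDGE_COUNTS.values())
# Every gene (protein-coding or RNA) is located on the single replicon.
gene_count = VERTEX_COUNTS["ProteinGene"] + VERTEX_COUNTS["RNAGene"]


def build_toy_graph():
    """Build a four-vertex excerpt of the data graph.

    Edge directions are assumed from the association names, and the gene
    label is illustrative rather than taken from the figure.
    """
    organism = Vertex("Organism", "Helicobacter pylori 26695")
    replicon = Vertex("Replicon", "NC000915")
    gene = Vertex("ProteinGene", "gene coding for ureB (illustrative label)")
    polypeptide = Vertex("Polypeptide", "ureB")
    vertices = [organism, replicon, gene, polypeptide]
    edges = [
        Edge("IsRepliconOf", replicon, organism),
        Edge("IsLocatedOn", gene, replicon),
        Edge("IsCodingFor", gene, polypeptide),
    ]
    return vertices, edges


if __name__ == "__main__":
    vertices, edges = build_toy_graph()
    print(total_vertices, total_edges, gene_count)
    print(Counter(edge.assoc for edge in edges))
```

The sums reproduce the totals given in the caption (3197 vertices and 4664 edges), and the 1619 IsLocatedOn edges match the 1576 ProteinGenes plus 43 RNAGenes, each located on the single Replicon.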