- Research article
- Open Access
Integration of the Gene Ontology into an object-oriented architecture
© Shegogue and Zheng; licensee BioMed Central Ltd. 2005
Received: 30 December 2004
Accepted: 10 May 2005
Published: 10 May 2005
To standardize gene product descriptions, a formal vocabulary defined as the Gene Ontology (GO) has been developed. GO terms have been categorized into biological processes, molecular functions, and cellular components. However, there is no single representation that integrates all the terms into one cohesive model. Furthermore, GO definitions have little information explaining the underlying architecture that forms these terms, such as the dynamic and static events occurring in a process. In contrast, object-oriented models have been developed to show dynamic and static events. A portion of the TGF-beta signaling pathway, which is involved in numerous cellular events including cancer, differentiation and development, was used to demonstrate the feasibility of integrating the Gene Ontology into an object-oriented model.
Using object-oriented models we have captured the static and dynamic events that occur during a representative GO process, "transforming growth factor-beta (TGF-beta) receptor complex assembly" (GO:0007181).
We demonstrate that the utility of GO terms can be enhanced by object-oriented technology, and that the GO terms can be integrated into an object-oriented model by serving as a basis for the generation of object functions and attributes.
Complexity combined with imprecise terminology has hindered the understanding of biology. A formal and structured vocabulary is now being developed to address this imprecision. This vocabulary, the Gene Ontology (GO), is being developed by the Gene Ontology Consortium (GOC) to standardize the descriptions of gene products. Ontologies define the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary. Despite these efforts, the mechanism of representing these terms lacks a unifying architecture that can be applied to the annotation of a gene product. However, computer science has developed a well-defined process and methodology for the development of software models. Adapting this process and methodology can orchestrate the assembly of biological models with integrated gene ontologies. In doing so, a standardized terminology and object-oriented model is created that can facilitate communication between biologists and computer scientists.
The Gene Ontology project is a collaborative effort that addresses the need for a controlled vocabulary that provides a consistent description of gene products in different databases . The GO collaborators are developing three structured, controlled vocabularies that describe gene products, which have been classified into molecular function, biological process, and cellular component domains. GO terms are organized in structures called directed acyclic graphs (DAGs), which differ from hierarchies in that a 'child' (more specialized term) can have many 'parents' (less specialized terms). As part of these graphs, each component is given a GOid (unique identifier), and is associated with a GO definition. Collectively, these agreed upon terms are being developed to help explain various aspects of biology. When applied to a gene, that gene is annotated with a concise description of its molecular function, cellular location and associated biological processes. However, the GOC never intended to represent gene products or correlate ontological terms with these gene products . To address this need, a Gene Ontology Annotation database  has been created to associate the GO terms with their gene product counterparts. With sustained effort, the descriptions of these gene products will ultimately be established. Still, much of the current bioinformatics work regarding GO has focused on constructing databases [4–7], applying it to other research areas [8–22], and building tools to mine the GO database. (For a description of some of these tools see .)
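The difference between a directed acyclic graph and a strict hierarchy can be illustrated with a minimal sketch. The term names below are hypothetical placeholders rather than real GO identifiers, and the `ancestors` helper is an assumption made for illustration, not part of any GO tool.

```python
from collections import deque

# Hypothetical toy graph: each term maps to its direct parents.
# The names are placeholders, not real GO terms.
PARENTS = {
    "root": [],
    "parent_a": ["root"],
    "parent_b": ["root"],
    "child": ["parent_a", "parent_b"],  # two parents: allowed in a DAG, not in a tree
}


def ancestors(term, parents=PARENTS):
    """Return every less specialized term reachable from `term`."""
    seen = set()
    queue = deque(parents[term])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen
```

Because a child may be reached through several parents, the traversal tracks visited terms so that shared ancestors are collected only once.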
In addition, there has been an ongoing discussion regarding the depth of information obtained from the Gene Ontology. It has been noted that there remains a need for a unifying architecture that integrates all three GO domains as part of a gene product's annotation. Furthermore, to enhance the Gene Ontology and facilitate its use as a cross-disciplinary tool, several additional issues need to be addressed. First, relationships between the biological processes, molecular functions and cellular components are not readily apparent [25–28]. Second, GO terms lack details. For instance, a molecular function term gives no indication of its inputs or outputs. Finally, existing tools such as GO-DEV only contain software used for tool development and information retrieval, not software modeled directly after the three domains of the Gene Ontology. However, these issues can be resolved by integrating the Gene Ontology into an object-oriented system.
Table 1. The use of object-oriented concepts in the integration of the Gene Ontology into an object-oriented model. Object-oriented terms, their definitions, and corresponding mechanisms of incorporating GO terms into an object-oriented model are shown. A specific example from the manuscript is also given. GO, Gene Ontology; DAG, directed acyclic graph; OOM, object-oriented model.

| Object-Oriented Term | Object-Oriented Definition * | Object-Oriented use of the GO | Example |
|---|---|---|---|
| Class | A class is a template from which object instances are created. It specifies the common characteristics that objects created from it will contain | Classes are created from gene products whose characteristics are defined by the GO molecular function and cellular component terms | The class Smad 2 is created based on the properties of the gene product Smad 2, which are defined by molecular functions such as 'protein homodimerization' (GO:0042803) and 'phosphate binding' (GO:0042301) |
| Object | An instance of a class that contains unique properties | Objects are created from the template classes, but may contain properties unique to a particular object | Two different Smad 2 objects may be created, one of which is phosphorylated, and one of which is not |
| Inheritance | Relationships between classes, whereby a more specific class inherits all the properties and methods of the classes it belongs to | Relationships defined by 'is a' are generalizations in which child classes of the DAG inherit the properties of the parent class (if a child class has multiple parent classes, multiple inheritance applies) | The cellular component 'plasma membrane' (GO:0005886) inherits the properties of the general class cellular component 'membrane' (GO:0016020) |
| Composition | Certain objects may be assembled from collections of other objects | 'part_of' relationships defined in the GO DAG are rendered as composition relationships in an OOM | The 'membrane' (GO:0016020) and 'intracellular' (GO:0005622) space are part of the 'cell' (GO:0005623) |
| Polymorphism | The ability of the same message to be interpreted differently when received by different objects | GO functions may change for different proteins and be given different input and output values | The function 'protein homodimerization activity' (GO:0042803) in the context of SMAD2 accepts two SMAD2s and outputs a dimerized SMAD2, whereas in the context of TGF-beta receptor II it accepts two receptors and outputs a dimerized receptor |
| Encapsulation | Hiding the state and implementation of an object | The exact mechanism by which an object is created is hidden in an OOM | The details involved in the translation (GO:0043037) of Smad 2 are hidden, but a Smad 2 molecule is still created |
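The inheritance and composition mappings described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the class names and the `parts` attribute are assumptions, while the GO identifiers are the ones quoted for these cellular components in this article.

```python
class CellularComponent:
    """Base class standing in for the GO cellular component domain."""

    go_id = None

    def __init__(self, parts=()):
        # Composition: a component may be assembled from other components ('part_of').
        self.parts = list(parts)


class Membrane(CellularComponent):
    go_id = "GO:0016020"


class PlasmaMembrane(Membrane):  # 'is a' relationship rendered as inheritance
    go_id = "GO:0005886"


class Intracellular(CellularComponent):
    go_id = "GO:0005622"


class Cell(CellularComponent):
    go_id = "GO:0005623"


# A cell composed of a membrane and an intracellular space.
cell = Cell(parts=[Membrane(), Intracellular()])
```

Here `PlasmaMembrane` acquires everything defined on `Membrane`, while `Cell` simply holds references to its parts, which mirrors the distinction between 'is a' and 'part_of' in the DAG.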
The functions of a gene product are the jobs or abilities that it has. In the GO terminology these are described in the molecular function domain. These are analogous to the operations that an object can perform in an object-oriented paradigm. Attributes, which define key properties of a component that when changed may alter the function of that component, may be defined by the cellular component and molecular function sections. For example, the cellular component domain can specify the place in a cell where a gene product is located. When there are multiple cellular components associated with a gene product, however, there is currently no mechanism to designate which cellular component represents the appropriate location.
The unified modeling language (UML) has been used to capture various aspects of biology [30–32]. These examples highlight the utility of UML as a tool for biological data integration, and indicate that it can be applied to construct large, complex biological models. Therefore, to demonstrate the feasibility of integrating the Gene Ontology into an object-oriented model we have created UML representations of a GO biological process, "transforming growth factor beta (TGF-beta) receptor complex assembly" (GO:0007181).
The TGF-beta receptor pathway is involved in numerous cellular events including apoptosis, tumor development, differentiation, and development. These processes stem from the binding of TGF-beta to its cellular receptors. Briefly, dimerized TGF-beta 1 binds to TGF-beta receptor II (TβRII) and then TGF-beta receptor I (TβRI) complexes, causing their tetramerization (two type II receptors and two type I receptors) [34–36]. Constitutively activated type II receptor phosphorylates and activates type I receptor. Type I receptor propagates the signal by phosphorylating Smad 2, which is presented by the Smad Anchor for Receptor Activation (SARA). Phosphorylation of Smad 2 destabilizes its interaction with SARA, releasing it. On TGF-beta stimulation, Smad 2 forms heterotrimeric complexes with Smad 4, accumulates in the nucleus, binds DNA and remains there for several hours [39–42]. Dephosphorylation allows Smad 2 to dissociate from Smad 4 and to be exported to the cytoplasm [43, 44]. If the receptors are no longer active, then the Smads accumulate over time in the cytoplasm. Alternatively, activated Smad 2 is ubiquitinated in the nucleus and undergoes proteasome-mediated degradation.
To create a unified model using the Gene Ontology we have taken the biological process term, "transforming growth factor beta (TGF-beta) receptor complex assembly" (GO:0007181), and used object-oriented models to define its dynamic and static architecture. We also show that one can augment the biological process domain terms by using the ontological terms and gene products associated with this process, and integrating them into an object-oriented model. Furthermore, we show that the molecular function, and cellular component domains can serve as a basis for the generation of object functions and attributes to create a standardized, comprehensive, and integrated model encompassing all the Gene Ontology domains.
Converting GO directed acyclic graphs to object-oriented diagrams
The gene product functions described herein are listed with their associated GO molecular functions and parameters. These gene product functions are mapped to corresponding Gene Ontology molecular functions. These GO functions are integrated into an object-oriented model by amending them with input and output parameters, thereby creating object functions.
Gene Product Function
Corresponding GO Term and GO ID
protein homodimerization activity (GO:0042803)
bind TGF-beta receptor
TGF-beta receptor binding (GO:0005160)
protein homodimerization activity (GO:0042803)
TGF-beta binding (GO:0050431)
protein heterodimerization activity (GO:0046982)
transferase activity (GO:0016740)
protein homodimerization activity (GO:0042803)
protein heterodimerization activity (GO:0046982)
phosphate binding (GO:0042301)
Smad binding (GO:0046332)
transferase activity (GO:0016740)
TGF-beta receptor binding (GO:0005160)
phosphate binding (GO:0042301)
protein homodimerization activity (GO:0042803)
DNA binding (GO:0003677)
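One way to picture a GO function "amended with input and output parameters" is as a small record that keeps the GO identifier alongside its parameters. The sketch below is an assumption made for illustration: the `ObjectFunction` type, the `by_go_id` helper and the parameter values are hypothetical, and are modeled on the homodimerization example given earlier in the article rather than on the full table.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectFunction:
    """A GO molecular function amended with input and output parameters."""

    gene_product: str
    go_term: str
    go_id: str
    inputs: tuple
    output: str


# Illustrative entries only; inputs and outputs are assumed, not taken from the table.
FUNCTIONS = [
    ObjectFunction("Smad 2", "protein homodimerization activity", "GO:0042803",
                   ("Smad 2", "Smad 2"), "Smad 2 dimer"),
    ObjectFunction("TGF-beta receptor II", "protein homodimerization activity", "GO:0042803",
                   ("TGF-beta receptor II", "TGF-beta receptor II"),
                   "TGF-beta receptor II dimer"),
]


def by_go_id(functions):
    """Group object functions by the GO term they cross-reference."""
    index = defaultdict(list)
    for function in functions:
        index[function.go_id].append(function)
    return dict(index)
```

Grouping by GO identifier makes it visible that one GO term can back several object functions, each with its own inputs and output.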
We conclude that it is feasible to create standardized functions for objects based on the current literature and an approved ontology. Together, ontological terms can be integrated into an object-oriented model paralleling the relationships, capturing the inherited aspects of the GO terminology, and providing a compact architecture while maintaining a standardized notation.
Sequence diagram generation
Activity diagram generation
Class diagram generation
In addition, the UML notation provides a mechanism to specify inheritance that may be used to indicate an object that is the foundation for other objects. For instance, a TGF-beta receptor object might be a generalization of the TGF-beta receptor I object (data not shown). These specific objects inherit the properties of the receptor object. In addition, binary associations containing cardinalities may indicate the number of objects interacting with another. For instance, TGF-beta can interact with one to many receptors, while a receptor can only interact with one TGF-beta at a time (Fig. 4). Cellular compartments where these gene products can be found are also shown. Here, guard conditions are added to distinguish conditions under which each gene product might be found in a particular cellular compartment. In this way, a spatial representation of the TGF-beta receptor complex components is also achieved. These class diagrams demonstrate that the static structure of a biological system can be represented as an object-oriented model with integrated Gene Ontology terms. Collectively, the models generated using the described object-oriented methodology yield a software system representation of a biological system, capturing both static and dynamic relationships annotated with integrated Gene Ontology terms.
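The cardinalities on a binary association can be enforced directly in code. The following sketch assumes hypothetical `TGFBeta` and `Receptor` classes and a `bind` method. It illustrates only the multiplicity rule stated above (one ligand to many receptors, at most one ligand per receptor), not the full class diagram.

```python
class Receptor:
    def __init__(self, name):
        self.name = name
        self.ligand = None  # multiplicity 0..1: at most one TGF-beta at a time


class TGFBeta:
    def __init__(self):
        self.receptors = []  # multiplicity 1..*: one to many receptors once bound

    def bind(self, receptor):
        """Associate this ligand with a receptor, enforcing the receptor-side limit."""
        if receptor.ligand is not None:
            raise ValueError(f"{receptor.name} is already bound to a ligand")
        receptor.ligand = self
        self.receptors.append(receptor)
```

A second ligand that attempts to bind an occupied receptor raises an error, which is the code-level counterpart of the cardinality annotation on the association.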
We have utilized the Gene Ontology to construct an object-oriented representation of the initial steps of TGF-beta signaling, and the gene products contained therein. In doing so, we have provided a standardized framework for the integration of Gene Ontology terms into gene product descriptions. By capturing all of the relevant GO terms in one model, the disjointed GO vocabulary is assembled into a cohesive structure. This cohesive structure encompasses the fundamental concepts of the object-oriented paradigm.
We proposed a solution to three unaddressed issues within the current Gene Ontology. First, while the Gene Ontology has helped to formalize the vocabulary that describes biological systems, it lacks a specific integration method. Currently, when applied to gene products, Gene Ontology terms are only categorically listed. Second, the Gene Ontology domains, biological process, molecular function and cellular component lack coherence. In particular, no association exists between domains. Finally, the current Gene Ontology defines GO terms, but gives no indication of what is necessary to accomplish a particular function, or process. To resolve these problems we defined an object-oriented methodology and architecture that provides a unifying framework to integrate all Gene Ontology domains.
The object-oriented paradigm revolves around several key concepts. Specifically, an object-oriented framework should accommodate the class, object, inheritance, composition, encapsulation and polymorphism concepts. As shown in Table 1, gene products and other bioentities can be decomposed into objects, which are created based on template classes. These objects utilize inheritance to acquire the attributes and properties of more general objects. Complex classes can also be decomposed into component classes using composition. Encapsulation allows the simplification of the model without sacrificing functionality. For instance, we do not need to know specific details regarding how a gene product is translated, just that a process that is encapsulated by the function 'translate()' can create a protein. However, if we wished to delve deeper into the mechanics of the translation process the layered architecture of the object-oriented system would allow us to do so. It is also worth noting that the modular nature of the object-oriented system closely resembles the recently discovered modular structure of biological networks [46–48]. This resemblance further suggests that biological systems can be readily modeled as object-oriented systems. Finally, polymorphism allows one to describe shared functions among different gene products. In this way, a function that may be shared broadly with other gene products can be uniquely specified for a particular gene product.
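The encapsulation of translation behind a 'translate()' function might look like the sketch below. The `Gene` and `Protein` classes and the private `_assemble` helper are hypothetical names chosen for illustration, and the hidden step is a stub rather than a model of the actual mechanism.

```python
class Protein:
    def __init__(self, name):
        self.name = name


class Gene:
    def __init__(self, product_name):
        self._product_name = product_name  # leading underscore: internal state

    def translate(self):
        """Public interface: callers only learn that a protein is created."""
        return self._assemble()

    def _assemble(self):
        # Placeholder for the hidden mechanics. A deeper layer of the model
        # could replace this stub without changing callers of translate().
        return Protein(self._product_name)


smad2 = Gene("Smad 2").translate()
```

Callers depend only on `translate()`, so a more detailed model of the process can be layered in later without touching the rest of the system.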
By applying object-oriented methodologies and concepts the various domains of the Gene Ontology can be coordinated into one model. Currently, the mechanisms in the biological process domain are veiled. There is no indication as to what gene products form the biological process, or what molecular functions are necessary to accomplish the process. Furthermore, the outcome of a specific process is not obvious. As in our example, a process such as TGF-beta receptor complex assembly (GO:0007181) does not give any indication of the components, dynamics or outcomes that occur during this process. However, by incorporating GO terms as attributes and functions we can discern relationships between the three domains. Likewise, the cellular components domain does not provide temporal or spatial clues when applied to gene products. For instance, GO terms 'extracellular' and 'intracellular' may both be associated with a particular gene product. However, the distinction between when a gene product is extracellular and when it is intracellular is not apparent. By applying object-oriented principles we can set extracellular and intracellular to Boolean values, and we can specify which location is the current (true) location of a gene product.
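The Boolean location attributes described above can be sketched as follows. The `GeneProduct` class and its `move_to` method are assumptions introduced for illustration; only the idea of marking exactly one associated location as current (true) comes from the text.

```python
class GeneProduct:
    LOCATIONS = ("extracellular", "intracellular")

    def __init__(self, name, location):
        self.name = name
        self.extracellular = False
        self.intracellular = False
        self.move_to(location)

    def move_to(self, location):
        """Mark `location` as the current (True) location and clear the other."""
        if location not in self.LOCATIONS:
            raise ValueError(f"unknown location: {location}")
        for candidate in self.LOCATIONS:
            setattr(self, candidate, candidate == location)
```

Both cellular component terms stay associated with the gene product, but at any moment only one flag is true, which supplies the spatial information that a plain list of GO terms lacks.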
In addition, by using object-oriented principles a GO molecular function term can be augmented with parameters and outcomes. For example, the function "GO:0046982: protein heterodimerization activity" has different input and output parameters depending on the particular protein that contains the function. This type of polymorphic behavior, where one function can be performed in multiple ways is not supported by the Gene Ontology. For example, protein A may heterodimerize with protein B, whereas protein C heterodimerizes with protein D. From the Gene Ontology it is not readily apparent as to what is being inputted into the dimerization function. However, by applying an object-oriented architecture to function "GO:0046982: protein heterodimerization activity" we get "GO:0046982 (in: Protein A, in: Protein B): Protein AB". This format is an improvement to the unparameterized GO term in that the function can be cross-referenced to protein heterodimerization activity via its GO term, and we also see that for protein A to heterodimerize we need both protein A and protein B. In addition, we now observe that a new entity called protein AB is created from this function. By capturing the above details in an object-oriented model the GO term becomes far more useful for both biologists and computer scientists. Using an object-oriented approach the Gene Ontology domains are integrated into one cohesive model.
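The polymorphic behavior described in this paragraph can be sketched with method overriding. The classes below are hypothetical and use the placeholder proteins A, B, C and D from the text. Each subclass supplies its own version of the function cross-referenced to GO:0046982.

```python
class Protein:
    GO_HETERODIMERIZATION = "GO:0046982"  # protein heterodimerization activity

    def __init__(self, name):
        self.name = name

    def heterodimerize(self, partner):
        raise NotImplementedError(f"{self.name} has no heterodimerization partner defined")


class ProteinA(Protein):
    def heterodimerize(self, partner):
        # GO:0046982 (in: Protein A, in: Protein B): Protein AB
        if partner.name != "B":
            raise ValueError(f"Protein A does not heterodimerize with {partner.name}")
        return Protein("AB")


class ProteinC(Protein):
    def heterodimerize(self, partner):
        # GO:0046982 (in: Protein C, in: Protein D): Protein CD
        if partner.name != "D":
            raise ValueError(f"Protein C does not heterodimerize with {partner.name}")
        return Protein("CD")
```

The same message, `heterodimerize`, is accepted by both proteins, yet each checks for its own partner and returns its own product, while the shared GO identifier keeps the cross-reference to the ontology.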
Integration of the Gene Ontology terms into an object-oriented representation offers several additional benefits. The object-oriented model provides additional levels of detail not found in the Gene Ontology. One of the strengths of object-oriented technology is the ability to capture the dynamics of a system. For example, sequence diagrams can chronologically order events in a biological process. Activity diagrams afford one the opportunity to envision different scenarios that might be occurring in a process. This additional level of detail significantly increases the depth of information that can be applied to the description of a biological process. State-transition diagrams also contribute to the realization of the full dynamics of a process by allowing the visualization of gene product states within a process. Furthermore, UML models can be translated into code, facilitating the creation of simulations.
The standardization of biological system modeling and integration is growing rapidly. A widely accepted example of the drive toward standardization is the Systems Biology Markup Language (SBML) , which has been adopted by more than 70 software tools . The Gene Ontology is another example. However, each of the technologies, the Gene Ontology, the object-oriented approach, and SBML, has strengths and weaknesses. The Gene Ontology provides a standardized vocabulary but contains disconnected domains with no details regarding terms. SBML was developed to communicate biological models, with an emphasis on mathematical modeling of biological systems, but does not specify how to construct these models. Object-oriented technologies, on the other hand, provide a well-defined process for model creation and visualization, but have not been standardized for biology. However, the Gene Ontology, object-oriented paradigm, and SBML can form a new synergism when jointly applied to a common biological system model. These technologies are steps toward a unified approach to biological information integration, and studying biological phenomena at the systems level. Together, this unified approach will make biological system integration and analysis consistent, manageable and controllable, which is essential in handling complex systems, as demonstrated by decades of software industry experience.
While the described object-oriented approach can significantly enhance the annotation of gene products using the Gene Ontology, several challenges will need to be addressed. Object-orientation was not designed specifically for use in biological systems, so its use in capturing biological systems is not well defined. Furthermore, the Gene Ontology is still expanding and undergoing revisions. Consequently, in the near future it will still be necessary to do literature searches to define all the gene ontologies associated with a gene product. Automated extraction of information for UML model generation and software implementation for simulations is under development, but is beyond the scope of this paper.
Future systems may also be implemented as software libraries in object-oriented programming languages (C++ and Java) for computer scientists to construct software for various applications and can be distributed as part of the GO-DEV toolkit for Gene Ontology development . In addition, reformatting gene products with Gene Ontology terms will require the cooperation of multiple groups of biologists and computer scientists. However, we must take into consideration that a primary issue with this approach is the lack of people with cross-disciplinary skills able to comprehend both the biology and the computer science. Nonetheless, our own experience has shown that with supervision one biologist without a formal computer science background can learn to model a biological system using UML in a matter of months. Furthermore, automation of some of the annotation process will significantly reduce the human effort, but not eliminate the need for human annotators. Additional standards for automation will also need to be developed to thoroughly specify the process of object-oriented biological system integration. Despite these challenges the ultimate goal of creating a library of UML objects or modules integrated with Gene Ontology attributes and functions is worthwhile. Through this endeavor, biological processes could be assembled from these libraries for the development of simulation tools that will increase the productivity of biologists through increased insight into disease pathways and mechanisms.
Here, we have demonstrated that Gene Ontology terms can be integrated into an object-oriented model. Furthermore, the object-oriented technology and methodologies used for this integration should improve the usability of these terms, and increase the depth of information that they contain. This work also serves as a framework for reverse-engineering biological gene products as objects in an object-oriented system. Together, this should facilitate additional collaborations between biologists and computer scientists.
UML representations of the TGF-beta receptor complex assembly process were created following a software engineering process consisting of phases of requirement-gathering, analysis and design. UML models were generated using Microsoft Visio Pro. AmiGO  was used to determine Gene Ontology links for TGF-beta gene products.
To define the requirements and collect the information necessary for the generation of the models, two approaches were necessary. First, annotations of the TGF-beta signaling pathways were conducted during an extensive literature review. Second, gene ontologies and Uniprot entries were searched to assign Gene Ontology terms to gene products. The attributes and the interactions of the TGF-beta signaling components were captured using class-responsibility collaboration (CRC) cards as described previously  [see Additional files 1, 2, 3, 4].
Use case development
Based on the gathered information, best-case and alternative scenarios were developed within a so-called "use case" to describe the TGF-beta receptor complex assembly process [see Additional file 5]. The use case also serves to define the boundary and scope of the TGF-beta model. For demonstration purposes the boundary of the system was limited to the steps of TGF-beta receptor complex assembly. Therefore, alternative events such as receptor ubiquitination and degradation, as well as the specifics of SMAD 2 mobility, were not captured in the dynamic models (i.e. sequence diagram).
Conceptual model generation
To provide an overview of the system and its interrelationships, a conceptual model based on the information defined in the requirement-gathering phase was generated [see Additional file 6]. This conceptual model integrated biological information, and represented TGF-beta and the cellular components involved in the complex assembly and their relationships in UML notation. By applying object-oriented analysis, the TGF-beta receptor complex assembly was decomposed into objects and component relationships were realized. However, information regarding component properties is hidden through encapsulation. This conceptual model defines the organization of the biological system and provides an overview of the components and their relationships.
State diagram generation
The dynamics of the system can also be captured using state diagrams, which describe the different states in which a cellular component can exist and the transitions between them [see Additional file 7]. In addition, multiple concurrent states can be illustrated using this UML notation.
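A state diagram of this kind can be approximated by a transition table. The states and events below are assumptions for illustration, based on the phosphorylated and unphosphorylated Smad 2 objects mentioned in this article. They are not a transcription of the additional file.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("unphosphorylated", "phosphorylate"): "phosphorylated",
    ("phosphorylated", "dephosphorylate"): "unphosphorylated",
}


class Smad2:
    def __init__(self):
        self.state = "unphosphorylated"

    def fire(self, event):
        """Apply an event, moving to the next state if the transition is defined."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition for {event!r} from state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in a table, rather than in branching code, keeps the sketch close to the diagram: each entry corresponds to one arrow.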
Sequence, activity, class diagram generation
Sequence, activity and class diagrams have been used as an example to demonstrate the feasibility of generating an object-oriented representation of the biological process described by the GO term TGF-beta receptor complex assembly (GO:0007181), with Gene Ontology terms applied to generate these diagrams. Objects representing corresponding gene products are created, and their essential attributes are captured. Interactions among objects are also identified. For each interaction, a corresponding method is generated. This method is matched to a Gene Ontology term. The nature of the interaction determines the method parameters. The sequence of events is captured, and used to generate sequence diagrams. Scenarios are also generated for object interactions, and used to generate activity diagrams. The information captured in the sequence and activity diagrams is used, along with the gene products' attributes, to generate class diagrams.
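The steps above (identify interactions, generate a method for each, match it to a GO term, then order the events) can be sketched as a list of interaction records rendered as sequence-diagram lines. The method names, the `Interaction` type and the pairing of each event with a GO identifier are illustrative assumptions; the identifiers themselves are ones listed for TGF-beta receptor binding and transferase activity in this article.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    sender: str
    receiver: str
    method: str  # hypothetical method name generated for the interaction
    go_id: str   # GO molecular function the method is matched to


# Assumed ordering of the first receptor complex assembly events.
EVENTS = [
    Interaction("TGF-beta", "TbRII", "bindReceptor", "GO:0005160"),
    Interaction("TbRII", "TbRI", "phosphorylate", "GO:0016740"),
    Interaction("TbRI", "Smad 2", "phosphorylate", "GO:0016740"),
]


def render_sequence(events):
    """Render interactions in chronological order, one message per line."""
    return [
        f"{step}. {e.sender} -> {e.receiver}: {e.method}() [{e.go_id}]"
        for step, e in enumerate(events, start=1)
    ]
```

Each rendered line corresponds to one message arrow in a sequence diagram, with the GO identifier kept as an annotation on the method.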
Daniel Shegogue is supported by NLM training grant 5-T15-LM007438-02. W. Jim Zheng is partly supported by a grant (DE-FG02-01ER63121) from the Department of Energy.
- Gene Ontology Consortium [http://www.geneontology.org/]
- Lambrix P, Habbouche M, Perez M: Evaluation of ontology development tools for bioinformatics. Bioinformatics 2003, 19(12):1564–1571. 10.1093/bioinformatics/btg194
- Gene Ontology Annotation [http://www.ebi.ac.uk/goa]
- Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J, Binns D, Harte N, Lopez R, Apweiler R: The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res 2004, 32(90001):D262–266. 10.1093/nar/gkh021
- Dwight SS, Harris MA, Dolinski K, Ball CA, Binkley G, Christie KR, Fisk DG, Issel-Tarver L, Schroeder M, Sherlock G, Sethuraman A, Weng S, Botstein D, Cherry JM: Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Res 2002, 30(1):69–72. 10.1093/nar/30.1.69
- Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C, Richter J, Rubin GM, Blake JA, Bult C, Dolan M, Drabkin H, Eppig JT, Hill DP, Ni L, Ringwald M, Balakrishnan R, Cherry JM, Christie KR, Costanzo MC, Dwight SS, Engel S, Fisk DG, Hirschman JE, Hong EL, Nash RS, Sethuraman A, Theesfeld CL, Botstein D, Dolinski K, Feierbach B, Berardini T, Mundodi S, Rhee SY, Apweiler R, Barrell D, Camon E, Dimmer E, Lee V, Chisholm R, Gaudet P, Kibbe W, Kishore R, Schwarz EM, Sternberg P, Gwinn M, Hannick L, Wortman J, Berriman M, Wood V, de la Cruz N, Tonellato P, Jaiswal P, Seigfried T, White R: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 2004, (32 Database):D258–261.
- Lu P, Szafron D, Greiner R, Wishart DS, Fyshe A, Pearcy B, Poulin B, Eisner R, Ngo D, Lamb N: PA-GOSUB: a searchable database of model organism protein sequences with their predicted Gene Ontology molecular function and subcellular localization. Nucleic Acids Res 2005, (33 Database):D147–153.
- Adryan B, Schuh R: Gene-Ontology-based clustering of gene expression data. Bioinformatics 2004, 20(16):2851–2852. 10.1093/bioinformatics/bth289
- Ahn WS, Kim KW, Bae SM, Yoon JH, Lee JM, Namkoong SE, Kim JH, Kim CK, Lee YJ, Kim YW: Targeted cellular process profiling approach for uterine leiomyoma using cDNA microarray, proteomics and gene ontology analysis. Int J Exp Pathol 2003, 84(6):267–279. 10.1111/j.0959-9673.2003.00362.x
- Arciero C, Somiari SB, Shriver CD, Brzeski H, Jordan R, Hu H, Ellsworth DL, Somiari RI: Functional relationship and gene ontology classification of breast cancer biomarkers. Int J Biol Markers 2003, 18(4):241–272.
- Badea L: Functional discrimination of gene expression patterns in terms of the gene ontology. Pac Symp Biocomput 2003, 565–576.
- Chou KC, Cai YD: A new hybrid approach to predict subcellular localization of proteins by incorporating gene ontology. Biochem Biophys Res Commun 2003, 311(3):743–747. 10.1016/j.bbrc.2003.10.062
- Deng M, Tu Z, Sun F, Chen T: Mapping Gene Ontology to proteins based on protein-protein interaction data. Bioinformatics 2004, 20(6):895–902. 10.1093/bioinformatics/btg500
- Feng W, Wang G, Zeeberg BR, Guo K, Fojo AT, Kane DW, Reinhold WC, Lababidi S, Weinstein JN, Wang MD: Development of gene ontology tool for biological interpretation of genomic and proteomic data. AMIA Annu Symp Proc 2003, 839.
- Jensen LJ, Gupta R, Staerfeldt HH, Brunak S: Prediction of human protein function according to Gene Ontology categories. Bioinformatics 2003, 19(5):635–642. 10.1093/bioinformatics/btg036
- Lagreid A, Hvidsten TR, Midelfart H, Komorowski J, Sandvik AK: Predicting gene ontology biological process from temporal gene expression patterns. Genome Res 2003, 13(5):965–979. 10.1101/gr.1144503
- Li S, Becich MJ, Gilbertson J: Microarray data mining using gene ontology. Medinfo 2004, 11: 778–782.Google Scholar
- Lu X, Zhai C, Gopalakrishnan V, Buchanan BG: Automatic annotation of protein motif function with Gene Ontology terms. BMC Bioinformatics 2004, 5(1):122. 10.1186/1471-2105-5-122PubMed CentralView ArticlePubMedGoogle Scholar
- Masseroli M, Martucci D, Pinciroli F: Towards biological knowledge mining by statistical analysis of gene ontology annotations. Medinfo 2004, 2004(CD):1745.Google Scholar
- Pinto FR, Cowart LA, Hannun YA, Rohrer B, Almeida JS: Local correlation of expression profiles with gene annotations – proof of concept for a general conciliatory method. Bioinformatics 2005, 21: 1037–1045. 10.1093/bioinformatics/bti074View ArticlePubMedGoogle Scholar
- Schug J, Diskin S, Mazzarelli J, Brunk BP, Stoeckert CJ Jr: Predicting gene ontology functions from ProDom and CDD protein domains. Genome Res 2002, 12(4):648–655. 10.1101/gr.222902PubMed CentralView ArticlePubMedGoogle Scholar
- Vinayagam A, Konig R, Moormann J, Schubert F, Eils R, Glatting KH, Suhai S: Applying Support Vector Machines for Gene Ontology based gene function prediction. BMC Bioinformatics 2004, 5(1):116. 10.1186/1471-2105-5-116PubMed CentralView ArticlePubMedGoogle Scholar
- Gene Ontology Tools[http://www.geneontology.org/GO.tools.shtml]
- Ashburner M, Mungall CJ, Lewis SE: Ontologies for biologists: a community model for the annotation of genomic data. Cold Spring Harb Symp Quant Biol 2003, 68: 227–235. 10.1101/sqb.2003.68.227View ArticlePubMedGoogle Scholar
- Zhang S, Bodenreider O: Comparing Associative Relationships among Equivalent Concepts Across Ontologies. Medinfo 2004, 11: 459–466.Google Scholar
- Smith B, Williams J, Schulze-Kremer S: The ontology of the gene ontology. AMIA Annu Symp Proc 2003, 609–613.Google Scholar
- Ogren PV, Cohen KB, Acquaah-Mensah GK, Eberlein J, Hunter L: The compositional structure of Gene Ontology terms. Pac Symp Biocomput 2004, 214–225.Google Scholar
- Smith B, Kumar A: Controlled vocabularies in bioinformatics: a case study in the gene ontology. DDT: BIOSILICO 2004, 2(6):246–252. 10.1016/S1741-8364(04)02424-2Google Scholar
- Taylor CF, Paton NW, Garwood KL, Kirby PD, Stead DA, Yin Z, Deutsch EW, Selway L, Walker J, Riba-Garcia I, Mohammed S, Deery MJ, Howard JA, Dunkley T, Aebersold R, Kell DB, Lilley KS, Roepstorff P, Yates JR 3rd, Brass A, Brown AJ, Cash P, Gaskell SJ, Hubbard SJ, Oliver SG: A systematic approach to modeling, capturing, and disseminating proteomics experimental data. Nat Biotechnol 2003, 21(3):247–254. 10.1038/nbt0303-247View ArticlePubMedGoogle Scholar
- Spellman PT, Miller M, Stewart J, Troup C, Sarkans U, Chervitz S, Bernhart D, Sherlock G, Ball C, Lepage M, Swiatek M, Marks WL, Goncalves J, Markel S, Iordan D, Shojatalab M, Pizarro A, White J, Hubley R, Deutsch E, Senger M, Aronow BJ, Robinson A, Bassett D, Stoeckert CJ Jr, Brazma A: Design and implementation of microarray gene expression markup language (MAGE-ML). Genome Biol 2002, 3(9):RESEARCH0046. 10.1186/gb-2002-3-9-research0046PubMed CentralView ArticlePubMedGoogle Scholar
- Shegogue D, Zheng WJ: Object-oriented biological system integration: a SARS coronavirus example. Bioinformatics 2005, 21: 2502–9. 10.1093/bioinformatics/bti344View ArticlePubMedGoogle Scholar
- Rodriguez C, Chen F, Weinberg RA, Lodish HF: Cooperative binding of transforming growth factor (TGF)-beta 2 to the types I and II TGF-beta receptors. J Biol Chem 1995, 270(27):15919–15922. 10.1074/jbc.270.27.15919View ArticlePubMedGoogle Scholar
- Brown CB, Boyer AS, Runyan RB, Barnett JV: Requirement of type III TGF-beta receptor for endocardial cell transformation in the heart. Science 1999, 283(5410):2080–2082. 10.1126/science.283.5410.2080View ArticlePubMedGoogle Scholar
- Massague J: TGF-beta signal transduction. Annu Rev Biochem 1998, 67: 753–791. 10.1146/annurev.biochem.67.1.753View ArticlePubMedGoogle Scholar
- Yamashita H, ten Dijke P, Franzen P, Miyazono K, Heldin CH: Formation of hetero-oligomeric complexes of type I and type II receptors for transforming growth factor-beta. J Biol Chem 1994, 269(31):20172–20178.PubMedGoogle Scholar
- Tsukazaki T, Chiang TA, Davison AF, Attisano L, Wrana JL: SARA, a FYVE domain protein that recruits Smad2 to the TGFbeta receptor. Cell 1998, 95(6):779–791. 10.1016/S0092-8674(00)81701-8View ArticlePubMedGoogle Scholar
- Xu L, Chen YG, Massague J: The nuclear import function of Smad2 is masked by SARA and unmasked by TGFbeta-dependent phosphorylation. Nat Cell Biol 2000, 2(8):559–562. 10.1038/35019649View ArticlePubMedGoogle Scholar
- Inman GJ, Hill CS: Stoichiometry of active smad-transcription factor complexes on DNA. J Biol Chem 2002, 277(52):51008–51016. 10.1074/jbc.M208532200View ArticlePubMedGoogle Scholar
- Dennler S, Itoh S, Vivien D, ten Dijke P, Huet S, Gauthier JM: Direct binding of Smad3 and Smad4 to critical TGF beta-inducible elements in the promoter of human plasminogen activator inhibitor-type 1 gene. Embo J 1998, 17(11):3091–3100. 10.1093/emboj/17.11.3091PubMed CentralView ArticlePubMedGoogle Scholar
- Yingling JM, Datto MB, Wong C, Frederick JP, Liberati NT, Wang XF: Tumor suppressor Smad4 is a transforming growth factor beta-inducible DNA binding protein. Mol Cell Biol 1997, 17(12):7019–7028.PubMed CentralView ArticlePubMedGoogle Scholar
- Zawel L, Dai JL, Buckhaults P, Zhou S, Kinzler KW, Vogelstein B, Kern SE: Human Smad3 and Smad4 are sequence-specific transcription activators. Mol Cell 1998, 1(4):611–617. 10.1016/S1097-2765(00)80061-1View ArticlePubMedGoogle Scholar
- Xu L, Kang Y, Col S, Massague J: Smad2 nucleocytoplasmic shuttling by nucleoporins CAN/Nup214 and Nup153 feeds TGFbeta signaling complexes in the cytoplasm and nucleus. Mol Cell 2002, 10(2):271–282. 10.1016/S1097-2765(02)00586-5View ArticlePubMedGoogle Scholar
- Inman GJ, Nicolas FJ, Hill CS: Nucleocytoplasmic shuttling of Smads 2, 3, and 4 permits sensing of TGF-beta receptor activity. Mol Cell 2002, 10(2):283–294. 10.1016/S1097-2765(02)00585-3View ArticlePubMedGoogle Scholar
- Lo RS, Massague J: Ubiquitin-dependent degradation of TGF-beta-activated smad2. Nat Cell Biol 1999, 1(8):472–478. 10.1038/70258View ArticlePubMedGoogle Scholar
- Papin JA, Reed JL, Palsson BO: Hierarchical thinking in network biology: the unbiased modularization of biochemical networks. Trends Biochem Sci 2004, 29(12):641–647. 10.1016/j.tibs.2004.10.001View ArticlePubMedGoogle Scholar
- Bolouri H, Davidson EH: Modeling transcriptional regulatory networks. Bioessays 2002, 24(12):1118–1129. 10.1002/bies.10189View ArticlePubMedGoogle Scholar
- Wolf DM, Arkin AP: Motifs, modules and games in bacteria. Curr Opin Microbiol 2003, 6(2):125–134. 10.1016/S1369-5274(03)00033-XView ArticlePubMedGoogle Scholar
- Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, Cuellar AA, Dronov S, Gilles ED, Ginkel M, Gor V, Goryanin II, Hedley WJ, Hodgman TC, Hofmeyr JH, Hunter PJ, Juty NS, Kasberger JL, Kremling A, Kummer U, Le Novere N, Loew LM, Lucio D, Mendes P, Minch E, Mjolsness ED, Nakayama Y, Nelson MR, Nielsen PF, Sakurada T, Schaff JC, Shapiro BE, Shimizu TS, Spence HD, Stelling J, Takahashi K, Tomita M, Wagner J, Wang J: The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 2003, 19(4):524–531. 10.1093/bioinformatics/btg015View ArticlePubMedGoogle Scholar
- Finney A, Hucka M: Systems biology markup language: Level 2 and beyond. Biochem Soc Trans 2003, 31(Pt 6):1472–1473.View ArticlePubMedGoogle Scholar
- Shegogue D, Zheng WJ: Capturing biological information with class-responsibility-collaboration cards. Bioinformatics 2005, 21: 415. 10.1093/bioinformatics/bti005View ArticlePubMedGoogle Scholar
- Graham I: Basic Concepts. In Object-oriented Methods, Principles & Practice. Third edition. Harlow, England: Addison-Wesley; 2001:1–37.Google Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.