- Methodology article
- Open Access
AIGO: Towards a unified framework for the Analysis and the Inter-comparison of GO functional annotations
© Defoin-Platel et al; licensee BioMed Central Ltd. 2011
- Received: 20 May 2011
- Accepted: 3 November 2011
- Published: 3 November 2011
In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall.
In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats.
This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.
- Gene Ontology
- Functional Annotation
- Semantic Similarity
- Annotation Pipeline
- Evidence Code
Given the advent of high-throughput sequencing technologies and the resulting data explosion, there is an urgent requirement to provide electronically inferred functional annotations (FAs) for proteins whose biological functions have not yet been determined by experimental investigation. This has given rise to a large number of annotation inference pipelines [1–6] which are used routinely to produce new annotations that are stored in a variety of publicly accessible databases. Therefore, when looking for FAs related to a particular organism, it is commonplace to find several resources each providing annotations that have been predicted using different pipelines, at different times and under different conditions. Consequently, end-users of FAs have to decide between multiple resources. However, since there is no single accessible and objective way to compare these FAs, these decisions are in many cases made arbitrarily, whereas different biological problems may require consideration of annotations from different points of view.
Where pipeline outputs reference a controlled vocabulary or ontology, relative comparisons of these outputs can be made by using a reference protein-set as a common input. Many pipelines endeavour to annotate gene products to the three aspects of the Gene Ontology (GO): molecular function (MF), biological process (BP), and cellular component (CC). For each GO aspect, a given gene product can therefore be annotated with a set of GO terms. GO provides not only a unified and shared controlled vocabulary for annotation but a structured representation of knowledge. The power of this annotation system over a flat controlled vocabulary is twofold. Firstly, it can be used to evaluate the functional distance between genes  by quantifying the difference between their annotation sets based on the semantics codified in the structure. Secondly, it defines a hierarchy between the GO terms that leads to annotations being more or less specific, i.e. more or less biologically informative.
The aim of this study is to define and implement an ensemble of measures that provide both a quantitative and a qualitative description of GO FAs and also allows a comprehensive inter-comparison between them. So far, a plethora of such measures have been proposed but no attempts have yet been made to define and implement a unified framework where FAs can be compared in a systematic way. A notable exception is the Blast2GO suite  where several metrics describing FAs are implemented together, for example to measure the specificity of annotations or to automatically compare groups of GO terms. However in Blast2GO, these metrics are only used to measure specific aspects of Blast2GO performance and furthermore, the Blast2GO suite is not designed to compare FAs issued by other pipelines. In most studies, the different properties of FAs are implemented and studied independently e.g. [9, 10]. In addition to facilitating the comparison of annotation sets produced by various annotation pipelines, having a common unified framework that implements these metrics would assist in the evaluation of new metrics, the exploration of the influence of parameters in annotation pipelines and the monitoring of annotations over time.
In this paper, we propose a set of twelve metrics and implement them in a Python open source library named AIGO for the Analysis and the Inter-comparison of GO functional annotations. We demonstrate the use of the AIGO library for evaluating and comparing the output of annotation pipelines using three illustrative examples.
Features of functional annotations described by AIGO metrics
Feature of a functional annotation
Proportion of genes annotated by a FA
Proportion of GO terms used by a FA
Nb. of annotations
Number of annotations of a given gene
Similarity of the terms in a set of annotations
Similarity of the annotation sets in a FA
Specificity of the annotations for a given gene
Informativeness of the annotations for a given gene
Proportion of redundant terms in a set of annotations
Proportion of obsolete terms in a FA
Degree of functional relatedness between two annotation sets
Proportion of the annotations in a given FA being also found in a gold-standard
Proportion of the annotations in the gold-standard being also found in a given FA
Definitions of metrics
For any given organism Ω, we call functional annotation, noted ℱ, the mapping between a set of gene products GΩ from this organism and some functional data. In practice, a set of gene products from an organism can be defined in various ways, ranging from a comprehensive list of proteins (if this organism has been well studied) to a list of ESTs assembled into potentially incomplete or inaccurate UniGenes (if the organism has not yet been sequenced), see . A common scenario is that the set of gene products corresponds to some defined reference set: for example the list of target sequences from a microarray. The functional data are usually provided as annotation sets containing terms from controlled vocabularies or reference ontologies. We note a given ontology, defined as a direct acyclic graph connecting a set of terms . In this study, can be any of the three aspects of GO (i.e. BP, MF and CC). More formally, a functional annotation of a set of gene products GΩ over a given ontology , is a map that associates with each gene product g ∈ GΩ a subset of .
A. Quantity of annotations
B. Diversity of annotations
C. Informativeness of annotations
D. Inter-comparison of functional annotations
It is possible to obtain a general comparison of several FAs by contrasting the values they report for the nine metrics described above. However, a deeper examination is sometimes required, for example, to know how many gene products are commonly annotated by two or more FAs, what these gene products are, and how different/similar their annotation sets are within the FAs in question. In this section, we describe three metrics for the inter-comparison of FAs.
The AIGO library will generate and display Venn diagrams to compare the coverage of FAs. When gene products are annotated by at least two FAs, AIGO can compute the semantic similarity between their sets of annotations. This functionality is particularly helpful for identifying highly dissimilar annotation sets and therefore has the potential to detect errors in FAs. Numerous metrics have been proposed to evaluate the semantic similarity between annotation sets. In the AIGO library, three semantic similarity metrics are implemented: (i) the Resnik metric based on the most informative common ancestors of GO terms , (ii) a graph-based metric known as the Czekanowski-Dice similarity (c.f.  to see how this can be used in the context of the Gene Ontology) and (iii) another graph-based metric called GS2  which behaves similarly to the measure developed by Wang et al. in  but is more computationally efficient since it can measure the similarity of a set of genes in linear time in the size of the set. Because of its greater efficiency, the GS2 metric has been used for all the semantic similarity calculations in this study.
Implementation: the AIGO library
The different metrics presented in the previous section have been implemented in a Python library, named AIGO, for the Analysis and the Inter-comparison of GO functional annotations. AIGO is an open source library distributed under the GNU General Public License v3 and source code, documentation and illustrative examples can be found here: http://code.google.com/p/aigo. AIGO is an object oriented Python library implementing a collection of classes to load, manipulate, analyse and inter-compare FAs. The library can read annotation files in various formats such as "GO Annotation File 2.0", Affymetrix annotation files (TAF format) and generic mapping files. Additionally, AIGO can read GO files in xml OBO format and gene set reference files in different formats, including the FASTA format. AIGO can automatically create graphics showing the results of the comparison of multiple FAs, including the distribution of the metrics and draw annotated directed acyclic graphs of GO. All the results can also be exported into simple text files or into Microsoft Excel files. To assess the statistical significance of the results, AIGO also provides classes to perform permutation tests by randomizing existing FAs or by sampling FAs directly from GO.
Most of the operations that AIGO performs are not computationally intensive and therefore do not require much CPU time or memory. For example, the first case study presented in the Results section took exactly 2 minutes to run on a dual core CPU (2.8 Ghz). In practice, the computation of most of the statistics is very fast, with the exception of the those involving semantic similarity. The GS2 metric is very scalable, since it is computable in linear time in the size of the set of gene products, however computing the Coherence and the Compactness statistics can still take several minutes. Another rather demanding operation is plotting the direct acyclic graph induced by a set of annotations: more precisely, the layout of the nodes can take one hour in the worst case.
Example 1: Inter-comparison of annotation pipelines
Properties of Bovine functional annotations
GO Biol ogical Process
Avg. nb annot.
GO Molecular Function
Avg. nb annot.
GO Cellu lar Component
Avg. nb annot.
All three aspects of GO
Avg. nb annot.
Analysis of Functional Annotations
AID has the greatest breadth of coverage, with a majority of target sequences (60.4%) having at least one annotation from at least one of the three aspects of GO, whereas B2G systematically reports the lowest coverage results. Unexpectedly, despite this smaller coverage, B2G contains more diverse GO terms than the other FAs, with richness (i.e. the proportion of GO terms assigned to at least one sequence) almost equal to 25% (all aspects of GO). The compactness statistic (which is a measure of the diversity of the annotation sets) is similar across the three FAs; this indicates that their annotation sets are distributed in a similar way across the ontology.
Being able to easily identify and display the annotation sets, where the sizes and/or coherence statistics highlight the presence of extreme values, is a useful feature of AIGO that should help developers of functional annotation pipelines and/or end-users of FAs to detect the presence of abnormalities.
The last three metrics reported in Table 2 are related to the informativeness of the annotations. Both the specificity and the information content reveal that B2G annotations sets are more informative on average than the two other sets of FAs. We also note the large quantity of redundant annotations present in AFFY and AID annotations sets. For these two FAs, almost 30% of the annotations could potentially be removed from the annotation sets, since they correspond to GO terms which are more generic than the existing terms already present in the same sets. It is however important to note that the GO terms contained in these annotation sets might have been assigned using different techniques with differing levels of confidence. Therefore, when an annotation to a GO term has a stronger confidence than another annotation to a more specific term, it would still make sense to keep both of them together in an annotation set. This may explain the high level of redundancy observed in AFFY and AID.
Inter-comparison of Functional Annotations
Example 2: Comparison of evidence codes
The gene association files, which are made available for multiple species by the Gene Ontology project , provide mappings of genes to GO terms. For each of these mappings, an evidence code indicates the source from which the association of a given gene with a particular GO term was derived . Only one of these evidence codes, namely "Inferred from Electronic Annotation", identifies annotations that have been made with no manual intervention from a human curator. In this section we use AIGO to compare these electronic annotations to annotations assigned by a human curator, corresponding to the following evidence codes: "Traceable Author Statement" (TAS), "Non-traceable Author Statement" (NAS), "Inferred from sequence or structural similarity" (ISS) and "Inferred by Curator" (IC).
Hierarchical Precision of Evidence Codes
Hierarchical Recall of Evidence Codes
Two important factors influencing these statistics are: for hRec, the number of EXP2 annotations to be retrieved and, for hPrec, the number of annotations assigned by AHC and IEA. AIGO reveals that the average number of annotations per gene product varies with the GO aspect for the three FAs with: 1.1, 1.15 and 1.2 annotations on average for EXP2 (for respectively CC, MF and BP), 1.67, 1.84 and 4.99 annotations for AHC and 3.56, 4.88 and 6.05 annotations for IEA. The variation reported in the number of annotations for EXP2 explains the trend observed in hRec which tends to be smaller for BP than MF and smaller for MF than CC, whereas the variations in AHC and IEA explain the trend for hPrec which tends to be smaller for BP than MF and smaller for MF than CC. The relatively low hPrec value reported for BP by AHC in Table 3 corresponds to an outstandingly high number of annotations predicted for this category: nearly five annotations per gene product on average, which is more than twice the numbers reported for MF and CC. We note that even though we are confident that EXP2 contains only correct annotations, it is improbable that all the annotations that could be assigned to the gene products are present in EXP2. Hence, the results reported for the hPrec metric should be interpreted with care, as incompleteness in the gold-standard may introduce bias into the comparison between FAs.
As for the hRec metric, the overall results for AHC and IEA are almost identical, except for MF where IEA performs better. To explore this important difference between IEA and AHC, we looked at the various statistics produced by AIGO for MF. While the specificity (mean number of ancestors over all annotated terms) is comparable for AHC and IEA (6.89 and 6.65 respectively), it is apparent that fewer GO terms are used in AHC, compared to IEA and EXP2 (richness is 2.82%, 5.26% and 7.97% respectively). The distribution of hRec values for AHC displays a peak at 0.2, which is not present for IEA (data not shown). We have manually inspected these cases and found that generally the annotations supported by AHC are also supported by IEA, and are therefore potentially correct. They also tend to correspond to terms from different parts of GO than EXP2 as a whole. We do not further explore this observation in this work but these results reflect the fact that certain parts of the ontology are favoured during the curation process. Indeed, manual associations between GO terms and gene products are often produced as part of manual annotation projects. These curation initiatives tend to have very specific targets such as the annotation of genes related to precise biological functions  or to particular diseases .
Example 3: Monitoring the evolution of functional annotations
The AIGO library can also be used to monitor the evolution of FAs across time. As an illustration, AIGO was used to study 11 FAs for a rice genome microarray (see Additional file 1 - Reference of Affymetrix GeneChip genome arrays) released by Affymetrix  between 31 May 2007 and 8 November 2010, and numbered from 20 to 31. The set of reference gene products correspond to the 57,381 target sequences of the rice array.
Biologists and bioinformaticians, as end-users of functional annotations (FAs), are generally confronted with a choice between several available FAs for the organisms they are studying. Often the most important thing to consider when deciding between alternative FAs is to look at the accuracy of the annotations: i.e. to what extent do the annotations of some proteins really correspond to what has been experimentally established about the function of these proteins? In practice this evaluation is very challenging, given that for the vast majority of the organisms there is no existing gold-standard to evaluate precision and recall against. Furthermore, even in the rare cases where an organism has been extensively studied and the functions of most of its proteins experimentally assessed and annotated in public databases, there is no guarantee that all the biological roles and functions of these proteins are known and translated into functional annotations. This situation is problematic when evaluating the precision of FAs, but is not unique to the study of protein function since the lack of negative gold-standards and its impact on the evaluation of automatic inference methods has been reported in other domains of bioinformatics .
An alternative approach to decide between FAs would be to directly compare the functional annotation pipelines that produced these FAs. However, these pipelines differ widely in their inference algorithms, confidence thresholds, and data sources for reasoning and this makes the comparison of the relative merits of each approach extremely complex.
The more pragmatic approach developed in this paper is to analyse and inter-compare the products of these functional annotation pipelines, that is the FAs themselves. FAs are complex multi-dimensional objects that cannot be summarized in a straightforward way. Hence, we defined the AIGO framework to describe various features of FAs that we consider being important for FAs end-users and more generally for the community of biologists and bioinformaticians interested in gene or protein functions. So far a set of twelve metrics, nine for the analysis and three for the inter-comparison of FAs, has been defined and implemented in AIGO. Each metric covers only one particular aspect of the FAs but, as demonstrated in this paper, when computed all together and interpreted in the AIGO framework, these metrics provide a global view of the FAs and help to make informed decisions concerning them. Alternatively, after having inspected a set of FAs in AIGO, one might want to combine them instead of retaining only one. Therefore, the library also provides functionalities to compute the union or the intersection of FAs. The union of FAs is, for example, particularly meaningful in the case where each FA covers a different aspect of GO, e.g. to combine a FA containing only GO Molecular Function terms converted from Enzyme Commission numbers with a FA containing only GO Cellular Component terms converted from subcellular location predictions. Conversely, the intersection of FAs is more appropriate to increase the reliability of functional annotations. For example, when an ensemble of FAs provides annotations about the same organism but using different types of evidence (e.g., sequence or structural comparisons, analysis of gene coexpression patterns, etc.) selecting the associations between gene products and GO terms that are present in all these FAs is a straightforward way to gain confidence in the annotations.
As well as FA end-users, we believe that AIGO can be beneficial to other categories of members of the scientific community. An important category of members corresponds to FA producers, i.e. people developing new functional annotation pipelines, who might want to compare their pipeline outputs to existing FAs, or measure the influence of a given pipeline parameter on their results. Another category is the providers of FAs who are responsible for releasing FAs to the community, and who need periodically to monitor the content of automatically produced FAs, for example, to detect spurious annotations or abnormally large annotation sets. A further category corresponds to researchers working on new metrics to describe FAs. The AIGO framework has been designed to be flexible so that new metrics on GO functional annotations can be easily added. By making this project open source and collaborative, we would like to encourage researchers to implement new metrics in AIGO and contribute to its development.
The work presented in this paper is a first step towards the development of a unified framework for the analysis and the inter-comparison of GO functional annotations.
The utility of the framework is demonstrated on three case studies. In the first, publicly available functional annotations are compared, and their differences highlighted, for example, in terms of the number of gene products annotated, or the number and specificity of the GO terms employed. In the second case study, the quality of two functional annotations, one obtained by computational methods and one corresponding to a manual curation process, is assessed using an approximated gold-standard. In the last example, we show how AIGO framework can be used to monitor the evolution of functional annotations by comparing different releases over time, in order to detect major variations, or to identify potentially incorrect annotations.
The authors gratefully acknowledge funding from the UK Biotechnology and Biological Sciences Research Council (BBSRC). AL was supported by PhD studentship BBS/S/E/2006/13205. MS and CJR were partially supported by the Ondex BBSRC SABR Grant BB/F006039/1. MMH gratefully acknowledges support from the Lawes Agricultural Trust. Rothamsted Research receives grant in aid from the BBSRC which supported MD-P, MS, CJR, SJP and DZH.
- Schmid R, Blaxter ML: annot8r: GO, EC and KEGG annotation of EST datasets. BMC Bioinformatics 2008, 9: 180. 10.1186/1471-2105-9-180PubMed CentralView ArticlePubMedGoogle Scholar
- Huang DW, Sherman BT, Tan Q, Collins JR, Alvord WG, Roayaei J, Stephens R, Baseler MW, Lane HC, Lempicki RA: The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biology 2007, 8(9):R183. 10.1186/gb-2007-8-9-r183PubMed CentralView ArticlePubMedGoogle Scholar
- Koski LB, Gray MW, Lang BF, Burger G: AutoFACT: An (Auto)matic (F)unctional (A)nnotation and (C)lassification (T)ool. BMC Bioinformatics 2005, 6: 151. 10.1186/1471-2105-6-151PubMed CentralView ArticlePubMedGoogle Scholar
- Conesa A, Götz S, Garcia-Gomez JM, Terol J, Talon M, Robles M: Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 2005, 21(18):3674–3676. 10.1093/bioinformatics/bti610View ArticlePubMedGoogle Scholar
- Martin DMA, Berriman M, Barton GJ: GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes. BMC Bioinformatics 2004, 5: 178. 10.1186/1471-2105-5-178PubMed CentralView ArticlePubMedGoogle Scholar
- Liu GY, Loraine AE, Shigeta R, Cline M, Cheng J, Valmeekam V, Sun S, Kulp D, Siani-Rose MA: NetAffx: Affymetrix probesets and annotations. Nucleic Acids Research 2003, 31(1):82–86. 10.1093/nar/gkg121PubMed CentralView ArticlePubMedGoogle Scholar
- The Gene Ontology Consortium: Creating the gene ontology resource: design and implementation. Genome Research 2001, 11(8):1425–1433. 10.1101/gr.180801View ArticleGoogle Scholar
- Conesa A, Götz S: Blast2GO: A Comprehensive Suite for Functional Analysis in Plant Genomics. International journal of plant genomics 2008, 2008: 619832–619832.PubMed CentralView ArticlePubMedGoogle Scholar
- Chagoyen M, Carazo JM, Pascual-Montano A: Assessment of protein set coherence using functional annotations. BMC Bioinformatics 2008, 9: 444. 10.1186/1471-2105-9-444PubMed CentralView ArticlePubMedGoogle Scholar
- Buza TJ, McCarthy FM, Wang N, Bridges SM, Burgess SC: Gene Ontology annotation quality analysis in model eukaryotes. Nucleic Acids Research 2008, 36(2):e12.PubMed CentralView ArticlePubMedGoogle Scholar
- Wheeler DL, Church DM, Federhen S, Lash AE, Madden TL, Pontius JU, Schuler GD, Schriml LM, Sequeira E, Tatusova TA, Wagner L: Database resources of the National Center for Biotechnology. Nucleic Acids Research 2003, 31(1):28–33. 10.1093/nar/gkg033PubMed CentralView ArticlePubMedGoogle Scholar
- Guo X, Liu RX, Shriver CD, Hu H, Liebman MN: Assessing semantic similarity measures for the characterization of human regulatory pathways. Bioinformatics 2006, 22(8):967–973. 10.1093/bioinformatics/btl042View ArticlePubMedGoogle Scholar
- Pesquita C, Faria D, Bastos H, Ferreira AE, Falcao AO, Couto FM: Metrics for GO based protein semantic similarity: a systematic evaluation. BMC Bioinformatics 2008, 9(Suppl 5):S4. 10.1186/1471-2105-9-S5-S4PubMed CentralView ArticlePubMedGoogle Scholar
- Gene Ontology Consortium: The Gene Ontology's Reference Genome Project: a unified framework for functional annotation across species. PLoS Computational Biology 2009, 5(7):e1000431. 10.1371/journal.pcbi.1000431View ArticleGoogle Scholar
- Shannon CE: A Mathematical Theory of Communication. Bell System Technical Journal 1948, 27(3):379–423.View ArticleGoogle Scholar
- Resnik P: Using information content to evaluate semantic similarity in a taxonomy. Proceedings of the 14th International Joint Conference on Artificial Intelligence 1995, 1: 448–453.Google Scholar
- Brun C, Chevenet F, Martin D, Wojcik J, Guenoche A, Jacq B: Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network. Genome Biology 2003, 5(1):R6. 10.1186/gb-2003-5-1-r6PubMed CentralView ArticlePubMedGoogle Scholar
- Ruths T, Ruths D, Nakhleh L: GS2: an efficiently computable measure of GO-based similarity of gene sets. Bioinformatics (Oxford, England) 2009, 25(9):1178–1184. 10.1093/bioinformatics/btp128View ArticleGoogle Scholar
- Wang JZ, Du Z, Payattakool R, Yu PS, Chen CF: A new method to measure the semantic similarity of GO terms. Bioinformatics (Oxford, England) 2007, 23(10):1274–1281. 10.1093/bioinformatics/btm087View ArticleGoogle Scholar
- Verspoor K, Cohn J, Mniszewski S, Joslyn C: A categorization approach to automated ontological function annotation. Protein Science 2006, 15(6):1544–1549. 10.1110/ps.062184006PubMed CentralView ArticlePubMedGoogle Scholar
- Götz S, Arnold R, Sebastián-León P, Martín-Rodríguez S, Tischler P, Jehl MA, Dopazo J, Rattei T, Conesa A: B2G-FAR, a species centered GO annotation repository. Bioinformatics 2011, 27(7):919–924. 10.1093/bioinformatics/btr059PubMed CentralView ArticlePubMedGoogle Scholar
- van den Berg BH, Konieczka JH, McCarthy FM, Burgess SC: ArrayIDer: automated structural re-annotation pipeline for DNA microarrays. BMC Bioinformatics 2009, 10: 30. 10.1186/1471-2105-10-30PubMed CentralView ArticlePubMedGoogle Scholar
- Guide to GO Evidence Codes[http://www.geneontology.org/GO.evidence.shtml]
- Pal D, Eisenberg D: Inference of Protein Function from Protein Structure. Structure 2005, 13(1):121–130. 10.1016/j.str.2004.10.015View ArticlePubMedGoogle Scholar
- Verspoor K, Cohn J, Mniszewski S, Joslyn C: A categorization approach to automated ontological function annotation. Protein Sci 2006, 15(6):1544–1549. 10.1110/ps.062184006PubMed CentralView ArticlePubMedGoogle Scholar
- Alam-Faruque Y, Dimmer EC, Huntley RP, O'Donovan C, Scambler P, Apweiler R: The Renal Gene Ontology Annotation Initiative. Organogenesis 2010, 6(2):71–75. 10.4161/org.6.2.11294PubMed CentralView ArticlePubMedGoogle Scholar
- Khodiyar VK, Hill DP, Howe D, Berardini TZ, Tweedie S, Talmud PJ, Breckenridge R, Bhattarcharya S, Riley P, Scambler P, Lovering RC: The representation of heart development in the gene ontology. Developmental Biology 2011, 354(1):9–17. 10.1016/j.ydbio.2011.03.011PubMed CentralView ArticlePubMedGoogle Scholar
- Jansen R, Gerstein M: Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction. Current opinion in microbiology 2004, 7(5):535–545. 10.1016/j.mib.2004.08.012View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.