Eval: A software package for analysis of genome annotations
© Keibler and Brent; licensee BioMed Central Ltd. 2003
Received: 18 July 2003
Accepted: 17 October 2003
Published: 17 October 2003
Eval is a flexible tool for analyzing the performance of gene annotation systems. It provides summaries and graphical distributions for many descriptive statistics about any set of annotations, regardless of their source. It also compares sets of predictions to standard annotations and to one another. Input is in the standard Gene Transfer Format (GTF). Eval can be run interactively or via the command line, in which case output options include easily parsable tab-delimited files.
To obtain the module package with documentation, go to http://genes.cse.wustl.edu/ and follow links for Resources, then Software. Please contact firstname.lastname@example.org
Automated gene annotation systems are typically based on large, complex probability models with thousands of parameters. Changing these parameters can change a system's performance as measured by the accuracy with which it reproduces the exons and gene structures in a standard annotation. While traditional sensitivity and specificity measures convey the accuracy of gene predictions [1, 2], more information is often required for gaining insight into why a system is performing well or poorly. A deep analysis requires considering many features of a prediction set and its relation to the standard set, such as the distribution of number of exons per gene, the distribution of predicted exon lengths, and accuracy as a function of GC percentage. Such statistics can reveal which parameter sets are working well and which need tuning. We are not aware of any publicly available software systems that have this functionality. We therefore developed the Eval system to support detailed analysis and comparison of the large data sets generated by automated gene annotation systems [e.g., ].
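As a minimal sketch of the traditional measures mentioned above, the following Python computes exact-match exon sensitivity and specificity. It is an illustration, not Eval's own code (Eval is written in Perl). It assumes the convention common in the gene-prediction literature, where specificity is the fraction of predicted exons that are correct. The function name and the tuple layout of an exon are hypothetical.

```python
def exon_accuracy(predicted, standard):
    """Return (sensitivity, specificity) for exact exon matches.

    Each exon is a hypothetical (seqname, strand, start, end) tuple.
    Sensitivity = matched exons / exons in the standard annotation.
    Specificity = matched exons / predicted exons (gene-prediction convention).
    """
    pred = set(predicted)
    std = set(standard)
    matched = len(pred & std)
    sensitivity = matched / len(std) if std else 0.0
    specificity = matched / len(pred) if pred else 0.0
    return sensitivity, specificity


# Toy data: two of the three predicted exons match the standard exactly.
standard = [
    ("chr1", "+", 100, 200),
    ("chr1", "+", 300, 400),
    ("chr1", "+", 500, 600),
    ("chr1", "+", 700, 800),
]
predicted = [
    ("chr1", "+", 100, 200),
    ("chr1", "+", 300, 410),  # wrong end coordinate, so no exact match
    ("chr1", "+", 500, 600),
]
sn, sp = exon_accuracy(predicted, standard)
print(f"exon sensitivity={sn:.2f} specificity={sp:.2f}")
```

A single pair of numbers like this says nothing about where the errors fall, which is the gap the more detailed statistics are meant to fill.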
A sampling of the less common statistics calculated by Eval when comparing the output of TWINSCAN and GENSCAN on the "semi-artificial" gene set used in  to the gold standard annotation. Standard statistics such as gene and exon sensitivity and specificity are also calculated but are not shown.
Exons Per Transcript
CDS Overlap Specificity
CDS Overlap Sensitivity
All Introns Matched Specificity
All Introns Matched Sensitivity
Start and Stop Codon Specificity
Start and Stop Codon Sensitivity
5' Splice Specificity
5' Splice Sensitivity
80% Overlap Specificity
80% Overlap Sensitivity
Multi-way comparisons (Venn diagrams)
The results of building a Venn diagram based on exact exon matches among the aligned RefSeqs, TWINSCAN 1.2 predictions, and GENSCAN predictions, on the NCBI34 build of the human genome. All exons are first combined into clusters that have the same begin and end points. These clusters are then partitioned into the subset of exons annotated only by RefSeq (R), the subset annotated only by TWINSCAN (T), the subset annotated only by GENSCAN (G), the subset annotated by RefSeq and TWINSCAN but not GENSCAN (RT), etc. For each of these subsets, the table shows the number of clusters in the subset. It also shows the percentage of all exons from each of the input sets that is included in that subset. The last column shows the percentage of all clusters included in that subset.
Subset in partition
% of RefSeq exons
% of Twinscan exons
% of Genscan exons
% of all clusters
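The partitioning described in the caption above can be sketched as follows. This is a hedged Python illustration under simplifying assumptions, not Eval's implementation: an exon is a hypothetical (seqname, start, end) tuple, a cluster is the set of exons sharing that tuple, and the function names are invented for the example.

```python
from collections import defaultdict


def venn_partition(annotation_sets):
    """Group exon clusters by the exact combination of sets that annotate them.

    annotation_sets maps a one-letter label to an iterable of exon tuples.
    Returns a dict mapping frozenset(labels) to the set of clusters.
    """
    membership = defaultdict(set)
    for label, exons in annotation_sets.items():
        for exon in exons:
            membership[exon].add(label)
    partition = defaultdict(set)
    for exon, labels in membership.items():
        partition[frozenset(labels)].add(exon)
    return dict(partition)


def summarize(annotation_sets, partition):
    """Build one table row per subset: cluster count and percentages."""
    total_clusters = sum(len(clusters) for clusters in partition.values())
    rows = {}
    for labels, clusters in partition.items():
        name = "".join(sorted(labels))
        row = {
            "clusters": len(clusters),
            "pct_all": 100 * len(clusters) / total_clusters,
        }
        for label, exons in annotation_sets.items():
            n_exons = len(set(exons))
            # Share of this input set's exons that falls in the subset.
            in_subset = len(clusters) if label in labels else 0
            row["pct_" + label] = 100 * in_subset / n_exons
        rows[name] = row
    return rows


# Toy data with labels R (RefSeq), T (TWINSCAN), G (GENSCAN).
sets = {
    "R": [("chr1", 100, 200), ("chr1", 300, 400), ("chr1", 500, 600)],
    "T": [("chr1", 100, 200), ("chr1", 300, 400), ("chr1", 700, 800)],
    "G": [("chr1", 100, 200), ("chr1", 900, 950)],
}
partition = venn_partition(sets)
rows = summarize(sets, partition)
for name in sorted(rows):
    print(name, rows[name])
```

Subset names are built by sorting the labels alphabetically, so the three-way intersection appears as GRT in this sketch.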
Extraction of subsets
Eval can also extract subsets of genes that meet specific criteria for further analysis. Sets of genes that match another gene set by any of the following criteria can be selected: exact match, genomic overlap, CDS overlap, all introns match, one or more introns match, one or more exons match, start codon match, stop codon match, start and stop codon match. Boolean combinations of these criteria can also be specified. For example, the set of RefSeq genes that are predicted correctly by System1 but not by System2 can be extracted from annotations of the entire human genome with just a few commands. Once extracted, gene sets can be inspected individually using standard visualization tools.
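The System1-but-not-System2 example above amounts to a Boolean combination of match predicates. The sketch below illustrates the idea in Python. It does not reproduce Eval's commands or Perl libraries: the gene representation, the function names, and the restriction to a single sequence and strand are all assumptions made for the example.

```python
def exact_match(gene, other):
    """True if both genes have identical exon lists (hypothetical criterion)."""
    return gene["exons"] == other["exons"]


def any_exon_match(gene, other):
    """True if the genes share at least one exon with identical coordinates."""
    return bool(set(gene["exons"]) & set(other["exons"]))


def matched_by(gene, gene_set, criterion):
    """True if any gene in gene_set matches gene under the given criterion."""
    return any(criterion(gene, other) for other in gene_set)


def select(genes, predicate):
    """Return the genes for which predicate is true."""
    return [gene for gene in genes if predicate(gene)]


# Toy data: exons are (start, end) pairs on one sequence and strand.
refseq = [
    {"id": "A", "exons": [(100, 200), (300, 400)]},
    {"id": "B", "exons": [(1000, 1100), (1300, 1500)]},
    {"id": "C", "exons": [(5000, 5200)]},
]
system1 = [
    {"id": "s1.1", "exons": [(100, 200), (300, 400)]},
    {"id": "s1.2", "exons": [(1000, 1100), (1300, 1500)]},
]
system2 = [
    {"id": "s2.1", "exons": [(100, 200), (300, 400)]},
    {"id": "s2.2", "exons": [(1000, 1100), (1300, 1490)]},
]

# RefSeq genes predicted exactly by System1 AND NOT exactly by System2.
only_system1 = select(
    refseq,
    lambda g: matched_by(g, system1, exact_match)
    and not matched_by(g, system2, exact_match),
)
print([g["id"] for g in only_system1])
```

Because each criterion is an ordinary function, other combinations (for example, one or more exons match but not an exact match) follow the same pattern.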
Eval is written in Perl and uses the Tk Perl module to display its graphical user interface. It is intended to run on Linux-based systems, although it also runs under Windows. It requires the gnuplot utility to display the graphs it produces, but it can create the graphs as text files without this utility. The package comes with both a command-line interface and a graphical interface. The command-line interface provides quick access to the functions, while the graphical interface provides easier, more efficient access when running multiple analyses on the same data sets.
Annotations are submitted to Eval in GTF file format (http://genes.cse.wustl.edu/GTF2.html), a community standard developed in the course of several collaborative genome annotation projects [6, 7]. Eval can therefore be run on the output of any annotation system that produces GTF. The Eval package contains a GTF validator, which verifies correct GTF file format and identifies common syntactic and semantic errors in annotation files. It also contains Perl libraries for parsing, storing, accessing, and modifying GTF files and for comparing sets of GTF files.
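To make the input format concrete, here is a small Python sketch that parses one GTF line and applies a few basic checks. It is not the Eval validator or its Perl libraries. It assumes the usual nine tab-separated GTF fields with gene_id and transcript_id attributes, and the function name and the choice of checks are hypothetical.

```python
import re

# Attributes look like: gene_id "001"; transcript_id "001.1";
_ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gtf_line(line):
    """Parse one GTF feature line into a dict, raising ValueError if malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attrs = fields
    start, end = int(start), int(end)  # int() raises ValueError on bad input
    if start > end:
        raise ValueError("start coordinate is greater than end coordinate")
    if strand not in {"+", "-", "."}:
        raise ValueError(f"unexpected strand value: {strand!r}")
    attributes = dict(_ATTRIBUTE.findall(attrs))
    for required in ("gene_id", "transcript_id"):
        if required not in attributes:
            raise ValueError(f"missing required attribute: {required}")
    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": start,
        "end": end,
        "score": score,
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


# Example line with made-up values.
line = 'chr1\tTwinscan\tCDS\t380\t401\t.\t+\t0\tgene_id "001"; transcript_id "001.1";'
record = parse_gtf_line(line)
print(record["feature"], record["start"], record["end"], record["attributes"])
```

A real validator would also check semantic constraints across lines, such as consistent strands within a transcript, which this single-line sketch does not attempt.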
Although it is written in Perl, the Eval system runs relatively quickly. A standard Eval report comparing all TWINSCAN [3, 4] genes predicted on the human genome to the aligned human RefSeqs processes ~40,000 transcripts and ~300,000 exons and completes in under five minutes on a machine with a 1.5 GHz Athlon processor and 2 GB of RAM.
This work was supported in part by grant DBI-0091270 from the National Science Foundation to MRB and grant HG02278 from the National Institutes of Health to MRB.
- Guigó R, Agarwal P, Abril JF, Burset M, Fickett JW: An assessment of gene prediction accuracy in large DNA sequences. Genome Res 2000, 10: 1631–1642. 10.1101/gr.122800
- Burset M, Guigó R: Evaluation of gene structure prediction programs. Genomics 1996, 34: 353–367. 10.1006/geno.1996.0298
- Flicek P, Keibler E, Hu P, Korf I, Brent MR: Leveraging the mouse genome for gene prediction in human: From whole-genome shotgun reads to a global synteny map. Genome Res 2003, 13: 46–54. 10.1101/gr.830003
- Korf I, Flicek P, Duan D, Brent MR: Integrating genomic homology into gene structure prediction. Bioinformatics 2001, 17 Suppl 1: S140–8.
- Burge C, Karlin S: Prediction of complete gene structures in human genomic DNA. J Mol Biol 1997, 268: 78–94. 10.1006/jmbi.1997.0951
- Reese MG, Hartzell G, Harris NL, Ohler U, Abril JF, Lewis SE: Genome annotation assessment in Drosophila melanogaster. Genome Res 2000, 10: 483–501. 10.1101/gr.10.4.483
- Mouse Genome Sequencing Consortium: Initial sequencing and comparative analysis of the mouse genome. Nature 2002, 420: 520–562. 10.1038/nature01262
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.