OLS Dialog: An open-source front end to the Ontology Lookup Service
© Barsnes et al; licensee BioMed Central Ltd. 2010
Received: 14 October 2009
Accepted: 17 January 2010
Published: 17 January 2010
With the growing amount of biomedical data available in public databases it has become increasingly important to annotate data in a consistent way in order to allow easy access to this rich source of information. Annotating the data using controlled vocabulary terms and ontologies makes it much easier to compare and analyze data from different sources. However, finding the correct controlled vocabulary terms can sometimes be a difficult task for the end user annotating these data.
In order to facilitate the location of the correct term in the correct controlled vocabulary or ontology, the Ontology Lookup Service was created. However, using the Ontology Lookup Service as a web service is not always feasible, especially for researchers without bioinformatics support. We have therefore created a Java front end to the Ontology Lookup Service, called the OLS Dialog, which can be plugged into any application requiring the annotation of data using controlled vocabulary terms, making it possible to find and use controlled vocabulary terms without requiring any additional knowledge about web services or ontology formats.
As a user-friendly open source front end to the Ontology Lookup Service, the OLS Dialog makes it straightforward to include controlled vocabulary support in third-party tools, which ultimately makes the data even more valuable to the biomedical community.
The amount of biomedical data stored in public databases has grown extensively in the last couple of decades and will most likely continue to increase just as rapidly in the coming years . However, for researchers to make optimal use of this large amount of information, it has to be structured and annotated in such a way that data from different labs, different instruments and even of different types can be compared and analyzed efficiently. Data must therefore be annotated using precisely defined terms agreed upon by all data providers. With this requirement in mind, controlled vocabularies (CV) and ontologies have been created. A CV is defined as a limited list of clearly defined terms, with optional relationships between the terms, while an ontology moves beyond a mere CV by attempting to extensively model a part of the real world .
But even though systems for annotating biomedical data in consistent ways are available, finding and using the correct CV terms to annotate a data set may in some cases be a difficult task. Partly as a response to this the Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) was created [3, 4]. The OLS provides interactive and programmatic interfaces to query, browse and navigate a long list of biomedical ontologies, thus making it easier to find the desired CV terms. However, using the OLS as a web service is not always feasible, especially for researchers without bioinformatics support.
We have therefore created a Java front end to the OLS, called the OLS Dialog http://ols-dialog.googlecode.com, which can be plugged into any application requiring the annotation of data using CV terms, making it straightforward to find and use CV terms without requiring any additional knowledge about web services or ontology formats.
The OLS Dialog has been implemented in Java, is platform independent and requires Java 1.5 (or newer). As the name suggests, the OLS Dialog is implemented as a Java dialog, which depends on a parent frame or dialog. Selected terms are communicated to this parent through the OLSInputable interface, defined in the package no.uib.olsdialog. This interface contains two simple methods that fully represent the interaction of the OLS Dialog with its parent.
Platform independent Java binaries, additional documentation and source code are freely available at http://ols-dialog.googlecode.com. OLS Dialog is released under the permissive Apache2 license http://www.apache.org/licenses/LICENSE-2.0.html allowing for easy reuse of the code and tool in other settings.
Four different CV term search strategies are supported in the OLS Dialog: (i) Term Name Search, locates a CV term by a (partial) match to a search term; (ii) Term ID Search, locates a CV term by its CV term accession number; (iii) PSI-MOD Mass Search, finds the CV term for a modification in the PSI-MOD ontology  using the mass of the modification; and (iv) Browse Ontology, browses the ontology as a tree structure and allows the user to locate the desired term. Furthermore, OLS Dialog also provides a Term Hierarchy Graph view that can be used to locate or verify a CV term by inspecting the term hierarchy. Note that the Term Name Search supports both fuzzy/partial searches ('oxid' locates all partially matching CV terms, e.g., 'oxidation' and 'L-cystine S-oxide') and synonym searches (MOD:00045 can be found by searching for 'pros-phosphohistidine', 'phosphorylation', 'Npi-phosphorylated L-histidine', etc).
The main interface of the OLS Dialog is split into three main parts. At the top, the desired ontology is selected. At the time of writing more than 70 different biomedical ontologies are supported in the OLS, including over 900 000 CV terms. A full list of the supported ontologies can be found at http://www.ebi.ac.uk/ols. These ontologies are constantly updated and maintained by specialists in the various fields [6–8], and new changes will be automatically picked up daily by the OLS. It is important to note that the OLS Dialog does not store the ontologies locally but accesses the OLS web service whenever a search is performed. This ensures that the latest versions of the ontologies are always used.
In addition to searching a specific ontology it is also possible to search in all ontologies at once by selecting 'Search In All Ontologies' at the top of the list. This makes it possible to locate a CV term for which the ontology is unknown. Searching in all ontologies slows down the search however, and is not the recommended standard search option.
Below the ontology selection there are four tabs, one for each search option. Although each tab provides a search-specific interface, the overall structure stays the same. The search parameters are inserted or selected at the top of the tab, and the results of the search, i.e., the matching CV terms, are inserted into the 'Search Results' table. By selecting a CV term in the results table the term's associated details will be presented in the 'Term Details' table. The Browse Ontology tab is slightly different, as it replaces the 'Search Results' table with a tree structure of all terms in the currently selected ontology. It is also possible to view the term hierarchy as a graph by clicking the 'View Term Hierarchy' link at the top of the 'Term Details' text area. When a CV term is selected in the table (or in the tree) clicking the 'Use Selected Term' sends the selected term to the parent frame or dialog.
To display how the OLS Dialog can be used in other projects we have implemented a simple application, OLS_Example, located in the no.uib.olsdialog.example package. To run the example, download and unzip the OLS Dialog and double click the jar file (or run from the command line using 'java-jar ols-dialog-X.Y.Z', where X.Y.Z represents the version number of the software). More details can be found at the OLS Dialog web page: http://ols-dialog.googlecode.com.
The OLS Dialog greatly simplifies the usage of the OLS in end-user tools, without requiring any additional knowledge about web services or ontology formats, making it much easier to annotate data using CV terms. The OLS Dialog has already been in use for quite some time in PRIDE Converter  for annotating mass spectrometry data. We believe that many other tools could also benefit from the usage of the OLS Dialog, and that this could increase the usage of CV terms for annotating data, which ultimately makes these data even more valuable to the biomedical community.
Availability and requirements
Project name: OLS Dialog
Project home page: http://ols-dialog.googlecode.com
Operating system(s): Platform independent
Programming language: Java
Other requirements: Java 1.5 or newer
License: Apache License 2.0 http://www.apache.org/licenses/LICENSE-2.0.html
Any restrictions to use by non-academics: none
List of abbreviations
Ontology Lookup Service.
LM would like to thank Joël Vandekerckhove for his support. HB was supported by a regional grant in the FUGE programme from the Research Council of Norway.
- Vizcaíno JA, Côté R, Reisinger F, Foster JM, Mueller M, Rameseder J, Hermjakob H, Martens L: A guide to the Proteomics Identifications Database proteomics data repository. Proteomics 2009, 9: 4276–4283. 10.1002/pmic.200900402View ArticlePubMedPubMed CentralGoogle Scholar
- Martens L, Palazzi LM, Hermjakob H: Data standards and controlled vocabularies for proteomics. Methods Mol Biol 2008, 484: 279–286. full_textView ArticlePubMedGoogle Scholar
- Côté RG, Jones P, Apweiler R, Hermjakob H: The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics 2006, 7: 97. 10.1186/1471-2105-7-97View ArticlePubMedPubMed CentralGoogle Scholar
- Côté RG, Jones P, Martens L, Apweiler R, Hermjakob H: The Ontology Lookup Service: more data and better tools for controlled vocabulary queries. Nucleic Acids Res 2008, (36 Web Server):W372–6. 10.1093/nar/gkn252
- Montecchi-Palazzi L, Beavis R, Binz PA, RJ C, Cottrell J, Creasy D, Shofstahl J, Seymour SL, Garavelli JS: The PSI-MOD community standard for representation of protein modification data. Nat Biotechnol 2008, 26: 864–866. 10.1038/nbt0808-864View ArticlePubMedGoogle Scholar
- Barrell D, Dimmer E, Huntley RP, Binns D, O'Donovan C, Apweiler R: The GOA database in 2009--an integrated Gene Ontology Annotation resource. Nucleic Acids Res 2009, (37 Database):D396–403. 10.1093/nar/gkn803
- Orchard S, Hermjakob H: The HUPO proteomics standards initiative--easing communication and minimizing data loss in a changing world. Brief Bioinform 2008, 9: 166–173. 10.1093/bib/bbm061View ArticlePubMedGoogle Scholar
- Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, et al.: The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res 2008, (36 Database):D1009–14.
- Barsnes H, Vizcaíno JA, Eidhammer I, Martens L: PRIDE Converter: making proteomics data-sharing easy. Nat Biotechnol 2009, 27: 598–599. 10.1038/nbt0709-598View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.