Features
The BioResource Navigator is the central component of Bioclipse. It presents BioResources in a hierarchical, tree-like View, similar to browsing local folders and files (Figure 3). It offers basic features such as cut and paste, drag and drop, and wizards for new resources, and provides an extension point where plugins can register actions that appear in the context menu on right-click. A basic text editor and an XML editor with global actions such as undo, redo, cut, and paste are also provided. Bioclipse further contains a Properties View that displays the properties of the objects currently selected in the workbench, a console that echoes messages back to the user, and a job scheduler that allows time-consuming tasks to run in the background, with results displayed upon completion (Figure 3). Bioclipse also contains various wizards for the creation of new BioResources, global preferences for customizing the workbench, and a searchable, XML-based help system that keeps the user manual readily available. All of these parts expose extension points, so external plugins can contribute to each of them.
Chemoinformatics
The Chemoinformatics Perspective is a set of Views, Editors, and Menus for molecular management and analysis (Figure 3). Structures are the main data type scientists encounter in chemistry-related fields, and the Chemoinformatics plugins add functionality to Bioclipse for describing chemical structures in various ways.
The CDK-plugin integrates the Chemistry Development Kit (CDK) [13, 14] library into Bioclipse, and also extends the platform with several graphical components. CDK is a freely available open-source library of Java classes for chemo- and bioinformatics, computational chemistry, and chemometrics. It provides methods for many common tasks in molecular informatics, including 2D and 3D rendering of chemical structures, I/O routines for different chemical file formats, SMILES parsing and generation, QSAR descriptor calculation, atom typing, ring searches, isomorphism checking, and structure diagram generation. The CDK data model for chemical structures is used throughout the platform as the internal data structure for representing any kind of molecular data. Bioclipse makes use of the CDK I/O functionality and can read and write the same chemical structure formats as the CDK itself, which currently are XYZ, MDL molfile, PDB, and CML [13, 15]. The CDK-plugin adds two views to the Bioclipse framework: the ChemTreeView, which gives a hierarchical visualization of the CDK data model, and the Structure2DView, which displays 2D structures.
The JChemPaint-plugin provides 2D editing by wrapping the JChemPaint editor for 2D molecular structures. JChemPaint is open source, freely available under the LGPL license (GNU Lesser General Public License), written entirely in Java, and developed by an international team of developers [16]. The JChemPaint editor is used as the main editor for chemical structures in Bioclipse (Figure 2). It is a Multi-Page Editor that shows two tabs with different views of the same object: the first tab (JChemPaint) displays the structure in 2D, and the second (source) shows the molecular data in its original file format. The two tabs are synchronized with each other so that changes in one tab are immediately reflected in the other. The Toolbar and Menu of JChemPaint are directly integrated with the Bioclipse toolbar and menu bar. The plugin has the same feature list as the standalone JChemPaint application, including drawing of bonds and atoms, selection of ring templates, flipping and rotating of selected parts of a molecule, undo/redo functionality, and stereo descriptors.
3D-visualization is provided by the Jmol-plugin, which wraps the open source tool Jmol [17] to provide advanced visualization options for molecules and proteins (Figure 4). Jmol includes a scripting language, and Bioclipse offers a console for entering such scripting commands. An Editor for Jmol scripts is also included; it supports code completion, syntax highlighting, and the execution of scripts.
The CML-plugin provides access to the Jumbo CML (Chemical Markup Language) library, an open-source Java library for handling and representing CML documents and data structures [18]. CML is an XML-based language for chemical data and information, and an extensible basis for chemically aware markup languages. It has a modular structure, consisting of a core part and several extending components. CML shares the general features and advantages of XML: it is data-centric rather than presentation-centric, both human- and machine-readable, platform independent, and able to represent most general data structures. In Bioclipse, CML is used for the internal representation of spectrum data and for the import and export of structures and spectra to and from the CML file format. Additionally, a CML validation plugin checks a given CML file against the CML schema and reports any detected errors and warnings to the user.
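To make the structure of a CML document more concrete, the following minimal sketch reads the atoms of a small, hand-written CML molecule. It assumes only the XML parser that ships with the JDK and does not use the Jumbo library or any Bioclipse API; the class name `CmlSketch`, the method `elementTypes`, and the sample document are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; uses the JDK XML parser, not Jumbo. */
final class CmlSketch {

    /** XML namespace used by CML elements. */
    static final String CML_NS = "http://www.xml-cml.org/schema";

    /** A small CML molecule (water), written by hand for this example. */
    static final String WATER =
          "<molecule xmlns=\"" + CML_NS + "\" id=\"water\">"
        + "<atomArray>"
        + "<atom id=\"a1\" elementType=\"O\"/>"
        + "<atom id=\"a2\" elementType=\"H\"/>"
        + "<atom id=\"a3\" elementType=\"H\"/>"
        + "</atomArray>"
        + "<bondArray>"
        + "<bond atomRefs2=\"a1 a2\" order=\"1\"/>"
        + "<bond atomRefs2=\"a1 a3\" order=\"1\"/>"
        + "</bondArray>"
        + "</molecule>";

    /** Returns the elementType attribute of every CML atom, in document order. */
    static List<String> elementTypes(String cml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // CML elements live in an XML namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(cml.getBytes(StandardCharsets.UTF_8)));

            NodeList atoms = doc.getElementsByTagNameNS(CML_NS, "atom");
            List<String> types = new ArrayList<>();
            for (int i = 0; i < atoms.getLength(); i++) {
                Element atom = (Element) atoms.item(i);
                types.add(atom.getAttribute("elementType"));
            }
            return types;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not read CML input", e);
        }
    }
}
```

Calling `CmlSketch.elementTypes(CmlSketch.WATER)` walks the `atomArray` of the sample document. Because CML is plain XML, any standard XML tool can process it in this way, which is the property the paragraph above refers to.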
The CMLRSS-plugin provides tools for CML-enriched news and blog feeds, supporting the RSS 1.0, RSS 2.0, Atom 0.3, and Atom 1.0 formats [19]. The Bioclipse CMLRSS View automatically extracts CML from the feeds, and the resulting resources can be visualized and manipulated directly in Bioclipse. This gives easy access to chemical information published on the web and in databases.
Bioinformatics
The Bioinformatics Perspective is a collection of Views, Editors, and Menus for loading, parsing, visualizing, editing, converting, and saving sequences/proteins in various formats (Figure 4). Sequence management is provided by BioJava [20], an open-source framework for processing biological data including methods for manipulating biological sequences, file parsers, biological databases, and data analysis routines. A Sequence Viewer can visualize sequences along with SwissProt features. For 3D-visualization of macromolecules, Jmol is also utilized in the Bioinformatics Perspective.
Web services
A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. In practice, the term usually refers to services that use SOAP-formatted [21] XML envelopes and have their interfaces described by the Web Services Description Language (WSDL) [22]. It is becoming increasingly popular for organizations and companies in bioscience to offer such services, either to provide data access to a public repository or to invoke remote procedures on a networked computer [23, 24].
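As an illustration of what a SOAP-formatted XML envelope looks like, the sketch below assembles a SOAP 1.1 request as a string. The envelope namespace is the standard SOAP 1.1 one; everything else (the class `SoapSketch`, the operation name, the service namespace, and the parameter) is a hypothetical placeholder and is not taken from the WSDL of any real service. In a real client the envelope is normally generated by a SOAP toolkit from the WSDL rather than written by hand.

```java
/** Hypothetical helper that builds a minimal SOAP 1.1 request envelope. */
final class SoapSketch {

    /** Namespace of the SOAP 1.1 envelope. */
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Escapes characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * Builds an envelope whose body calls one operation with one parameter.
     * The operation and parameter names are assumed to be valid XML names.
     */
    static String envelope(String operation, String serviceNs,
                           String paramName, String paramValue) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\">\n"
             + "  <soapenv:Body>\n"
             + "    <ns:" + operation + " xmlns:ns=\"" + escape(serviceNs) + "\">\n"
             + "      <" + paramName + ">" + escape(paramValue)
             + "</" + paramName + ">\n"
             + "    </ns:" + operation + ">\n"
             + "  </soapenv:Body>\n"
             + "</soapenv:Envelope>\n";
    }
}
```

For example, `SoapSketch.envelope("getEntry", "http://example.org/hypothetical", "query", "P12345")` yields a document whose `Body` element carries the remote call. The client posts this document to the service endpoint over HTTP and receives a response envelope of the same shape.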
Bioclipse is equipped with a plugin that allows Web services to be easily integrated into the workbench. The first service integrated in this way was the WSDbfetch Web service at the European Bioinformatics Institute, which can return entries from various biological databases [25]. Bioclipse contains a wizard for this service that enables the user to retrieve entries such as PDB files and sequences in various formats. The retrieved data is then stored in a virtual folder in the BioResource Navigator, parsed, and treated like any other BioResource. If the data format is unknown, the data is stored as plain text.
Spectrum analysis
Compound identification, structure elucidation, and purity control are common tasks in chemistry and biology. Computers can greatly assist in these processes by providing methods for the collection, organization, normalization, and analysis of the data obtained [26]. The Spectrum-plugin provides various graphical and non-graphical tools and methods for spectrum visualization, analysis, and manipulation. The plugin contributes the Spectrum perspective, which consists mainly of three views with dedicated methods and actions:
- The Spectrum chart views, which use the JFreeChart package [27] for visualization of spectral information (either peak or continuous data). Step-less zoom in/out of the spectrum is possible, as well as setting display properties via the context menu.
- The Metadata View, which displays the stored spectrum meta data in an editable format.
- The PeakTable View, which displays existing peaks in an editable table and gives the user the ability to add, edit, and delete peaks.
The Spectrum plugin comes with routines for importing and exporting data in the CML and JCAMP-DX formats, as well as a wizard for the creation of new resources in both formats [18, 28]. If continuous data exists for a spectrum, a peak picking action is available that automatically extracts it into a peak spectrum. Methods that help the user interpret spectra, such as the calculation of integrals, and comparative views that simplify the direct comparison of different spectra will be included in the Spectrum plugin in a future version. An additional plugin for the assignment of structural data to spectral data and vice versa is already in development.
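The idea behind peak picking can be sketched with a simple local-maximum search over the intensity values of a continuous spectrum. This is not the algorithm used by the Spectrum plugin, which the text does not describe; the class `PeakPickingSketch`, the method `pickPeaks`, and the intensity threshold are assumptions made only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical local-maximum peak picker; not the Bioclipse implementation. */
final class PeakPickingSketch {

    /**
     * Returns the indices of all points that are strictly higher than both
     * neighbours and at least as high as the given threshold.
     * The first and last points are never reported, and flat-topped
     * plateaus are ignored by this simple rule.
     */
    static List<Integer> pickPeaks(double[] intensity, double threshold) {
        List<Integer> peaks = new ArrayList<>();
        for (int i = 1; i < intensity.length - 1; i++) {
            boolean higherThanLeft = intensity[i] > intensity[i - 1];
            boolean higherThanRight = intensity[i] > intensity[i + 1];
            if (higherThanLeft && higherThanRight && intensity[i] >= threshold) {
                peaks.add(i);
            }
        }
        return peaks;
    }
}
```

Given the intensities `{0, 1, 5, 1, 0, 2, 0.5, 8, 3}`, a threshold of 1.5 reports the points at indices 2, 5, and 7, while a threshold of 3 drops the small maximum at index 5. Measured spectra are noisy, so a practical implementation would typically smooth the data or apply a noise-dependent threshold first.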
Scripting
Bioclipse includes a plugin for creating scripts based on the Mozilla Rhino engine [29]. Rhino is an open-source implementation of JavaScript, written in Java, that can be embedded into Java applications to provide scripting to end users. The plugin allows tasks to be automated and new functionality to be created through scripts that can interact with the GUI, the object model, and the features of all installed plugins. Bioclipse is not limited to a single scripting language, and we expect others to be integrated in future releases.
Sample data
Bioclipse comes with a plugin for installing sample data including molecules, proteins, sequences, spectra, and scripts in various file formats. Another plugin containing many different organic chemical structures is also included. Installation actions for the available data collections are accessible from the main menu.