bioWeb3D allows the user to represent any 3D dataset in their browser by defining only two files. Both files can be formatted as JSON or XML, two widely used structured formats on the web [11, 12], or directly as Comma Separated Values (CSV) files.
The first file used by the application, referred to as the “dataset file”, contains the coordinates of every point in the dataset. The second type of file used, the “information layer” file, describes one or several information layers that are associated with the points defined in the first file. For example, if each point defines the location of a cell within a tissue, the second file could describe whether a particular gene is expressed in each cell. That way the tissue expression profile can be represented in the spatial context of the tissue.
Datasets can be viewed and compared in up to four “worlds” at the same time, where each world is a separate visualisation sub-window. Although browser based, the application is written entirely in JavaScript and does not need to send any data to the host server. Instead, it reads local files through the HTML5 FileReader API available in modern browsers. This allows the application to load large datasets quickly while ensuring that the privacy of the data is maintained.
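The client-side reading step can be sketched as follows. This is a minimal illustration of the standard FileReader API, not the code used in bioWeb3D; the helper name `readLocalFile` and the element id mentioned in the usage comment are assumptions made for this example.

```javascript
// Read a user-selected File entirely in the browser; nothing is uploaded.
// `readLocalFile` is a hypothetical helper name, not part of bioWeb3D.
function readLocalFile(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();               // standard HTML5 File API
    reader.onload = () => resolve(reader.result);  // file contents as a string
    reader.onerror = () => reject(reader.error);
    reader.readAsText(file);
  });
}

// Browser usage, assuming a file input element with the hypothetical id
// "dataset-input" exists on the page:
// document.getElementById("dataset-input").addEventListener("change", (event) => {
//   readLocalFile(event.target.files[0]).then((text) => console.log(text.length));
// });
```

Because the file contents are handed to the page as a string, parsing can then proceed locally without any request to the server.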
Although the focus is on making bioWeb3D simple and easy to use, some options are available to customise how datasets are represented. The application can be used to visualise sequential information, such as 3D protein structures, in which case a solid line can be drawn between the points. In other situations, such as when a population of cells is considered, the points are viewed as individual particles. The information layers are visualised by colouring the 3D points according to the class that each point belongs to.
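As an illustration of class-based colouring, the sketch below assigns one colour per point from its class. It assumes that classes are integers from 0 to numClass - 1 and that hues are spread evenly around the colour wheel; the palette actually used by bioWeb3D is not described here, and `coloursForClasses` is a hypothetical name.

```javascript
// Map each point's class index to a CSS colour string.
// Assumption: classes are integers 0..numClass-1 and hues are evenly spaced;
// this is not necessarily the palette used by bioWeb3D.
function coloursForClasses(values, numClass) {
  return values.map((cls) => {
    const hue = Math.round((360 * cls) / numClass);
    return `hsl(${hue}, 100%, 50%)`;
  });
}

// Three points, two classes: points 1 and 3 share a colour.
const exampleColours = coloursForClasses([0, 1, 0], 2);
```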
Technological overview
bioWeb3D is written entirely in HTML and JavaScript. It relies heavily upon a relatively recent 3D JavaScript library called Three.js [13]. This library is used as the main interface between JavaScript and WebGL, a cross-platform, royalty-free web standard for a low-level 3D graphics API [14]. More specifically, bioWeb3D allows the generation and manipulation of simple Three.js objects. The primary challenge in creating bioWeb3D has been to design the interactions between the 3D visualisation and the user interface as efficiently as possible.
The 3D data are rendered using simple 2D quadrilaterals positioned in the 3D space according to their coordinates. This simple technique has been selected to keep bioWeb3D as light-weight as possible whilst ensuring good quality visualisation performance and fluidity.
Listing 1 JSON dataset file
Defining the input file formats
JSON is the recommended input format for bioWeb3D because of its rigorous structure and its fast object generation, which is built directly into the JavaScript interpreters of all major internet browsers. Compared to other data-interchange languages, such as XML, JSON is also easily human readable thanks to its light-weight syntax. However, some applications might output data only in XML and not in JSON, as the latter is generally more web oriented. For this reason bioWeb3D also accepts XML as an input format.
Furthermore, much data generated in the biological sciences is stored within CSV files. Converting CSV documents to the JSON or XML format is not always trivial. In order to facilitate this process, the application is also able to directly render simple CSV files that follow a certain format as an input.
Dataset file specification
When the user adds a new Dataset file, a new Dataset section is created in the “Data” panel of the application. Each dataset file contains one dataset.
JSON format
The dataset file should have a root object called “dataset” which contains:
- The “name” property of the dataset (e.g., “my dataset”);
- The “chain” parameter, which should be set to true if the points are connected (the default value is false). When it is true, the data are considered sequentially, with each point connected by a solid line to the previous and next points according to its order in the dataset file;
- The “points” property, which is a two-dimensional array representing a list of (x,y,z) vectors that define the coordinates of the points.
Listing 1 is an example of a minimal three-point dataset file.
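The structure described above can be checked programmatically. The following sketch is an illustration only: the helper `parseJsonDataset` is hypothetical, and the name and coordinates in the sample text are made-up values that follow the described structure rather than a reproduction of Listing 1.

```javascript
// Illustrative dataset text following the structure described above.
// The name and coordinates are made-up values, not those of Listing 1.
const exampleDatasetText = `{
  "dataset": {
    "name": "my dataset",
    "chain": false,
    "points": [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
  }
}`;

// Hypothetical helper: parse a JSON dataset file and check its structure.
function parseJsonDataset(text) {
  const dataset = JSON.parse(text).dataset;
  if (!dataset || !Array.isArray(dataset.points)) {
    throw new Error('Expected a root "dataset" object with a "points" array');
  }
  for (const point of dataset.points) {
    const valid = Array.isArray(point) && point.length === 3 &&
      point.every((c) => typeof c === "number" && Number.isFinite(c));
    if (!valid) throw new Error("Each point must be an [x, y, z] array of numbers");
  }
  return {
    name: dataset.name,
    chain: dataset.chain === true, // false when the parameter is omitted
    points: dataset.points,
  };
}

const exampleDataset = parseJsonDataset(exampleDatasetText);
```

Rejecting malformed points at load time keeps later rendering code free of per-point checks.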
Listing 2 XML dataset file
XML format
The dataset XML format is very similar to the previously defined JSON format. The file must have a root element called “<dataset>” which contains:
- The “<name>” property of the dataset (e.g., “my dataset”);
- The “<chain>” parameter, which should be set to true if the points are connected (the default value is false). When it is true, the data are considered sequentially, with each point connected by a solid line to the previous and next points according to its order in the dataset file;
- The “<points>” property, which contains all the single “<point>” elements that define the dataset. Each “<point>” has three properties to define its spatial location, namely “<x>”, “<y>” and “<z>”.
Listing 2 contains the same minimal dataset as Listing 1 but formatted in XML.
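To make the correspondence between the two formats concrete, the sketch below writes a dataset object in the XML layout described above. It is an illustration under stated assumptions: `datasetToXml` and `escapeXml` are hypothetical helpers, the indentation is arbitrary, and the “<chain>” element is assumed to hold the literal text true or false.

```javascript
// Escape the characters that are not allowed in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: serialise { name, chain, points } to the XML layout.
// Assumption: <chain> contains the literal text "true" or "false".
function datasetToXml(dataset) {
  const points = dataset.points
    .map(([x, y, z]) => `    <point><x>${x}</x><y>${y}</y><z>${z}</z></point>`)
    .join("\n");
  return [
    "<dataset>",
    `  <name>${escapeXml(dataset.name)}</name>`,
    `  <chain>${dataset.chain === true}</chain>`,
    "  <points>",
    points,
    "  </points>",
    "</dataset>",
  ].join("\n");
}

// Made-up coordinates, for illustration only.
const exampleDatasetXml = datasetToXml({
  name: "my dataset",
  chain: false,
  points: [[0, 0, 0], [1, 0, 0], [0, 1, 0]],
});
```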
CSV format
Each line represents a point and the three coordinates on each line must be separated by “comma” characters.
As an example, Listing 3 carries the same information as the JSON file in Listing 1. Note that, although the spatial information remains the same, it is not possible to set a name or to connect the points within a CSV input file.
Listing 3 CSV dataset file
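Reading the CSV dataset format amounts to splitting each line on commas. The sketch below is separate from Listing 3 and is only an illustration: `parseCsvPoints` is a hypothetical helper, and skipping blank lines and trimming whitespace around fields are assumptions rather than documented behaviour.

```javascript
// Hypothetical helper: turn CSV text into an array of [x, y, z] points.
// Assumption: blank lines are skipped and whitespace around fields is ignored.
function parseCsvPoints(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line, index) => {
      const coords = line.split(",").map((field) => {
        const trimmed = field.trim();
        return trimmed === "" ? NaN : Number(trimmed); // empty fields are invalid
      });
      if (coords.length !== 3 || coords.some((c) => !Number.isFinite(c))) {
        // index counts non-empty lines only
        throw new Error(`Non-empty line ${index + 1}: expected three numeric coordinates`);
      }
      return coords;
    });
}

// Made-up coordinates, for illustration only.
const exampleCsvPoints = parseCsvPoints("0,0,0\n1,0,0\n0,1,0\n");
```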
Information layer file specification
The Information layer file contains information about the points described in the Dataset file. The information in this file has to be given in the same order as the points defined in the Dataset file.
JSON format
The information layer files must have a root element named “information”. Since one information file can define multiple information sets, the structure below “information” is a list. Each element of the list is structured as follows:
- The “name” property (optional);
- The “numClass” property, which indicates the number of different classes the data will be assigned to;
- The “labels” property, which defines a list of names for the “numClass” classes previously defined (optional);
- The “values” property, which defines the class of each point in the dataset. As points do not have individual IDs, this property must be in the same order and have the same length as the points defined in the dataset file.
For example, returning to the three points defined in Listing 1, two information layers could correspond to:
Listing 4 JSON information layer file
In this case the Information layer file would look like Listing 4.
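Because points carry no identifiers, the main consistency requirement is that each layer lists exactly one class per point. The sketch below checks this; it is an illustration only. The helper `parseInformationLayers` is hypothetical, the sample values are made up and are not those of Listing 4, and encoding classes as small integers is an assumption.

```javascript
// Illustrative information layer text with two layers for three points.
// Names, labels and class values are made up, not those of Listing 4;
// encoding classes as small integers is an assumption.
const exampleInformationText = `{
  "information": [
    { "name": "layer A", "numClass": 2, "labels": ["off", "on"], "values": [0, 1, 1] },
    { "numClass": 3, "values": [0, 1, 2] }
  ]
}`;

// Hypothetical helper: parse the layers and check them against the dataset size.
function parseInformationLayers(text, pointCount) {
  const layers = JSON.parse(text).information;
  if (!Array.isArray(layers)) throw new Error('Expected a root "information" list');
  layers.forEach((layer, i) => {
    if (!Array.isArray(layer.values) || layer.values.length !== pointCount) {
      throw new Error(`Layer ${i}: "values" must list one class per point`);
    }
    if (new Set(layer.values).size > layer.numClass) {
      throw new Error(`Layer ${i}: more distinct classes than "numClass"`);
    }
    if (layer.labels !== undefined && layer.labels.length !== layer.numClass) {
      throw new Error(`Layer ${i}: "labels" must name each of the "numClass" classes`);
    }
  });
  return layers;
}

const exampleLayers = parseInformationLayers(exampleInformationText, 3);
```

The check counts distinct class values instead of testing a numeric range, so it does not depend on how the classes are encoded.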
XML format
The information layer XML format is very similar to the previously defined JSON format. The information layer files must have a root element named “<information>”. Since one information file can define multiple information sets, the structure below “<information>” is a list of “<set>” elements. Each “<set>” element is structured as follows:
- The “<name>” property (optional);
- The “<numClass>” property, which indicates the number of different classes the data will be assigned to;
- The “<labels>” property, which contains as many individual “<label>” properties as the number of different classes. Each “<label>” defines the name of one class (optional);
- The “<values>” property, which contains all the single “<value>” properties, each one defining the class of one point in the dataset. As points do not have individual IDs, the “<value>” properties must appear in the same order as, and be equal in number to, the points defined in the dataset file.
Listing 5 XML information layer file
Listing 6 CSV information layer file
Listing 5 carries the exact same information as Listing 4.
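The mapping from the JSON structure to the XML structure can be sketched as a small serialiser. This is an illustration, not the paper's Listing 5: `informationToXml` is a hypothetical helper, the indentation is arbitrary, and optional elements are simply omitted when the corresponding property is absent.

```javascript
// Hypothetical helper: serialise a list of information layers to the XML
// layout described above. Optional <name> and <labels> are omitted if absent.
function informationToXml(layers) {
  const escapeXml = (text) =>
    String(text).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  const sets = layers.map((layer) => {
    const lines = ["  <set>"];
    if (layer.name !== undefined) lines.push(`    <name>${escapeXml(layer.name)}</name>`);
    lines.push(`    <numClass>${layer.numClass}</numClass>`);
    if (layer.labels !== undefined) {
      const labels = layer.labels.map((l) => `<label>${escapeXml(l)}</label>`).join("");
      lines.push(`    <labels>${labels}</labels>`);
    }
    const values = layer.values.map((v) => `<value>${escapeXml(v)}</value>`).join("");
    lines.push(`    <values>${values}</values>`, "  </set>");
    return lines.join("\n");
  });
  return ["<information>", ...sets, "</information>"].join("\n");
}

// Made-up layer values, for illustration only.
const exampleInformationXml = informationToXml([
  { name: "layer A", numClass: 2, labels: ["off", "on"], values: [0, 1, 1] },
  { numClass: 3, values: [0, 1, 2] },
]);
```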
CSV format
Each column represents the class to which a point belongs. The separation character between columns must be a “comma”. Listing 6 carries the same information as Listing 4. Note that it is not possible to use the “labels” or “name” properties available in Listing 4 within a CSV information layer file.