The BGFit web application serves as both (i) an automated fitting tool for experimental data, using an extensible set of dynamic models through a distributed architecture; and (ii) a data repository that stores and manages experimental data.
The data modeling features allow users to choose a dynamic model and estimate the parameters that best describe the dataset. With this information BGFit simulates the estimated curve and presents the results in a chart along with the original dataset and goodness-of-fit measures.
This automated process can be applied to a single dataset or to a collection that aggregates similar or complementary data, such as replicates of an experiment. This provides both a global view of the aggregated data and fine control over specific measurements.
BGFit’s repository of dynamic models allows users to apply their own models, as well as take advantage of an existing and expandable set of contributed models, each contributing to a richer environment. With this functionality it is possible to compare the results of different fittings in a single dashboard. The models currently implemented are the Baranyi, Gompertz, Logistic and Richards models [2, 8], first- and second-order polynomial regression, exponential decay, Lumry-Eyring - LENP type Ib (ODE) [9] for modeling the kinetics of irreversible protein aggregation, the Hyperbolastic growth model of type III (H3) [10] and the Live Cell Fraction model [11, 12]. To complement the dynamic modeling feature, users can also apply manual regression to the data, traditionally performed as a linear fitting in logarithmic scale.
While not intended to be exhaustive, this list implements a wide set of algebraic and differential models that are used in many areas of research and serves as a basis for future expansions by users.
The data-management features support the modeling process and facilitate collaboration by creating a central point of access. One of the motivations for this application is the need for a better collaboration workflow, one that avoids exchanging files through traditional methods such as emails and shared folders. Thus, BGFit features hierarchical data storage in which users can define their own teams and assign read/write permissions accordingly. Additionally, a project can be given public scope, allowing the data to be openly shared and published online.
All the input data and results, such as the time series, estimated parameters, model simulations and charts, are available for direct download for further analysis.
The entire source code for BGFit and for the implemented models is available online, as are the instructions to set up a fully functional local installation. This addresses data confidentiality by allowing each laboratory to keep a local BGFit version for private projects.
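The manual regression mentioned above, a linear fit in logarithmic scale, can be illustrated with a minimal sketch. This is not BGFit's code: the function name `log_linear_fit` and the synthetic data are assumptions made for illustration, and the fit is an ordinary least-squares line through the points (t, ln y).

```python
import math


def log_linear_fit(times, values):
    """Least-squares line through (t, ln(y)); returns (slope, intercept).

    Hypothetical helper for illustration; not part of BGFit.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_t
    return slope, intercept


# Synthetic exponential-phase data (assumed values): y = 0.05 * exp(0.4 * t)
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [0.05 * math.exp(0.4 * t) for t in times]

# The slope estimates the specific growth rate; exp(intercept) the initial value.
rate, log_y0 = log_linear_fit(times, values)
```

On noise-free synthetic data such as this, the slope recovers the rate used to generate the series; on real measurements the fit would normally be restricted to the exponential phase.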
Architecture and data structure
BGFit is developed using open-source frameworks and free libraries, which allows for a high degree of flexibility and yields a modular system composed of Ruby on Rails, MySQL, Octave, MathJax and Google Chart Tools.
The application is designed using a model-view-controller architecture that separates data management from dynamic modeling, which is performed by extensions that are decoupled from the web application.
The modeling extensions only need to implement the necessary interface and be deployed at a location that is accessible by BGFit. This approach allows every component of BGFit to be deployed online, encouraging collaboration and the reuse of these tools. BGFit can also be used in a local installation while keeping access to all the developed models.
Input data is stored in a hierarchical organization with three layers. The top-level layer, project, defines global properties for the project, such as user permissions and whether it is publicly available. The middle layer, experiments, aggregates the different results in folders. The bottom layer, measurements, holds the user’s actual data and can store 3-dimensional annotated data, although only the first two dimensions are currently used by the modeling extensions (Figure 1). BGFit represents a central repository for data, models and fittings.
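The three-layer organization can be pictured with a small data-structure sketch. It is illustrative only and does not reflect BGFit's actual Ruby on Rails schema: the class names, field names and the per-user permission map (a simplification of the team-based permissions described earlier) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Measurement:
    """Bottom layer: the user's data, up to three dimensions per point."""
    name: str
    points: List[Tuple[float, float, Optional[float]]] = field(default_factory=list)

    def modeling_series(self) -> List[Tuple[float, float]]:
        # Only the first two dimensions are passed on to the modeling extensions.
        return [(x, y) for x, y, _ in self.points]


@dataclass
class Experiment:
    """Middle layer: a folder that aggregates measurements."""
    name: str
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class Project:
    """Top layer: global properties such as permissions and public scope."""
    name: str
    public: bool = False
    # Assumed simplification: user name -> "read" or "write".
    permissions: Dict[str, str] = field(default_factory=dict)
    experiments: List[Experiment] = field(default_factory=list)

    def can_read(self, user: str) -> bool:
        return self.public or user in self.permissions

    def can_write(self, user: str) -> bool:
        return self.permissions.get(user) == "write"


# Hypothetical usage with made-up names and values.
m = Measurement("replicate 1", [(0.0, 0.05, None), (1.0, 0.08, None)])
project = Project(
    "demo",
    permissions={"alice": "write", "bob": "read"},
    experiments=[Experiment("growth curves", [m])],
)
```

Keeping permissions and public scope on the project alone mirrors the description above, in which the lower layers inherit access from the project they belong to.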
Modelling extensions
One of the strengths of BGFit is that its set of dynamic models can be easily expanded. Modeling the data in the application is performed through a REpresentational State Transfer (REST) web service that receives a set of parameters as input and returns the function’s result.
The web service should support two functions, along with a baseline measure for comparing different models, e.g., root mean square error (RMSE): 1) parameter estimation, which takes the data points (such as a time series) and a range for each parameter, and outputs the parameters estimated using linear or nonlinear regression; and 2) model simulation, which receives a set of model parameters as input and returns a simulated curve.
A modeling extension must implement these functions to be fully compliant. This approach enforces a strict communication interface but, on the other hand, offers flexibility in how the model is implemented, as it is technology agnostic.
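The two functions and the RMSE baseline can be sketched as plain Python functions, leaving the REST transport aside. This is an illustration under stated assumptions and not the Model Blackbox interface: the function names, the parameter dictionary and the synthetic data are hypothetical, and a coarse grid search over the parameter ranges stands in for the linear/nonlinear regression that a real extension would use. Exponential decay is used because it is one of the models listed earlier.

```python
import math
from itertools import product


def model(t, a, k):
    """Exponential decay, y = a * exp(-k * t)."""
    return a * math.exp(-k * t)


def simulate(params, times):
    """Model simulation: parameters in, simulated curve out."""
    return [model(t, params["a"], params["k"]) for t in times]


def rmse(observed, predicted):
    """Root mean square error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def estimate(times, values, ranges, steps=51):
    """Parameter estimation by exhaustive grid search within the given ranges.

    `ranges` maps each parameter name to a (low, high) pair. Grid search is a
    stand-in chosen for brevity, not the method used by BGFit's extensions.
    """
    names = sorted(ranges)
    grids = []
    for name in names:
        lo, hi = ranges[name]
        grids.append([lo + (hi - lo) * i / (steps - 1) for i in range(steps)])
    best_params, best_error = None, math.inf
    for combo in product(*grids):
        params = dict(zip(names, combo))
        error = rmse(values, simulate(params, times))
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Synthetic data generated from known parameters (assumed values).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = simulate({"a": 2.0, "k": 0.5}, times)
params, error = estimate(times, values, {"a": (0.0, 4.0), "k": (0.0, 1.0)})
```

Because both functions exchange only plain numbers and a parameter dictionary, the same contract could be exposed over HTTP by an implementation in any language, which is the flexibility the strict interface is meant to preserve.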
The necessary technical documents, templates and examples are fully described in the Model Blackbox public repository (https://github.com/averissimo/model_blackbox), providing a starting point for users to create and implement their own interface-compliant models.
The available templates offer two approaches, implemented in the Octave and Matlab numerical computing environments: either as a script for Octave, which makes it possible to deploy the modeling extensions without any licensing issues, or as a standalone application for Matlab, which takes advantage of SBToolbox2 [13] functions.