The multiMarker R package depends on R (\(\ge\) 3.0), and on two further packages: truncnorm (v1.0-8) [15] and ordinalNet (v2.9) [16, 17]. The source code for the R package and a reference manual, containing detailed information on its usage, are available [18].
In the R package, multiMarker and predict.multiMarker are the two main functions. The multiMarker function infers the relationship between multiple biomarkers and food intake; the main arguments of this function are:
- y: a matrix storing P biomarker measurements on a set of n observations (dimension: \(n \times P\));
- quantities: a vector storing the food quantities allocated to each of the n observations in the intervention study data (length: n);
- niter: the number of MCMC iterations;
- burnIn: the number of MCMC iterations to be discarded prior to computing posterior estimates.
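As a rough illustration of how these four arguments fit together, the sketch below builds a small synthetic data set in base R and passes it to multiMarker. The sample size, the number of biomarkers, the toy linear relationship and the values chosen for niter and burnIn are all assumptions made for the example, not recommendations; the model call is skipped if the package is not installed.

```r
# Synthetic intervention data: n observations, P biomarkers (all values hypothetical).
set.seed(1)
n <- 30
P <- 3
quantities <- rep(c(50, 100, 150), each = n / 3)  # food quantity allocated to each observation

# Assumed toy relationship: each biomarker is roughly linear in the quantity, plus noise.
alpha <- c(2, 5, 1)
beta  <- c(0.05, 0.10, 0.02)
y <- sapply(seq_len(P), function(p) alpha[p] + beta[p] * quantities + rnorm(n, sd = 0.5))
colnames(y) <- paste0("biomarker", seq_len(P))

# y must be n x P and quantities must have length n.
stopifnot(nrow(y) == length(quantities), ncol(y) == P)

# Fit only if the package is available; niter and burnIn are illustrative values.
if (requireNamespace("multiMarker", quietly = TRUE)) {
  fit <- multiMarker::multiMarker(y = y, quantities = quantities,
                                  niter = 10000, burnIn = 3000)
  print(class(fit))
}
```

The dimension check before the call is a cheap way to catch a transposed matrix or a quantities vector of the wrong length before any MCMC iterations are run.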
Note that the multiMarker method is independent of both the units of measure of the P biomarkers and the types of biofluid from which they are measured. The user can freely set the units of measure, and decide on the biofluids from which to derive the biomarker measurements. Furthermore, the biomarkers can be quantitative or expressed as relative quantities. Model hyperparameters are computed from the observed data, as described in [12]. However, users can specify different values using additional arguments of the multiMarker function (see [18]). The output of this function is an object of class multiMarker, storing posterior estimates and MCMC chains for the model parameters and latent intakes.
Function predict.multiMarker facilitates prediction of intake values from biomarker data alone; its main arguments are:
Usage of predict.multiMarker is conditional on the prior estimation of a multiMarker model using data from an intervention study. Moreover, biomarkers considered for prediction should correspond to those of the intervention study, and should be ordered in the same way.
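The ordering requirement can be checked before prediction. The following is a minimal base-R sketch, assuming the biomarker columns carry names; align_biomarkers is a hypothetical helper written for this example and is not part of the multiMarker package.

```r
# Hypothetical helper: align new biomarker data to the columns used at estimation time.
align_biomarkers <- function(y_new, training_names) {
  missing_cols <- setdiff(training_names, colnames(y_new))
  if (length(missing_cols) > 0) {
    stop("Missing biomarkers: ", paste(missing_cols, collapse = ", "))
  }
  # Reorder (and drop any extra columns) so the layout matches the intervention data.
  y_new[, training_names, drop = FALSE]
}

# Column names of the intervention study matrix (illustrative).
training_names <- c("biomarker1", "biomarker2", "biomarker3")

# New measurements supplied in a different column order (made-up values).
y_new <- matrix(c(4.1,  9.8, 2.2,
                  6.0, 14.9, 3.1),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("biomarker2", "biomarker3", "biomarker1")))

y_aligned <- align_biomarkers(y_new, training_names)
```

The aligned matrix would then be passed to predict.multiMarker together with the fitted model object; the exact argument names are given in the reference manual [18].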
Importantly, in both functions, distributions of parameter estimates and intake predictions are provided, as well as multiple summary statistics: posterior median, posterior standard deviation and \(95\%\) credible intervals. This directly provides informative quantification of the uncertainty associated with the different quantities of interest, often lacking from food intake predictions. Examples for the two functions are provided, as well as example code to produce synthetic data, diagnostic plots for the model parameters and plots of the inferred intake distributions.
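To make these summary statistics concrete, the sketch below computes them from a vector of posterior draws in base R. It assumes an equal-tailed credible interval, which may differ from the construction used inside the package, and the draws are simulated stand-ins rather than output from a fitted model; summarise_draws is a hypothetical helper.

```r
# Summarise posterior draws: median, standard deviation and an equal-tailed credible interval.
summarise_draws <- function(draws, level = 0.95) {
  tail_prob <- (1 - level) / 2
  ci <- quantile(draws, probs = c(tail_prob, 1 - tail_prob), names = FALSE)
  c(median = median(draws), sd = sd(draws), lower = ci[1], upper = ci[2])
}

set.seed(2)
draws <- rnorm(5000, mean = 100, sd = 10)  # stand-in for retained MCMC draws of one intake
intake_summary <- summarise_draws(draws)
print(round(intake_summary, 1))
```

Reporting the interval alongside the point estimate is what allows two predicted intakes with the same median but different spread to be told apart.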
The multiMarker Shiny web application builds on the R package to provide non-R-expert researchers with easy access to multiMarker. The Shiny application can be accessed at https://adiet.shinyapps.io/multiMarker/. The first two pages of the application, “About” and “Instructions”, contain a brief overview of the web application’s scope and structure. The main pages of the application are “1. Model Estimation” and “2. Intake Prediction”. Figure 1 reports two flowcharts, illustrating the overall structure of the Shiny web application.
In “1. Model Estimation”, users can upload data from an intervention study. Two different data formats are supported: .csv and .txt. Such data should consist of a matrix with n rows and \((1 + P)\) columns, with the following structure:
Exploratory tools are provided, such as a table containing descriptive summary statistics for the biomarker data, and boxplots of each biomarker by food quantity (see 1.1 in Fig. 1). Further, the unit of measure for intakes can be specified. The multiMarker model can be run using the “Run model estimation” button (see 1.2 in Fig. 1), after having specified the number of MCMC iterations and the desired percentage of iterations for burn-in.
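The web application takes burn-in as a percentage, whereas the burnIn argument of the R function is a number of iterations. A minimal sketch of the conversion follows; burn_in_from_percent is a hypothetical helper, and rounding down is an assumption, since how the application itself rounds is not stated here.

```r
# Hypothetical helper: convert a burn-in percentage into a number of iterations.
burn_in_from_percent <- function(niter, percent) {
  stopifnot(niter > 0, percent >= 0, percent < 100)
  floor(niter * percent / 100)  # rounding down is an assumption
}

niter    <- 10000
burnIn   <- burn_in_from_percent(niter, 30)  # iterations discarded
retained <- niter - burnIn                   # iterations used for posterior estimates
```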
Users are provided with tables containing summary statistics for the estimated intercept (\(\alpha _p\)), scale coefficient (\(\beta _p\)) and error standard deviation (\(\sigma _p\)) parameters, for each of the P biomarkers. Further, histograms showing the estimated posterior distributions of these parameters can be produced. The estimated model can be downloaded as an R model object, in .RData file format (see 1.4 in Fig. 1), for future use. Last, diagnostic trace plots for the parameters can be produced (see 1.5 in Fig. 1).
In “2. Intake Prediction”, users can upload biomarker data. As in “1. Model Estimation”, only .csv and .txt are supported. Uploaded data should consist of a matrix with \(n^*\) rows and P columns, storing the P biomarker measurements. Moreover, if “2. Intake Prediction” is run in a different session than “1. Model Estimation”, users should upload the .RData file storing the previously estimated model. Descriptive summary statistics for novel biomarker data can be found in a table on the left side of the page. Predictions can be carried out using the “Run intake prediction” button (step 2.1 in Fig. 1), after having specified the number of MCMC iterations and the desired percentage of iterations for burn-in. Histograms presenting the posterior predictive intake distributions for each one of the \(n^*\) observations can be produced (see 2.2 in Fig. 1). Further, diagnostic trace plots can be accessed, as well as a table with summary statistics for the predicted intakes (see 2.3 in Fig. 1).
All plots produced in “1. Model Estimation” and “2. Intake Prediction” can be downloaded in .png format. Last, an example dataset is pre-loaded, which allows users to explore the web application’s functionality.