The MIENTURNET web tool was implemented in the R programming language (Release 3.4.4, March 2018) for statistical computing and graphics (http://www.rproject.org/). The whole web framework was developed with the shiny package (version 1.2) from RStudio (http://shiny.rstudio.com). shiny is a free, open-source, extensible package for building interactive web interfaces that share analyses and graphics from R. The performance of the shiny package has been widely tested and validated by several successful web applications [18–21]. Additional R packages used to create MIENTURNET include visNetwork (ver. 2.0.4), igraph (ver. 1.2.2) [22], shinyWidgets (ver. 0.4.4), shinyBS (ver. 0.61) and clusterProfiler (ver. 3.8.1) [23].
Data retrieval
Simulated data
We generated simulated expression profiles for 100 samples spanning the 12440 genes available in TargetScan (Release 7.2) [11] and for 100 samples spanning the 14888 genes available in miRTarBase (Release 7.0, September 2017) [24]. In each simulated dataset E, the expression level of each gene g in each sample s was drawn independently from a standard normal distribution, E(g,s)∼N(0,1), and 10 random miRNAs were simulated to be active at different levels of influence: the first repressing its top 100 targets, the second its top 200 targets, the third its top 300 targets, and so on until the last, which repressed its top 1000 targets. In addition, for each simulated dataset, the level of miRNA activity α was increased from 0.3 to 1 in steps of 0.05. The miRNA repression was simulated by reducing the expression levels of the miRNA targets in 50 of the 100 samples. The reduction was set equal to (α + ε), with ε drawn from a standard normal distribution. The expression value for a target g of an active miRNA in an affected sample s is therefore given by E(g,s)=N(0,1)−(α+N(0,1)).
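The simulation scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the datasets: the function name `simulate_dataset`, the representation of targets as gene indices, and the choice of the first `n_affected` samples as the affected ones are all assumptions, since the text does not specify them. Targets shared by several active miRNAs are repressed once per miRNA here, which is also an assumption.

```python
import random

def simulate_dataset(n_genes, n_samples, targets_by_mirna, alpha, n_affected, seed=0):
    """Return a genes x samples matrix (list of lists) of simulated expression.

    targets_by_mirna: one list of target gene indices per active miRNA
    (hypothetical input format). The first n_affected samples are treated
    as the affected ones (an assumption).
    """
    rng = random.Random(seed)
    # Baseline: every E(g, s) is an independent draw from N(0, 1).
    expr = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_genes)]
    # Repression: subtract (alpha + eps), eps ~ N(0, 1), from each target
    # of each active miRNA in the affected samples.
    for targets in targets_by_mirna:
        for g in targets:
            for s in range(n_affected):
                expr[g][s] -= alpha + rng.gauss(0.0, 1.0)
    return expr

# Activity levels from 0.3 to 1 in steps of 0.05 (15 values).
alphas = [round(0.3 + 0.05 * k, 2) for k in range(15)]
```

One dataset per value in `alphas` would then be generated, each with 10 active miRNAs whose target lists have 100, 200, and so on up to 1000 entries.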
Expression profiling
miRNA expression profiles from six different human tissue types were obtained from DASHR 2.0 [25], a comprehensive database of human small non-coding RNA genes and mature products. Repeat measurements of the same tissue were averaged, resulting in one profile for each tissue type, and miRNAs whose expression levels were greater than the 75th percentile of the tissue type distribution were selected as the most tissue-representative miRNAs. Protein expression levels from the same human tissue types, based on immunohistochemistry using tissue microarrays, were downloaded from the Human Protein Atlas version 18.1 [26]. For each tissue type, proteins with high expression levels in that tissue were selected as the most tissue-representative proteins. Note that both definitions ensure that the most tissue-representative miRNAs and proteins are highly expressed in at least one tissue, but they do not require them to be expressed only in that tissue.
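As an illustration of the miRNA selection step, the following Python sketch averages repeat measurements and keeps the miRNAs above the 75th percentile of the resulting tissue profile. The function name and input format are hypothetical, and the percentile is computed with the "inclusive" interpolation method of the standard library, which is an assumption because the text does not state which percentile definition was used.

```python
from statistics import mean, quantiles

def representative_mirnas(replicates):
    """Select the most tissue-representative miRNAs for one tissue.

    replicates: dict mapping a miRNA name to the list of its repeat
    measurements in that tissue (hypothetical input format).
    """
    # One averaged expression value per miRNA.
    profile = {mirna: mean(values) for mirna, values in replicates.items()}
    # Third quartile (75th percentile) of the tissue profile.
    q75 = quantiles(profile.values(), n=4, method="inclusive")[2]
    # Keep miRNAs strictly above the 75th percentile.
    return {mirna for mirna, level in profile.items() if level > q75}
```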
Positive predictive value
To test the performance of MIENTURNET in capturing the most tissue-representative miRNAs, we computed the positive predictive value (PPV) for different tissues, which is defined as [27]:
$$\begin{aligned} PPV = \frac{\text{number of true positives}}{(\text{number of false positives} + \text{number of true positives})} \end{aligned} $$
where a “true positive” is an outcome where the model correctly predicts the positive class, whereas a “false positive” is an outcome where the model incorrectly predicts the positive class. In our analysis, PPV is the fraction of the most tissue-representative miRNAs out of the total number of miRNAs that MIENTURNET identifies as targeting an input list of the most tissue-representative proteins. The ideal value of the PPV in a perfect test is 1 (100%), and the worst possible value is zero. The PPV statistic is often called “precision”, and a small positive predictive value (e.g. PPV < 50%) indicates that more than half of the positive results from the testing procedure are false positives.
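A minimal Python sketch of this computation is given below, assuming that the miRNAs returned by MIENTURNET and the most tissue-representative miRNAs are both available as collections of identifiers. The function and argument names are hypothetical.

```python
def positive_predictive_value(predicted, reference):
    """PPV = TP / (TP + FP) for a set of predicted miRNAs.

    predicted: miRNAs identified by the tool for the input protein list.
    reference: the most tissue-representative miRNAs (the positive class).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    false_positives = len(predicted - reference)
    if true_positives + false_positives == 0:
        raise ValueError("PPV is undefined when nothing is predicted")
    return true_positives / (true_positives + false_positives)
```

For example, if four miRNAs are predicted and two of them belong to the reference set, the function returns 0.5.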
miRNA-target interactions
The predicted miRNA targets and the information about the miRNA family members with their seeds were downloaded from the TargetScan website (Release 7.2, March 2018) [11]. The experimentally validated miRNA-target interactions were downloaded from the miRTarBase website (Release 7.0, September 2017) [24].
Currently, MIENTURNET supports the six organisms common to TargetScan and miRTarBase: Human (Homo sapiens), Mouse (Mus musculus), Rat (Rattus norvegicus), Worm (Caenorhabditis elegans), Fruit fly (Drosophila melanogaster) and Zebrafish (Danio rerio).
All miRNA entries are annotated according to the latest miRBase database (Release 22, March 2018) [28], while all mRNA entries are annotated according to the latest NCBI database (Release 227, August 2018) [29].
Tool description
The flowchart of MIENTURNET is shown in Fig. 1. MIENTURNET is devised for:
- receiving as input a list of genes given as Official Gene Symbols (e.g. PTEN for human, Pten for mouse) and inferring possible evidence (computational or experimental) of miRNA regulation based on a statistical analysis for over-representation of miRNA-target interactions;
- receiving as input a list of mature miRNAs given as miRBase IDs (e.g. hsa-miR-15a-5p) and inferring possible evidence (computational or experimental) of their regulation of target genes based on a statistical analysis for over-representation of miRNA-target interactions.
The resulting miRNA-target interactions are visualized as a network and then analyzed according to their topological features.
MIENTURNET performs a miRNA-target enrichment analysis (Fig. 2a) by calculating the following statistic (i.e. the p-value of the hypergeometric test):
$$p = 1- \sum_{i=0}^{X-1} \frac{\binom{K}{i}\binom{M-K}{N-i}}{\binom{M}{N}} $$
where M is the size of the universe, that is, the number of all predicted (validated) miRNA-target interactions in TargetScan (miRTarBase); N is the length of the input list; K is the number of predicted (validated) miRNA-target interactions in TargetScan (miRTarBase) for a selected gene or miRNA, according to the type of the input list; and X is the number of predicted (validated) miRNA-target interactions in the input list for the selected gene or miRNA.
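The p-value above can be evaluated exactly with integer binomial coefficients, as in the following Python sketch. The function name is hypothetical, and this is not MIENTURNET's own code. The sketch sums the upper tail from X to min(K, N) directly, which is mathematically equivalent to one minus the sum from 0 to X−1 but avoids subtracting two nearly equal numbers when p is very small.

```python
from math import comb

def enrichment_p_value(M, K, N, X):
    """Hypergeometric upper-tail p-value P(overlap >= X).

    M: size of the universe, K: interactions of the selected gene or miRNA,
    N: length of the input list, X: observed overlap (symbols as in the text).
    """
    total = comb(M, N)
    # Sum of hypergeometric terms for i = X, ..., min(K, N); comb returns 0
    # whenever N - i exceeds M - K, so impossible terms vanish on their own.
    upper_tail = sum(comb(K, i) * comb(M - K, N - i) for i in range(X, min(K, N) + 1))
    return upper_tail / total
```

As a small check, for M = 10, K = 4, N = 3 and X = 2 the tail is (6·6 + 4·1)/120 = 1/3.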
MIENTURNET allows users to visualize the resulting miRNA-target interactions as a network that can be filtered, explored, and customized interactively to make it easier to read and interpret (Fig. 2b). For example, when the miRTarBase database is chosen, the results can be filtered according to the evidence categories used by miRTarBase to validate the miRNA-target interactions: ‘Strong’ for strong experimental methods (e.g., Luciferase assay, Western); ‘Weak’ for weaker experimental evidence (e.g., CLIP); ‘Strong and Weak’ for both strong and weak experimental methods. In addition, MIENTURNET computes the topological properties of each node in the miRNA-target interaction network (i.e. degree, betweenness, average shortest path length, eccentricity, clustering coefficient) in order to find nodes with a central role (Fig. 2c). It then estimates the node degree distribution (i.e. the probability distribution of degrees over the whole network) along with its power-law fit, in order to determine whether the network exhibits scale-free behaviour (Fig. 2c), consistent with what has been observed so far in almost all biological networks [30–32]. MIENTURNET also offers the possibility to perform a functional enrichment analysis of the targets of selected miRNAs (Fig. 2d), in order to gain insight into the biological processes underlying the target gene activity. For this analysis, the annotation databases currently available are KEGG pathways [33], Reactome [34], WikiPathways [35] and Disease Ontology [36] (only for Homo sapiens).
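To make the notion of degree distribution concrete, the following Python sketch computes node degrees and the empirical degree distribution from a list of miRNA-target pairs. It is an illustrative sketch only: the function name and the edge-list input format are assumptions, miRNA and gene identifiers are assumed not to collide, and the power-law fit is not reproduced here.

```python
from collections import Counter

def degree_distribution(edges):
    """Return (degree per node, P(k)) for an undirected miRNA-target network.

    edges: iterable of (miRNA, gene) pairs; duplicate pairs are counted once.
    """
    degree = Counter()
    for mirna, gene in set(edges):
        degree[mirna] += 1
        degree[gene] += 1
    n_nodes = len(degree)
    # P(k): fraction of nodes whose degree equals k.
    counts = Counter(degree.values())
    distribution = {k: c / n_nodes for k, c in sorted(counts.items())}
    return dict(degree), distribution
```

In a scale-free network, P(k) plotted against k on log-log axes is expected to fall approximately on a straight line.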
MIENTURNET reports numerous output files containing the results of its analyses (i.e. miRNA-target enrichment analysis, functional enrichment analysis of the targets of selected miRNAs, and network analysis). These are simple tabular files that can be viewed with any spreadsheet application (such as Microsoft Excel). However, browsing these files by eye is not easy, and working with data across multiple files can be difficult and may require nontrivial scripts. MIENTURNET greatly simplifies the data exploration task by creating publication-ready plots for each of the analyses performed.