MHESMMR: a multilevel model for predicting the regulation of miRNAs expression by small molecules

According to the expression of miRNA in pathological processes, miRNAs can be divided into oncogenes or tumor suppressors. Prediction of the regulation relations between miRNAs and small molecules (SMs) becomes a vital goal for miRNA-target therapy. But traditional biological approaches are laborious and expensive. Thus, there is an urgent need to develop a computational model. In this study, we proposed a computational model to predict whether the regulatory relationship between miRNAs and SMs is up-regulated or down-regulated. Specifically, we first use the Large-scale Information Network Embedding (LINE) algorithm to construct the node features from the self-similarity networks, then use the General Attributed Multiplex Heterogeneous Network Embedding (GATNE) algorithm to extract the topological information from the attribute network, and finally utilize the Light Gradient Boosting Machine (LightGBM) algorithm to predict the regulatory relationship between miRNAs and SMs. In the fivefold cross-validation experiment, the average accuracies of the proposed model on the SM2miR dataset reached 79.59% and 80.37% for up-regulation pairs and down-regulation pairs, respectively. In addition, we compared our model with another published model. Moreover, in the case study for 5-FU, 7 of 10 candidate miRNAs are confirmed by related literature. Therefore, we believe that our model can promote the research of miRNA-targeted therapy.


Introduction
As an emerging biomarker for medicine and diagnostics, microRNA (miRNA) is a small, single-stranded, endogenous non-coding RNA molecule [1]. Since Ambros et al. discovered the first miRNA, lin-4, about 28,000 miRNA molecules have been found in animals, plants and some viruses [2,3]. Previously, genomic structure and certain classes of proteins, such as transcription factors and epigenetic mediators, were regarded as the only regulators of gene expression. However, researchers have since revealed the critical role of miRNAs in post-transcriptional regulatory mechanisms. Mature miRNA can bind to the 3'-untranslated region of a target mRNA, which decreases the expression level of the corresponding gene [4]. This also suggests that miRNA expression levels affect multiple cellular functions, such as embryonic development, substance metabolism, signal transduction, cell division and apoptosis [5,6]. In the human body, over 60% of transcription is regulated by miRNAs [7]. Since each miRNA can regulate the expression of many genes, each miRNA can regulate multiple cellular signalling pathways at the same time [8].
Cell activity is inseparable from the post-transcriptional regulation exerted by miRNA. Meanwhile, many studies have indicated that the dysregulation of miRNA is related to disease occurrence, most notably cancer. Either over-expression of carcinogenic miRNAs (oncomiRs) or down-regulation of tumor suppressor miRNAs (TSmiRs) may cause malignant tumours [9,10]. Thus, miRNAs can be regarded as biomarkers for diagnosis [11]. A treatment strategy based on miRNA, called miRNA-targeted therapeutics, has been developed [12,13]. Its main modality is to regulate the expression level of oncomiRs or TSmiRs through SMs. Because of the special tertiary structure of miRNA, SMs can bind to miRNA with high affinity and specificity. For example, Naro et al. discovered the first SM inhibitor of miRNA for suppressing the expression of miR-21 by luciferase-based screening of more than 300,000 small molecules [14]. Miravirsen, an oligonucleotide-based miR-122 inhibitor, has entered clinical trials and is well tolerated in non-human primates, which greatly reduces the burden of HCV and liver cancer [15]. Chandrasekhar et al. identified that aza-flavanones could act as inhibitors of miR-4644, which was helpful to arrest and eliminate human breast tumor cells [16]. Besides, it was long assumed that only proteins can be used as drug targets. But in fact, only about 600 disease-modifying proteins can be targeted by drugs. miRNA-targeted drugs are therefore an important supplement for the pharmaceutical industry [17,18]. In summary, discovering the regulation relations between miRNAs and small molecules has major implications for advancing miRNA-targeted therapeutics and drug development.
So far, the methods for discovering miRNA-targeting SM drugs can be divided into three categories. The first category is the high-throughput screening approach, which uses high-throughput screening techniques to identify SM inhibitors or activators of miRNAs. For example, Zhang et al. presented a method based on miRNA 3D structure to discover miRNA-targeting SMs that can regulate miRNA activity [19]. They utilized MCfold to obtain the miRNA structure. Similar to using AutoDock to calculate the affinities between binding sites and ligands, they computed an RNA-compatible score for SMs by molecular docking-based high-throughput screening techniques. Another category of approaches considers the structure of the RNA base sequence. The best-known case is the Inforna web server developed by Disney et al., which predicts SM-miRNA associations through large-scale motif alignment against databases [20]. The third category is based on fluorescence detection assays. Bose et al. proposed a new method for identifying SMs targeting miRNA in vitro using a molecular beacon [21]. The oligonucleotide hybridization probes are labelled with a fluorophore and a quencher when the beacon binds to the target miRNA. These studies have been instrumental in developing novel miRNA-targeting SM drugs and in repositioning old SM drugs. However, detecting the regulation of miRNA expression by SMs through biological experiments is time-consuming and labor-intensive because the biological data are diverse and voluminous. Therefore, researchers have intensified studies into computational methods that predict the association between SMs and miRNAs, hoping to narrow down the search scope for candidate drugs and accelerate the process of drug development.
In recent years, a series of diverse computational models have been proposed to predict the association between miRNAs and SMs [22]. These miRNA-SM association prediction methods can be divided into two categories. The first category is sequence similarity-based methods. For example, Lv et al. constructed an integrated SM-miRNA association network that combines the miRNA self-similarity network, the SM self-similarity network and the known SM-to-miRNA targeting relationship network [23]. They performed an improved random walk with restart (RWR) algorithm on the integrated SM-miRNA network, which allowed the random walk to learn samples on the various layers of the network. Finally, they ranked miRNAs by their relevance score to each SM, thus screening for potential miRNA-targeting SMs. Jiang et al. leveraged the functional similarity of gene expression profiles under drug treatment and miRNA perturbation for SM-miRNA association prediction [24]. Meng et al. proposed the prediction model RWNS based on a three-layer network including miRNAs, SMs and diseases [25]. They considered multiple functional similarities such as SM chemical structure similarity, disease phenotype-based similarity and miRNA target gene functional consistency-based similarity. The integrated functional similarities were assembled into a three-layer network, on which the random walk algorithm was run. Deepthi et al. proposed a method to predict the relation between SM drugs and miRNAs via a convolutional neural network (CNN). The miRNA similarity network and the SM similarity network were used as the features of miRNAs and SMs. Principal component analysis was used to reduce the dimensions of the features, and the CNN model was trained to extract high-order information. Finally, they used support vector machines to identify potential relations between miRNAs and SMs. Besides, Guan et al. developed an SM-miRNA association prediction model called GISMMA based on graphlet interaction inference [26]. The graphlet interaction aims to describe the complex relationship between the miRNA similarity network and the SM similarity network. By counting the number of 28 types of graphlet interaction isomers, the GISMMA model yields a predicted score for the potential relation between a miRNA and an SM.
The second category is heterogeneous network-based methods. Li et al. presented the SMiR-NBI model to find miRNAs that can be potential biomarkers for anticancer drugs. They constructed the SMiR-NBI model by network-based inference. Specifically, they first initialized the resource scores of miRNAs based on the SM-miRNA adjacency matrix. Then the resource of each miRNA was evenly distributed among the SM drugs directly linked to that miRNA in the network. Similarly, the SM drugs redistributed the resources to adjacent miRNAs after integrating the resources received from adjacent miRNAs. The final resource score of each miRNA represents the probability that it can be used as a biomarker for a certain anti-cancer drug. Wang et al. presented a triple-layer heterogeneous network approach (TLHNSMA) to predict the association between SMs and miRNAs [27]. They exploited the functional similarities and relationships of miRNAs, SMs and diseases to construct a triple-layer network. Then they developed an iterative updating algorithm to propagate information across the three-layer heterogeneous network.
However, these methods have major disadvantages. First, most of the previous methods can only predict whether an SM interacts with a miRNA but cannot predict the direction in which the SM regulates the miRNA. These methods are therefore unable to satisfy the needs of drug development and target selection, because miRNAs may function as oncomiRs or TSmiRs. Thus, the key to advancing miRNA-targeted therapy is to identify the SM modulators that inhibit oncomiRs and activate TSmiRs. Second, since most methods rely on the functional similarity of miRNAs and SMs, they are constrained by complex side information. Therefore, there is an urgent need for an efficient and accurate auxiliary tool for predicting the regulation of miRNAs by SMs.
One of the challenges in predicting the association between miRNAs and SMs is to identify whether their regulatory relationship is up-regulation or down-regulation. To address this challenge, we were inspired by the successful application of the attributed multi-layer heterogeneous network to the prediction of multi-typed associations between miRNAs and diseases [28]. In this study, for predicting the miRNA-SM regulation relation, we introduced an attributed multi-layer heterogeneous network containing miRNA self-similarity and SM self-similarity, and we proposed a novel multilevel model called MHESMMR. The multilevel model is composed of an attributed multi-layer heterogeneous network and network embedding methods. In detail, our proposed model consists of three steps. First, we run the LINE algorithm on the miRNA self-similarity and SM self-similarity networks to generate node features, and then use these node features to construct the attributed multi-layer heterogeneous network of miRNAs and SMs. Second, the GATNE algorithm is used to learn representation features from the attributed multi-layer heterogeneous network. Finally, we feed these features into the LightGBM classifier to identify the probable SM modulators. To evaluate the performance of the proposed model, we test it on the SM2miR dataset under fivefold cross-validation. Furthermore, we compared the proposed model with other node feature extraction methods and machine learning classifiers, and the experimental results indicate that the proposed model is a robust and efficient auxiliary tool for screening SM modulators of miRNAs.

Dataset
In the experiment, we collected data on the regulation relations between SMs and miRNAs from the latest version of the SM2miR database to evaluate the performance of the proposed model [29]. The SM2miR database is a manually curated database that collects numerous effects of SMs on miRNA expression validated by the previous literature. According to the expression patterns of miRNA, the SM2miR data were divided into two parts, up-regulated pairs and down-regulated pairs, which correspond to Dataset 1 and Dataset 2, respectively. After pre-processing, we obtained 541 miRNAs, 831 SM drugs and 2377 miRNA-SM pairs. Among these, 1394 up-regulation pairs belong to Dataset 1 and 983 down-regulation pairs belong to Dataset 2. The known SM-miRNA regulation relation pairs were regarded as positive samples.
In general, we describe a bipartite heterogeneous network of SM-miRNA regulation relations in which SM drugs and miRNAs are represented by nodes, and the relationships between them are represented by edges. Class imbalance may introduce bias into the experimental results. Thus, negative samples equal in number to the positive samples should be selected from the unlabelled samples. In theory, unlabelled samples selected at random may include some potential SM-miRNA relation pairs. To reduce this risk, we carry out a negative sample selection method based on proximity, similar to the one used by Yu et al. for negative sampling [30]. For SM drugs, we generate MACCS fingerprints from SMILES to represent the SM drug chemical structure with the "RDKit" Python package [31,32]. To measure the proximity between SM drugs, we calculate the Tanimoto coefficient, a quantitative measure of fingerprint similarity, based on their MACCS fingerprints.
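As a minimal sketch of the similarity measure named above, the Tanimoto coefficient of two binary fingerprints is the number of shared on-bits divided by the number of bits set in either. The fingerprints below are hypothetical sets of on-bit indices written by hand for illustration; in the study they would come from MACCS keys generated with RDKit, which is not reproduced here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen for this sketch when both fingerprints are empty
    return len(a & b) / union


# Hypothetical on-bit indices (assumption: not real MACCS keys).
fp_sm1 = {3, 17, 42, 88, 120}
fp_sm2 = {3, 17, 42, 90}
fp_sm3 = {5, 60}

sim_12 = tanimoto(fp_sm1, fp_sm2)  # 3 shared bits out of 6 distinct bits
sim_13 = tanimoto(fp_sm1, fp_sm3)  # no shared bits
```

The coefficient ranges from 0 (no shared bits) to 1 (identical fingerprints), which is what makes a fixed cut-off on it meaningful.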
Then, the regulation relation scores between any SM and any miRNA were computed. For example, suppose that the regulation relation between miRNA1 and SM1 is unknown but miRNA1 can be inhibited by SM2, SM3 and SM4. The regulation relation score between miRNA1 and SM1 can be calculated as follows:

$$s = \frac{1}{3}\left(T(\mathrm{SM}_1,\mathrm{SM}_2) + T(\mathrm{SM}_1,\mathrm{SM}_3) + T(\mathrm{SM}_1,\mathrm{SM}_4)\right) \tag{1}$$

where $T(\cdot,\cdot)$ is the Tanimoto coefficient, so that $s$ denotes the mean value of the Tanimoto coefficients of SM1-SM2, SM1-SM3 and SM1-SM4. We computed the regulation relation scores for all unlabelled SM-miRNA pairs in the same way. Only the pairs with a regulation relation score of less than 0.1 were eligible as negative samples. Finally, we selected 1394 negative samples for Dataset1 and 983 negative samples for Dataset2.
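The selection rule described above can be sketched as follows, assuming the pairwise Tanimoto coefficients have already been computed. The function names, the dictionary layout and all similarity values are hypothetical and chosen only for illustration; the final down-sampling to match the number of positive samples is not shown.

```python
from statistics import mean


def pair_score(candidate_sm, known_sms, similarity):
    """Mean similarity between a candidate SM and the SMs known to regulate a miRNA."""
    return mean(similarity[frozenset((candidate_sm, sm))] for sm in known_sms)


def select_negatives(unlabelled_pairs, known_regulators, similarity, threshold=0.1):
    """Keep unlabelled (miRNA, SM) pairs whose score is below the threshold."""
    negatives = []
    for mirna, sm in unlabelled_pairs:
        known = known_regulators.get(mirna, [])
        if not known:
            continue  # assumption of this sketch: skip miRNAs without known regulators
        if pair_score(sm, known, similarity) < threshold:
            negatives.append((mirna, sm))
    return negatives


# Hypothetical similarity values, not taken from the study.
similarity = {
    frozenset(("SM1", "SM2")): 0.05,
    frozenset(("SM1", "SM3")): 0.10,
    frozenset(("SM1", "SM4")): 0.06,
    frozenset(("SM5", "SM2")): 0.40,
    frozenset(("SM5", "SM3")): 0.30,
    frozenset(("SM5", "SM4")): 0.50,
}
known_regulators = {"miRNA1": ["SM2", "SM3", "SM4"]}
unlabelled = [("miRNA1", "SM1"), ("miRNA1", "SM5")]

negatives = select_negatives(unlabelled, known_regulators, similarity)
```

Here SM1 is on average dissimilar to the known regulators of miRNA1 and is kept as a negative, whereas SM5 resembles them and is left unlabelled.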

Node attributes of heterogeneous network by graph embedding
Graph embedding methods provide a distributed representation of network structure and can be divided into three categories: node embedding, edge embedding and substructure embedding. Node embedding maps the nodes to an embedding space so that each node is represented by a vector. Because these node embeddings contain the topological information of the graph, they are effective inputs to machine learning models for downstream classification tasks.
LINE is a graph embedding method based on neighbourhood similarity assumptions proposed by Tang et al., and it is suitable for weighted networks [33]. In a complex network, if two vertices are direct neighbours, they are considered to have first-order proximity. On the other hand, if two vertices share many first-order neighbours, they are considered to have second-order proximity. From these two aspects, the main idea of the LINE algorithm can be divided into two parts.
First-order proximity describes the local similarity in the graph, and LINE with first-order proximity can only be applied to undirected graphs. The joint probability $p_1$ between two vertices $v_i$ and $v_j$ on the edge $e(i,j)$ can be defined as:

$$p_1(v_i, v_j) = \frac{1}{1 + \exp\left(-\vec{u}_i^{T} \cdot \vec{u}_j\right)} \tag{2}$$

where $\vec{u}_i$ and $\vec{u}_j$ are the low-dimensional representation vectors of $v_i$ and $v_j$. It describes the relationship between vertices from the perspective of the embedding space. The distribution $p_1(\cdot,\cdot)$ over the space $V \times V$ is defined by Formula (2), and its empirical probability $\hat{p}_1$ can be defined as:

$$\hat{p}_1(i, j) = \frac{w_{ij}}{W} \tag{3}$$

where $w_{ij}$ denotes the weight of the edge between vertices $v_i$ and $v_j$, and $W$ denotes the sum of all edge weights. The goal of the optimization is to minimize the difference between $p_1$ and $\hat{p}_1$, so the objective function is defined as follows:

$$O_1 = d\left(\hat{p}_1(\cdot,\cdot),\, p_1(\cdot,\cdot)\right) \tag{4}$$

where $d(\cdot,\cdot)$ represents the function used to measure the difference between two distributions. Replacing $d(\cdot,\cdot)$ with the Kullback-Leibler (KL) divergence and omitting constants, the final objective is:

$$O_1 = -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j) \tag{5}$$

Thus, all of the vertices can be represented as $\{\vec{u}_i\}_{i=1\ldots|V|}$ in the d-dimensional space by minimizing the objective function.
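To make the first-order objective concrete, the sketch below evaluates it on a toy weighted graph, assuming the standard LINE formulation in which the joint probability of an edge is the sigmoid of the dot product of the two endpoint vectors and the loss is the negative weighted log-likelihood over edges. The graph and the embedding values are hypothetical, and no optimisation is performed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def first_order_loss(edges, emb):
    """Negative weighted log-likelihood: -sum_ij w_ij * log sigmoid(u_i . u_j)."""
    return -sum(w * math.log(sigmoid(dot(emb[i], emb[j]))) for i, j, w in edges)


# Toy undirected weighted graph: (vertex, vertex, weight). Values are illustrative.
edges = [("a", "b", 1.0), ("b", "c", 0.5)]

# Connected vertices embedded close together versus far apart.
emb_close = {"a": [1.0, 1.0], "b": [1.0, 1.0], "c": [1.0, 1.0]}
emb_far = {"a": [1.0, 1.0], "b": [-1.0, -1.0], "c": [1.0, 1.0]}

loss_close = first_order_loss(edges, emb_close)
loss_far = first_order_loss(edges, emb_far)
```

The loss is lower when linked vertices have similar vectors, which is the behaviour the first-order proximity term is meant to reward.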
LINE also considers the second-order proximity between vertices, and LINE with second-order proximity can be applied to both directed and undirected graphs. For a directed edge $e(i,j)$, the probability of the context vertex $v_j$ being generated by vertex $v_i$ can be defined as:

$$p_2(v_j \mid v_i) = \frac{\exp\left(\vec{u}_j'^{T} \cdot \vec{u}_i\right)}{\sum_{k=1}^{|V|} \exp\left(\vec{u}_k'^{T} \cdot \vec{u}_i\right)} \tag{6}$$

where $|V|$ denotes the number of vertices in the graph and $\vec{u}_j'$ is the representation of $v_j$ when it is treated as a context. The empirical distribution is defined as:

$$\hat{p}_2(v_j \mid v_i) = \frac{w_{ij}}{d_i} \tag{7}$$

where $d_i$ denotes the out-degree of $v_i$ and $w_{ij}$ denotes the weight of the edge $e(i,j)$. In order to make the conditional distribution of contexts $p_2(\cdot \mid v_i)$ given by the low-dimensional representation as close as possible to the empirical distribution $\hat{p}_2(\cdot \mid v_i)$, the objective function can be defined as:

$$O_2 = \sum_{i \in V} \alpha_i \, d\left(\hat{p}_2(\cdot \mid v_i),\, p_2(\cdot \mid v_i)\right) \tag{8}$$

where $\alpha_i$ denotes the prestige of the vertex $v_i$ and is set to the degree of the vertex $v_i$ in this study. As mentioned above, $d(\cdot,\cdot)$ is replaced by the KL divergence. Thus, the final objective is:

$$O_2 = -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i) \tag{9}$$

Finally, each vertex can be represented by a d-dimensional vector $\vec{u}_i$ by finding the $\{\vec{u}_i\}_{i=1\ldots|V|}$ that minimize the objective function. We applied the LINE algorithm to the miRNA self-similarity network calculated by the Tanimoto coefficient. After graph embedding, if the properties of two miRNAs are very similar, their embedding vectors will also be very close. We performed the same operation on the SM self-similarity network.
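The second-order term can be illustrated in the same spirit. The sketch below assumes the standard LINE formulation, in which every vertex has a vertex vector and a separate context vector and the conditional distribution is a softmax over context vectors. It uses the full softmax, which is only practical for tiny graphs (LINE itself relies on negative sampling during training), and all vectors are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def p2(i, emb, ctx):
    """Conditional distribution p2(. | v_i): softmax of ctx_k . emb_i over all vertices k."""
    scores = {k: math.exp(dot(ctx[k], emb[i])) for k in ctx}
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}


def second_order_loss(edges, emb, ctx):
    """Negative weighted log-likelihood: -sum_ij w_ij * log p2(v_j | v_i)."""
    return -sum(w * math.log(p2(i, emb, ctx)[j]) for i, j, w in edges)


# Toy directed weighted edges and hypothetical vertex / context vectors.
edges = [("a", "c", 1.0), ("b", "c", 2.0)]
emb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ctx = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

dist_a = p2("a", emb, ctx)
loss = second_order_loss(edges, emb, ctx)
```

Vertices a and b both point to c, so their conditional distributions put similar mass on the same context; this is how shared neighbourhoods translate into similar vectors.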

Attributed multiplex heterogeneous network embedding
With the development of graph embedding, or network representation learning, exploring non-linear properties has become critically important for extracting topological information from heterogeneous networks. One emerging graph embedding technique is General Attributed Multiplex Heterogeneous Network Embedding (GATNE). The GATNE algorithm aims to integrate the attribute features of the nodes and the multiple relationships between different types of nodes. Furthermore, it can project the information of nodes and non-linear relationships in the network into a relatively low-dimensional representation vector. Figure 1 shows the GATNE algorithm in inductive mode.
We consider a relationship graph $G$ with a set of vertices $V$, a set of edges $E$ and a set of vertex attribute features $A$:

$$G = (V, E, A) \tag{10}$$

If the vertices and edges are of more than one type, $G$ is a multi-layer heterogeneous network represented as $G_r = (V, E_r, A)$, where $r$ denotes the type of relationship between two vertices. In general, GATNE aggregates neighbour information and attribute information from the inductive context to the current vertex and generates feature vectors for each vertex at different layers. GATNE is an inductive learning model that combines two parts: base embedding and edge embedding.
The base embedding of vertex $v_i$ is shared across the different types of edges. The base embedding $b_i$ is calculated by a transformation function defined as follows:

$$b_i = h_z(x_i) \tag{11}$$

where $h_z$ is a transformation function of the attribute feature $x_i$ of vertex $v_i$ and the corresponding vertex type is represented by $z$.

Fig. 1 Illustration of GATNE in inductive mode
In the edge embedding, the initial edge embedding $u^{(0)}_{i,r}$ of each vertex is constructed by a transformation function that takes the vertex attribute features $A$ as input. GraphSAGE is a graph neural network technique based on information aggregation [34]. GATNE borrows the neighbour aggregator of GraphSAGE to aggregate the edge embedding vectors of vertex $v_i$ on layer $r$. The initial edge embedding and the mean aggregator function are as follows:

$$u^{(0)}_{i,r} = g_{z,r}(x_i) \tag{12}$$

$$u^{(k)}_{i,r} = \sigma\left(\hat{W}^{(k)} \cdot \mathrm{mean}\left(\left\{u^{(k-1)}_{j,r},\ \forall v_j \in N_{i,r}\right\}\right)\right), \quad k = 1, \ldots, K \tag{13}$$

where $g_{z,r}$ denotes the transformation function of the z-type vertex $v_i$ in relation $r$, $u^{(K)}_{i,r}$ denotes the K-th level edge embedding after aggregation, $N_{i,r}$ represents the neighbours of vertex $v_i$ in relation $r$, $\sigma$ is an activation function and $\hat{W}^{(k)}$ is a trainable weight matrix. Then the edge embeddings $u_{i,r}$ of vertex $v_i$ over all relations are concatenated as $U_i$ with size s-by-m, where $s$ represents the dimension of the edge embeddings and $m$ the number of relation types:

$$U_i = \left(u_{i,1}, u_{i,2}, \ldots, u_{i,m}\right) \tag{14}$$

A self-attention mechanism is applied to $U_i$ to calculate the coefficients $c_{i,r} \in \mathbb{R}^m$ of the linear combination of the edge embeddings in $U_i$ for relation type $r$:

$$c_{i,r} = \mathrm{softmax}\left(w_r^{T} \tanh\left(W_r U_i\right)\right)^{T} \tag{15}$$

where $w_r$ and $W_r$ are the trainable parameters of relation type $r$, learned within the optimization framework.
In general, the embedding representation vectors of miRNAs and SMs on relation type $r$ are computed by the following joint function:

$$v_{i,r} = b_i + \alpha_r M_r^{T} U_i c_{i,r} \tag{16}$$

where $b_i$ is the base embedding of vertex $v_i$, $\alpha_r$ is the hyper-parameter indicating the proportion of the edge embedding in the entire embedding, and $M_r \in \mathbb{R}^{s \times d}$ is a trainable transformation matrix.
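The way the base embedding and the attention-weighted edge embeddings are combined can be sketched in a few lines. The code below assumes the GATNE form in which attention coefficients are a softmax over relation types of a tanh-transformed projection of the edge embeddings, and the overall embedding is the base embedding plus a scaled, transformed mixture of edge embeddings. All matrices are tiny hypothetical values chosen so the arithmetic is easy to follow; in practice they are trained parameters.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, x):
    return [dot(row, x) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def combine(b_i, U_i, w_r, W_r, M_r, alpha_r):
    """Return (overall embedding, attention coefficients) for one vertex and relation.

    U_i is s-by-m (one column per relation type), W_r maps s -> hidden size,
    w_r has the hidden size, and M_r is s-by-d.
    """
    columns = transpose(U_i)  # m edge embeddings, each of length s
    scores = [dot(w_r, [math.tanh(h) for h in matvec(W_r, col)]) for col in columns]
    coeffs = softmax(scores)                      # length m
    mixed = matvec(U_i, coeffs)                   # U_i c, length s
    edge_part = matvec(transpose(M_r), mixed)     # M_r^T U_i c, length d
    return [b + alpha_r * e for b, e in zip(b_i, edge_part)], coeffs


# Hypothetical values: s = 2, m = 2 relation types, d = 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
U_i = [[1.0, 0.0], [0.0, 1.0]]
v_ir, coeffs = combine(b_i=[1.0, 2.0], U_i=U_i, w_r=[1.0, 1.0],
                       W_r=identity, M_r=identity, alpha_r=1.0)
```

With these symmetric inputs both relation types receive equal attention, so the edge contribution is simply the average of the two edge embeddings added to the base embedding.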
In the parameter optimization framework, GATNE integrates the base embedding and the edge embedding through random walks and the skip-gram model on the attributed multi-layer heterogeneous network [35,36]. Besides plain random walks, meta-path-based methods have also been commonly used in bioinformatics research in recent years. Meta-paths can be used to mine similarities and influences among network nodes. Based on these meta-paths, the similarity or weight between different nodes can be calculated to obtain more accurate recommendation results. At the same time, new relationships can also be discovered through meta-paths to improve the diversity and innovation of prediction models [37,38]. The meta-path-based random walk is used to generate vertex sequences from which the embeddings are learned. In detail, we suppose a graph $G_r = (V, E_r, A)$ and a meta-path scheme $T: V_1 \rightarrow V_2 \rightarrow \cdots V_t \cdots \rightarrow V_l$, where $l$ is the length of the meta-path scheme. The transition probability of the random walk is defined as:

$$p(v_j \mid v_i, T) = \begin{cases} \dfrac{1}{|N_{i,r} \cap V_{t+1}|}, & (v_i, v_j) \in E_r,\ v_j \in V_{t+1} \\ 0, & (v_i, v_j) \in E_r,\ v_j \notin V_{t+1} \\ 0, & (v_i, v_j) \notin E_r \end{cases} \tag{17}$$

where $v_i \in V_t$ and $N_{i,r}$ is the neighbourhood of vertex $v_i$ in relation type $r$. The meta-path-based random walk aims at capturing the semantic relationship between two different types of vertices so that it can be integrated by the skip-gram model. Finally, for a vertex $v_i$ and each context vertex $v_j \in C$, the objective function is defined as:

$$E = -\log \sigma\left(c_j^{T} \cdot v_{i,r}\right) - \sum_{l=1}^{L} \mathbb{E}_{v_k \sim P_t(v)}\left[\log \sigma\left(-c_k^{T} \cdot v_{i,r}\right)\right] \tag{18}$$

where $C$ is the context of vertex $v_i$ in the path $P = (v_1, \ldots, v_l)$ and $c_j$ is the context embedding of vertex $v_j$. $\sigma$ represents the sigmoid function and $L$ is the number of negative samples corresponding to each positive sample. Each $v_k$ is randomly drawn from a noise distribution $P_t(v)$ defined on the set of vertices corresponding to $v_i$.
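A meta-path-guided walk of the kind described above can be sketched as follows: at each step the walker moves uniformly at random to a neighbour whose type matches the next slot of the scheme. Treating the scheme as cyclic (miRNA, SM, miRNA, ...) is an assumption of this sketch, and the toy graph and node names are hypothetical.

```python
import random


def metapath_walk(start, neighbours, node_type, scheme, length, rng):
    """Generate one walk whose vertex types follow `scheme`, repeated cyclically."""
    walk = [start]
    step = 0
    while len(walk) < length:
        next_type = scheme[(step + 1) % len(scheme)]
        candidates = [v for v in neighbours[walk[-1]] if node_type[v] == next_type]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))  # uniform choice among valid neighbours
        step += 1
    return walk


# Toy bipartite graph of miRNAs (m*) and small molecules (s*); names are illustrative.
neighbours = {
    "m1": ["s1", "s2"],
    "m2": ["s1"],
    "s1": ["m1", "m2"],
    "s2": ["m1"],
}
node_type = {"m1": "miRNA", "m2": "miRNA", "s1": "SM", "s2": "SM"}

walk = metapath_walk("m1", neighbours, node_type,
                     scheme=["miRNA", "SM"], length=5, rng=random.Random(0))
```

The resulting sequences would then serve as the "sentences" for a skip-gram model, with vertices that co-occur within a window treated as positive context pairs.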

LightGBM
In this study, we use a machine learning method as the classifier. LightGBM is a machine learning algorithm based on the Gradient Boosting Decision Tree (GBDT) [39]. It is an efficient and fast gradient boosting framework developed by Microsoft. The LightGBM algorithm contains two novel techniques, namely Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), which allow it to handle a large number of data instances and a large number of data features, respectively, without overfitting problems [40]. LightGBM uses a histogram-based decision tree algorithm to discretize continuous features into discrete histogram bins, thereby reducing data storage space and computational complexity. LightGBM also uses a leaf-wise growth strategy, which selects the current optimal leaf node for splitting each time. This quickly finds the direction in which the loss function decreases the fastest, thus speeding up the training of the model.

MHESMMR
In this work, owing to the effective application of network embedding techniques in bioinformatics in the post-genomic era, we propose a novel computational method named MHESMMR to predict multiple regulatory relations between miRNAs and SMs. MHESMMR can be described in the following five steps: (1) use the dataset to construct a multi-layer heterogeneous network; (2) construct the self-similarity networks of SMs and miRNAs by the Tanimoto coefficient; (3) generate node features by applying the LINE algorithm to the miRNA self-similarity and SM self-similarity networks; (4) apply the GATNE algorithm to aggregate the behavior information from the attributed multi-layer heterogeneous network and learn representation features; (5) identify the probable SM modulators with the machine learning classifier, where the feature vector of each miRNA-SM pair is obtained by concatenating the representation features of the corresponding miRNA and SM. The flowchart of the MHESMMR model is shown in Fig. 2.
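The last step, building one feature vector per miRNA-SM pair by concatenation, can be sketched as below. The embedding values and identifiers are hypothetical placeholders for the vectors that the embedding stage would produce, and the classifier itself is not shown.

```python
def pair_features(mirna_emb, sm_emb, pairs):
    """Concatenate the miRNA vector and the SM vector for every (miRNA, SM) pair."""
    return [list(mirna_emb[m]) + list(sm_emb[s]) for m, s in pairs]


# Hypothetical 2-dimensional embeddings for illustration only.
mirna_emb = {"miRNA1": [0.1, 0.2], "miRNA2": [0.3, 0.4]}
sm_emb = {"SM1": [0.7, 0.8], "SM2": [0.5, 0.6]}

positive_pairs = [("miRNA1", "SM1")]
negative_pairs = [("miRNA2", "SM2")]

X = pair_features(mirna_emb, sm_emb, positive_pairs + negative_pairs)
y = [1] * len(positive_pairs) + [0] * len(negative_pairs)
```

The rows of `X` and the labels in `y` are in the form that a gradient boosting classifier expects as training input.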

Performance evaluation criterion
To validate the performance of the proposed model, we used a series of evaluation criteria, and fivefold cross-validation was adopted to ensure the rigor of the experiment. In detail, the positive samples and negative samples are equally divided into 5 folds. In each round of fivefold cross-validation, one of the folds is used as the testing set so that its prediction scores can be computed using the proposed method. These prediction scores reflect the possibility that an SM drug can regulate the expression of a miRNA. In our performance evaluation, if the positive samples in the test set have high prediction scores and the negative samples have low prediction scores, the proposed model performs well. Moreover, we monitored accuracy (Acc.), sensitivity (Sen.), specificity (Spec.) and the Matthews Correlation Coefficient (MCC) to comprehensively evaluate the proposed model as follows:

$$\begin{aligned} Acc. &= \frac{TP + TN}{TP + TN + FP + FN} \\ Sen. &= \frac{TP}{TP + FN} \\ Spec. &= \frac{TN}{TN + FP} \\ MCC &= \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \end{aligned} \tag{19}$$

where TP is the number of positive samples whose prediction score is higher than the threshold; FN is the number of positive samples whose prediction score is lower than the threshold; FP is the number of negative samples whose prediction score is higher than the threshold; and TN is the number of negative samples whose prediction score is lower than the threshold. To show the results more intuitively, we drew the receiver operating characteristic (ROC) curves and precision-recall (PR) curves. The area under the ROC curve (AUC) and the area under the PR curve (AUPR) were also used to evaluate model performance [41,42]. An AUC of 0.5 denotes a purely random prediction and an AUC of 1 denotes a perfect prediction.
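The threshold-based criteria named above follow the standard confusion-matrix definitions and can be computed as in this sketch. The scores, labels and the threshold of 0.5 are hypothetical; a score exactly equal to the threshold is counted as a negative prediction here, which is a choice of this sketch.

```python
import math


def confusion(scores, labels, threshold=0.5):
    """Count TP, FN, FP, TN; a sample is predicted positive when score > threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= threshold)
    return tp, fn, fp, tn


def metrics(tp, fn, fp, tn):
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)
    spec = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": acc, "sen": sen, "spec": spec, "mcc": mcc}


# Hypothetical prediction scores for three positive and three negative test pairs.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

counts = confusion(scores, labels)
result = metrics(*counts)
```

In a fivefold setting these values would be computed once per test fold and then averaged, with the standard deviation reported alongside.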

Sensitivity analysis on parameters
To obtain the best prediction performance, we performed a sensitivity analysis on two hyper-parameters of the GATNE algorithm: the base embedding dimension and the edge embedding dimension. Figure 3 shows a line chart of the average AUC values obtained with the LightGBM classifier for different base embedding dimensions and edge embedding dimensions. It can be observed that the best results on both datasets are obtained when the base embedding dimension is set to 128. The proposed model gets the best results when the edge embedding dimension is 64 for Dataset1 and 32 for Dataset2.
In addition, an additional experiment was carried out to test the effectiveness of the node attribute features generated from the self-similarity networks. Specifically, we removed the node attribute features and used the transductive mode to generate node features relying only on the network structure. As expected, without any node attribute features, the MHESMMR model yielded average AUCs of 0.937 and 0.9509 on Dataset1 and Dataset2, which are lower than those obtained with attribute feature inputs. Figure 4 displays the ROC curves for this experiment. These results indicate that combining the attribute features with the graph topology features improves the prediction performance.

Assessment of prediction ability
To evaluate the prediction ability of the MHESMMR model while avoiding overfitting, we conducted fivefold cross-validation experiments on the two datasets. To maintain consistency, all parameters were kept the same across these experiments. On Dataset1, we achieved average Acc., Prec., Sen., MCC, AUC and AUPR values of 90.55%, 92.73%, 89.25%, 82.10%, 0.9624 and 0.9607, with standard deviations of 0.91%, 1.65%, 2.03%, 1.8%, 0.0065 and 0.0050, respectively. On Dataset2, we obtained average values of 90.97%, 92.74%, 85.28%, 79.98%, 0.9622 and 0.9605, with standard deviations of 1.51%, 1.76%, 3.25%, 2.94%, 0.0099 and 0.1102, respectively. The fivefold cross-validation results of the proposed model on the two datasets are summarized in Tables 1 and 2. The ROC and PR curves of the fivefold cross-validation experiment are shown in Figs. 5 and 6. All these results indicate a reliable predictive ability of our model.

Ablation experiments
In the MHESMMR model, feature construction can be divided into two modules: node attribute feature construction and graph embedding feature construction. In the ablation experiments, we examine which part of the model contributes the most to the final performance. We constructed two prediction models, each using only one kind of feature construction method. The first model, MHESMMR(G), only uses the GATNE algorithm to construct features, with the initial features of the attributed heterogeneous network set to unit vectors. The second model, MHESMMR(A), feeds the extracted node attribute features directly into the classifier to obtain prediction results. To ensure the fairness of the experiment, the same parameters and datasets were used in all experiments. The experimental results are recorded in Table 3. For convenience, Fig. 7 compares the results of the ablation experiments with those of the full model. Figure 7 shows that the best prediction results are achieved by combining the two modules. Of the two reduced models, MHESMMR(G) predicts better than MHESMMR(A), which suggests that the GATNE algorithm makes the greater contribution to the overall model.

Performance by different classifiers
Machine learning algorithms are widely used in molecular interaction prediction models [43]. To assess our classification strategy, we selected a number of classic algorithms commonly used in bioinformatics to replace our classifier and compared the results. In the experiment, we used several popular machine learning algorithms to construct the prediction model, including Logistic Regression (LR), Naive Bayes (NB), Support Vector Machines (SVM), Random Forest (RF) and LightGBM [44][45][46][47][48]. The fivefold cross-validation performance of these models on Dataset1 is shown in Figs. 8 and 9.

Method comparison
To our knowledge, the only other computational model that can predict the two types of regulatory relationship (up-regulated or down-regulated) between miRNAs and SMs is the PSRR model, so we compared MHESMMR with it (Table 4). The comparison indicates that MHESMMR could be an accurate and efficient computational model for predicting the regulation of miRNA expression by SMs on a large scale.

Case study
To validate the performance of the MHESMMR model in predicting the potential regulation of miRNA expression by SMs, we conducted a case study identifying miRNA targets of a specific drug. 5-FU (CID 3385) was selected as the designated drug for this case study. 5-FU is a common chemotherapy drug for cancer [50]. It can inhibit the proliferation of cancer cells by changing the metabolism of RNA and DNA to reduce the synthesis of specific proteins [51][52][53][54]. Therefore, we used Dataset2 to construct the down-regulation pair prediction model. We removed all relation pairs between 5-FU and miRNAs from Dataset2 and then trained the MHESMMR model on the remaining SM-miRNA relation pairs. The prediction results are shown in Table 5. According to Table 5, among the potential 5-FU-related miRNAs with the top 10 highest prediction scores, seven were confirmed by the PubMed literature to be inhibited by 5-FU.
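The case-study protocol described above amounts to holding out every known pair of the chosen drug before training and then ranking all miRNAs by their predicted score for that drug. A minimal sketch of the two bookkeeping steps follows; the pair list, the miRNA names and the scores are hypothetical placeholders, not results from the study, and the model that would produce the scores is not shown.

```python
def hold_out_drug(pairs, drug):
    """Remove every (miRNA, SM) pair involving `drug` from the training pairs."""
    return [(mirna, sm) for mirna, sm in pairs if sm != drug]


def top_k(scores_for_drug, k=10):
    """Return the k miRNAs with the highest predicted scores, best first."""
    return sorted(scores_for_drug.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical training pairs and placeholder identifiers.
pairs = [("miR-A", "5-FU"), ("miR-B", "SM-X"), ("miR-C", "5-FU"), ("miR-A", "SM-Y")]
training_pairs = hold_out_drug(pairs, "5-FU")

# Hypothetical scores that a trained model might assign to each miRNA for the drug.
scores_for_drug = {"miR-A": 0.91, "miR-B": 0.15, "miR-C": 0.78, "miR-D": 0.40}
ranking = top_k(scores_for_drug, k=3)
```

The top-ranked candidates are then checked against the literature, which is how the count of confirmed miRNAs in the case study is obtained.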

Conclusion
It is well known that the abnormal expression of miRNAs plays an important role in various pathological processes. Through SM drugs, oncomiRs can be down-regulated and TSmiRs can be up-regulated. Therefore, an efficient model for predicting miRNA-drug regulatory relationships is needed. In this study, we developed an innovative computational method, named MHESMMR, for predicting the regulation relations between miRNAs and SMs based on graph embedding and machine learning. It combines the LINE algorithm, the GATNE algorithm and the LightGBM method, and it shows the usefulness of non-linear relationships in identifying potential miRNA-SM associations. To evaluate the performance of the proposed model, we divided the up-regulation pairs and down-regulation pairs in the SM2miR dataset into Dataset1 and Dataset2 and performed tests on these two datasets under fivefold cross-validation. The MHESMMR model yielded average accuracies of 79.59% and 80.37% for Dataset1 and Dataset2, respectively. In addition, we compared the proposed model with another existing model to verify its predictive ability. We also compared the LightGBM method with other classical machine learning classifiers. The experimental results demonstrate that the MHESMMR model is a valuable tool for predicting miRNA-SM regulation relations. In the future, we intend to search for more effective feature extraction methods and develop diverse feature descriptors to construct better prediction models.

Fig. 2 Framework of the MHESMMR model to predict miRNA-SM regulatory relations

Fig. 4 Difference of prediction performance using MHESMMR model with/without attribute feature input on Dataset1 (a) and Dataset2 (b)

Table 1 Fivefold cross-validation performance for Dataset1

Table 2 Fivefold cross-validation performance for Dataset2

Table 3 Ablation experiment results on Dataset1 and Dataset2

As shown in Table 4, the AUC values of the MHESMMR model are 8.72% higher than those of the PSRR model for up-regulation pairs and 9.04% higher for down-regulation pairs. The AUPR values of the MHESMMR model are 10.03% higher than those of the PSRR model for up-regulation pairs and 8.74% higher for down-regulation pairs. These results indicate that the MHESMMR model benefits from attributed multiplex heterogeneous network embedding.

Table 4 Comparison of experimental results of MHESMMR and PSRR

Table 5 The top 10 predicted miRNAs associated with 5-FU