Multi-resBind: a residual network-based multi-label classifier for in vivo RNA binding prediction and preference visualization

Table 1 Performance of the Multi-resBind model with various types of input data

Data types	Data dimensions	Mean AUROC	Mean AP
Sequence	(150,4)	0.8809	0.3372
Structure	(150,2)	0.6987	0.1189
Region	(150,4)	0.6710	0.0912
Sequence and structure	(150,6)	0.8843	0.3521
Sequence and region	(150,8)	0.8976	0.3808
Structure and region	(150,6)	0.7602	0.1602
Sequence, structure and region	(150,10)	0.8957	0.3714

Evaluation experiments were performed on the held-out test set of the 27 RBPs low dataset, using individual input features and their combinations. Sequence denotes one-hot encoded nucleotides (A, U, G, or C); structure denotes paired or unpaired structure profiles predicted by a modified RNAplfold script; and region denotes one-hot encoded region types (3′ UTR, 5′ UTR, CDS, or intron) of the corresponding sequence. Mean AUROC and mean AP are the averages of the AUROC and AP scores over the 27 RBPs in the low dataset. Bold type in the original table marks the maximum value under each metric; both maxima (mean AUROC 0.8976, mean AP 0.3808) are obtained with sequence and region.
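The data dimensions in Table 1 follow from stacking per-position feature channels: 4 for sequence, 2 for structure, and 4 for region, giving 10 channels over 150 positions. The following is a minimal sketch of that bookkeeping, not the authors' preprocessing code. The helper names, the channel order, the handling of unknown symbols, and the representation of structure as a (paired, unpaired) probability pair are all assumptions made for illustration; the structure values are placeholders rather than RNAplfold output.

```python
# Illustrative sketch only: helper names and channel order are hypothetical.
BASES = "AUGC"
REGIONS = ("3'UTR", "5'UTR", "CDS", "intron")


def one_hot(symbols, alphabet):
    """Return one row per position with 1.0 in the column of the matching symbol."""
    index = {s: i for i, s in enumerate(alphabet)}
    rows = []
    for s in symbols:
        row = [0.0] * len(alphabet)
        if s in index:  # assumption: unknown symbols stay all-zero
            row[index[s]] = 1.0
        rows.append(row)
    return rows


def structure_channels(unpaired_prob):
    """Assumed layout: two channels per position, (paired, unpaired)."""
    return [[1.0 - p, p] for p in unpaired_prob]


def combine(*features):
    """Concatenate per-position feature rows along the channel axis."""
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must cover the same positions")
    return [sum((f[i] for f in features), []) for i in range(length)]


def shape(matrix):
    """Return (positions, channels) for a list-of-rows matrix."""
    return (len(matrix), len(matrix[0]))


# Toy 150-nt input; none of these values come from the paper's data.
seq = "AUGC" * 37 + "AU"
regions = ["5'UTR"] * 50 + ["CDS"] * 100
unpaired = [0.5] * 150  # placeholder probabilities

x = combine(
    one_hot(seq, BASES),
    structure_channels(unpaired),
    one_hot(regions, REGIONS),
)
print(shape(x))
```

Dropping one argument from `combine` reproduces the two-feature rows of the table, for example sequence and region alone give 8 channels.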

ISSN: 1471-2105