Table 6. Training hyperparameters

From: AttentionDDI: Siamese attention-based deep learning method for drug–drug interaction predictions

| Hyperparameter | DS1 | DS2 | DS3 CYP | DS3 NCYP |
|---|---|---|---|---|
| # Attention heads (H) | 2 | 2 | 4 | 2 |
| # Transformer units (E) | 1 | 1 | 1 | 1 |
| Dropout | 0.3 | 0.3 | 0.45 | 0.3 |
| MLP embed factor (\(\xi\)) | 2 | 2 | 2 | 2 |
| Pooling mode | attn | attn | attn | attn |
| Distance | cosine | cosine | cosine | cosine |
| Weight decay | \(10^{-6}\) | \(10^{-6}\) | \(10^{-8}\) | \(10^{-6}\) |
| Batch size | 1000 | 1000 | 400 | 1000 |
| # Epochs | 100 | 100 | 200 | 100 |
| \(\gamma\) | 0.05 | 0.05 | 0.05 | 0.05 |
| \(\mu\) | 1 | 1 | 1 | 1 |
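
For readers who want the table in machine-readable form, the settings can be written as a small configuration object. This is a minimal sketch, not the authors' code: the class name `TrainingConfig`, every field name, the dataset keys, and the helper `overrides` are hypothetical and chosen only for illustration. The weight-decay entries are read as scientific notation (1e-6 and 1e-8), which is an assumption about the printed values. The roles of \(\gamma\) and \(\mu\) are defined in the article text and are stored here as plain numbers.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the Table 6 values (field names are illustrative)."""
    num_attn_heads: int = 2          # H
    num_transformer_units: int = 1   # E
    dropout: float = 0.3
    mlp_embed_factor: int = 2        # xi
    pooling_mode: str = "attn"
    distance: str = "cosine"
    weight_decay: float = 1e-6       # assumption: table entry read as 1e-6
    batch_size: int = 1000
    num_epochs: int = 100
    gamma: float = 0.05
    mu: float = 1.0


# DS1, DS2 and DS3 NCYP share one column of values in the table.
BASE = TrainingConfig()

CONFIGS = {
    "DS1": BASE,
    "DS2": BASE,
    # DS3 CYP is the only column that departs from the shared settings.
    "DS3_CYP": replace(
        BASE,
        num_attn_heads=4,
        dropout=0.45,
        weight_decay=1e-8,           # assumption: table entry read as 1e-8
        batch_size=400,
        num_epochs=200,
    ),
    "DS3_NCYP": BASE,
}


def overrides(name: str) -> dict:
    """Return the fields where the named config differs from BASE."""
    base = asdict(BASE)
    return {k: v for k, v in asdict(CONFIGS[name]).items() if base[k] != v}


if __name__ == "__main__":
    for name in CONFIGS:
        print(name, overrides(name))
```

Writing DS3 CYP as a set of overrides on a shared base makes the table's main pattern explicit: three columns are identical, and one differs in five settings.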