Table 5 Summary of parameters used in fine-tuning

From: Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT

Summary of all parameters used in fine-tuning:

Optimizer used: AdamW
Training batch size: 32
Checkpoint saved (interval in steps): 500
Learning rate: 0.00001
Training steps: 10,000
Warm-up steps: 320
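
For readers who want to reproduce a comparable setup, the sketch below maps these values onto a Hugging Face Trainer configuration. It is a minimal illustration under stated assumptions, not the authors' exact pipeline: the checkpoint name albert-base-v2, the two-label classification head, and the output directory are placeholders, and the task-specific, pre-tokenized dataset is left to the reader.

```python
# Minimal sketch: mapping the Table 5 fine-tuning parameters onto a
# Hugging Face Trainer configuration. Model name and label count are
# placeholders, not the paper's released domain-specific ALBERT weights.
from transformers import (
    AlbertForSequenceClassification,
    AlbertTokenizerFast,
    Trainer,
    TrainingArguments,
)

model_name = "albert-base-v2"  # placeholder checkpoint
model = AlbertForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = AlbertTokenizerFast.from_pretrained(model_name)

args = TrainingArguments(
    output_dir="finetune-out",
    optim="adamw_torch",             # optimizer: AdamW
    per_device_train_batch_size=32,  # training batch size: 32
    learning_rate=1e-5,              # learning rate: 0.00001
    max_steps=10_000,                # training steps: 10k
    warmup_steps=320,                # warm-up steps: 320
    save_strategy="steps",
    save_steps=500,                  # checkpoint saved every 500 steps
)

# With a task-specific, pre-tokenized `train_dataset` in place,
# fine-tuning would then run as:
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_dataset, tokenizer=tokenizer)
# trainer.train()
```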