Parameter | Tuned range | Optimal value |
---|---|---|
Max sequence length | 128 | 128 |
Training batch size | [16, 32, 64] | 64 |
Development batch size | 8 | 8 |
Test batch size | 8 | 8 |
Training epochs | 50 | 50 |
Warmup proportion | 0.1 | 0.1 |
Classifier dropout rate | [0.0, 0.05, 0.1] | 0.0 |
GPNN layers | [1, 2] | 1 |
GPNN input neighbors | [16, 20, 24, 28, 32] | 32 |
GPNN output neighbors | [4, 8, 16] | 4 |
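
As a minimal sketch, the search space in the table can be written down as a plain Python dictionary and enumerated. The key names (`train_batch_size`, `gpnn_input_neighbors`, and so on) and the helper `iter_configs` are hypothetical and chosen only for illustration; the table does not say whether an exhaustive grid search or some other strategy was used, so the full Cartesian product here is an assumption.

```python
from itertools import product

# Search space transcribed from the table above.
# Key names are hypothetical; single-value entries were fixed, not tuned.
SEARCH_SPACE = {
    "max_seq_length": [128],
    "train_batch_size": [16, 32, 64],
    "dev_batch_size": [8],
    "test_batch_size": [8],
    "num_train_epochs": [50],
    "warmup_proportion": [0.1],
    "classifier_dropout": [0.0, 0.05, 0.1],
    "gpnn_layers": [1, 2],
    "gpnn_input_neighbors": [16, 20, 24, 28, 32],
    "gpnn_output_neighbors": [4, 8, 16],
}

# Values from the last column of the table.
OPTIMAL = {
    "max_seq_length": 128,
    "train_batch_size": 64,
    "dev_batch_size": 8,
    "test_batch_size": 8,
    "num_train_epochs": 50,
    "warmup_proportion": 0.1,
    "classifier_dropout": 0.0,
    "gpnn_layers": 1,
    "gpnn_input_neighbors": 32,
    "gpnn_output_neighbors": 4,
}


def iter_configs(space):
    """Yield one dict per point in the Cartesian product of the search space."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


# Assumption: an exhaustive grid over all tuned parameters.
configs = list(iter_configs(SEARCH_SPACE))
```

Under the exhaustive-grid assumption, only five parameters actually vary, so the grid has 3 × 3 × 2 × 5 × 3 = 270 candidate configurations, and the reported optimum is one of them.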