mars_300m_8 / README.md

Model Card

Best configuration

| Hyperparameter | Value |
| --- | --- |
| beta1 | 0.98 |
| beta2 | 0.99 |
| epsilon | 9.999999999999999e-26 |
| gamma | 0.05 |
| learning_rate | 0.008 |
| max_grad_norm | 1 |
| min_lr_ratio | 0 |
| train_batch_size | 256 |
| warmup | 1000 |
| weight_decay | 0.1 |
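
For readers who want to reuse these values in a script, the listing above can be collected into a small Python object. This is a minimal sketch, not the training code behind this model: the `BestConfig` class and the `warmup_lr` and `final_lr` helpers are hypothetical names, and the sketch assumes that `warmup` counts optimizer steps, that warmup is linear from zero, and that `min_lr_ratio` scales the peak `learning_rate`. None of these assumptions is stated in the card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BestConfig:
    """Values copied from the card; the class itself is a hypothetical container."""
    beta1: float = 0.98
    beta2: float = 0.99
    epsilon: float = 9.999999999999999e-26
    gamma: float = 0.05
    learning_rate: float = 0.008
    max_grad_norm: float = 1.0
    min_lr_ratio: float = 0.0
    train_batch_size: int = 256
    warmup: int = 1000
    weight_decay: float = 0.1


def warmup_lr(step: int, cfg: BestConfig) -> float:
    """Assumed linear warmup: ramp from 0 to the peak rate over cfg.warmup steps."""
    if step >= cfg.warmup:
        return cfg.learning_rate
    return cfg.learning_rate * step / cfg.warmup


def final_lr(cfg: BestConfig) -> float:
    """Assumed learning-rate floor: min_lr_ratio times the peak rate."""
    return cfg.learning_rate * cfg.min_lr_ratio


if __name__ == "__main__":
    cfg = BestConfig()
    for step in (0, 500, 1000):
        print(step, warmup_lr(step, cfg))
    print("floor:", final_lr(cfg))
```

Under these assumptions, a `min_lr_ratio` of 0 means the schedule is allowed to decay all the way to zero. The shape of the decay after warmup is not given in the card, so the sketch leaves it out.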