## Chat RL

timestamp: 2025-10-14 07:06:07

- run:
- source: sft
- dtype: bfloat16
- device_batch_size: 8
- examples_per_step: 16
- num_samples: 16
- max_new_tokens: 256
- temperature: 1.0000
- top_k: 50
- unembedding_lr: 0.0040
- embedding_lr: 0.2000
- matrix_lr: 0.0200
- weight_decay: 0.0000
- init_lr_frac: 0.0500
- num_epochs: 1
- save_every: 60
- eval_every: 60
- eval_examples: 400
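
The settings above imply a few per-step totals that are easy to misread, so here is a minimal sketch that derives them. It rests on assumptions the log itself does not confirm: that each of the `examples_per_step` prompts is sampled `num_samples` times per step, that `device_batch_size` caps the number of sequences in one forward pass on a single device, and that `init_lr_frac` multiplies each base learning rate at the start of training. The helper `derived_quantities` is hypothetical and is not part of the training script.

```python
# Values copied from the Chat RL settings listed above.
config = {
    "device_batch_size": 8,
    "examples_per_step": 16,
    "num_samples": 16,
    "max_new_tokens": 256,
    "unembedding_lr": 0.004,
    "embedding_lr": 0.2,
    "matrix_lr": 0.02,
    "init_lr_frac": 0.05,
}


def derived_quantities(cfg):
    """Hypothetical helper: per-step totals under the stated assumptions."""
    # Assumption: every prompt in a step is sampled num_samples times.
    rollouts_per_step = cfg["examples_per_step"] * cfg["num_samples"]

    # Upper bound only: a sample may stop before reaching max_new_tokens.
    max_generated_tokens = rollouts_per_step * cfg["max_new_tokens"]

    # Assumption: device_batch_size caps sequences per forward pass,
    # counted here for a single device (ceiling division).
    forward_passes = -(-rollouts_per_step // cfg["device_batch_size"])

    # Assumption: init_lr_frac scales each base learning rate at step 0.
    lr_keys = ("unembedding_lr", "embedding_lr", "matrix_lr")
    initial_lrs = {k: cfg[k] * cfg["init_lr_frac"] for k in lr_keys}

    return {
        "rollouts_per_step": rollouts_per_step,
        "max_generated_tokens": max_generated_tokens,
        "forward_passes": forward_passes,
        "initial_lrs": initial_lrs,
    }


if __name__ == "__main__":
    for name, value in derived_quantities(config).items():
        print(f"{name}: {value}")
```

Under these assumptions a step covers 256 rollouts and at most 65,536 generated tokens. If the actual script shards prompts across devices or applies `init_lr_frac` differently, the forward-pass count and initial learning rates will not match.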