
Chat SFT

timestamp: 2025-10-14 02:27:42

  • run: d0
  • source: mid
  • dtype: bfloat16
  • device_batch_size: 4
  • num_epochs: 1
  • max_iterations: -1
  • target_examples_per_step: 32
  • unembedding_lr: 0.0040
  • embedding_lr: 0.2000
  • matrix_lr: 0.0200
  • weight_decay: 0.0000
  • init_lr_frac: 0.0200
  • eval_every: 100
  • eval_steps: 100
  • eval_metrics_every: 200
  • Training rows: 20,843
  • Number of iterations: 651
  • Training loss: 1.2206
  • Validation loss: 1.0725