Se124M100KInfDelimiter

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4823
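
For a causal language model, an evaluation loss is often easier to read as a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for this kind of training setup but is not stated in the card.

```python
import math

# Final evaluation loss as reported in this card.
EVAL_LOSS = 0.4823

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(EVAL_LOSS)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 1.62.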

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
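
As a rough illustration of what the linear scheduler implies for the learning rate over the run, here is a minimal pure-Python sketch. The total step count (104500) is taken from the last row of the training results table. The warmup length is an assumption: the card lists no warmup setting, so the sketch uses zero. `linear_lr` is a hypothetical helper name, not part of any library.

```python
# Peak learning rate, copied from the hyperparameter list.
LEARNING_RATE = 5e-05
# Total optimizer steps, taken from the final row of the training results.
TOTAL_STEPS = 104500
# Assumption: no warmup, since the card does not list any warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * (step / max(1, WARMUP_STEPS))
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * (remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


# Print the rate at the start, the midpoint, and the end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

With these assumptions the rate starts at 5e-05, is halved by the midpoint of training, and reaches zero at the final step.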

Training results

Training Loss  Epoch  Step    Validation Loss
0.1538         1.0    2090    0.5678
0.1412         2.0    4180    0.5413
0.1405         3.0    6270    0.5325
0.1355         4.0    8360    0.5241
0.1364         5.0    10450   0.5211
0.1341         6.0    12540   0.5170
0.1312         7.0    14630   0.5123
0.1304         8.0    16720   0.5078
0.1301         9.0    18810   0.5064
0.1286         10.0   20900   0.5058
0.1308         11.0   22990   0.5022
0.1292         12.0   25080   0.5007
0.1287         13.0   27170   0.5005
0.1306         14.0   29260   0.4976
0.1312         15.0   31350   0.4975
0.1268         16.0   33440   0.4963
0.1267         17.0   35530   0.4944
0.1273         18.0   37620   0.4932
0.1243         19.0   39710   0.4925
0.1266         20.0   41800   0.4912
0.127          21.0   43890   0.4914
0.1278         22.0   45980   0.4905
0.1276         23.0   48070   0.4899
0.1285         24.0   50160   0.4888
0.1264         25.0   52250   0.4889
0.1256         26.0   54340   0.4881
0.1251         27.0   56430   0.4876
0.1291         28.0   58520   0.4869
0.1254         29.0   60610   0.4867
0.1268         30.0   62700   0.4863
0.1247         31.0   64790   0.4857
0.126          32.0   66880   0.4855
0.1262         33.0   68970   0.4852
0.1257         34.0   71060   0.4848
0.1246         35.0   73150   0.4846
0.1261         36.0   75240   0.4839
0.1269         37.0   77330   0.4839
0.1244         38.0   79420   0.4836
0.1243         39.0   81510   0.4836
0.1256         40.0   83600   0.4834
0.1237         41.0   85690   0.4827
0.1244         42.0   87780   0.4833
0.1234         43.0   89870   0.4828
0.1255         44.0   91960   0.4824
0.1272         45.0   94050   0.4826
0.1258         46.0   96140   0.4824
0.1264         47.0   98230   0.4825
0.1236         48.0   100320  0.4824
0.1254         49.0   102410  0.4825
0.1242         50.0   104500  0.4823
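
A few quantities can be derived from the results above with simple arithmetic. The sketch below uses only numbers from this card. The per-epoch example count is an upper bound and assumes a single device with no gradient accumulation, neither of which the card confirms.

```python
# Values copied from the training results and the hyperparameter list.
FINAL_STEP = 104500
NUM_EPOCHS = 50
TRAIN_BATCH_SIZE = 32
FIRST_VAL_LOSS = 0.5678   # validation loss after epoch 1
FINAL_VAL_LOSS = 0.4823   # validation loss after epoch 50

# Optimizer steps per epoch (matches the step count logged at epoch 1).
steps_per_epoch = FINAL_STEP // NUM_EPOCHS

# Assumption: one device and no gradient accumulation, so each step
# consumes at most TRAIN_BATCH_SIZE examples.
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

# Relative drop in validation loss between the first and last epoch.
relative_improvement = (FIRST_VAL_LOSS - FINAL_VAL_LOSS) / FIRST_VAL_LOSS

print(steps_per_epoch, max_examples_per_epoch, f"{relative_improvement:.1%}")
```

Under these assumptions each epoch covers at most 66,880 training examples, and the validation loss falls by about 15% over the 50 epochs, with most of the improvement in the first 20.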

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
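
Since PEFT appears in the framework versions and the Hub lists this model as an adapter, loading it would typically mean attaching the adapter to the gpt2 base model. The following is a minimal sketch under those assumptions, not a verified recipe: the repository id is taken from the Hub listing for this card, `load_adapter_model` and `generate_text` are hypothetical helper names, and the sketch assumes the adapter was trained against the unmodified gpt2 vocabulary (if tokens were added during training, the base embeddings would need resizing first).

```python
BASE_MODEL_ID = "gpt2"
# Repository id as shown in the Hub listing for this card.
ADAPTER_ID = "augustocsc/Se124M100KInfDelimiter"


def load_adapter_model():
    """Load gpt2 and attach the PEFT adapter (downloads weights on first use)."""
    # Imported lazily so this file can be read or imported without the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the stock gpt2 tokenizer matches the one used in training.
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def generate_text(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a continuation for `prompt` with the adapted model."""
    tokenizer, model = load_adapter_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The card does not describe the expected prompt format, so any prompt passed to `generate_text` is a guess until the model description is filled in.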

Model tree for augustocsc/Se124M100KInfDelimiter: this model is listed as an adapter.