bert-large-cased-imdb-mlm-2.0 (MLM)
This model is a fine-tuned version of google-bert/bert-large-cased on the imdb dataset. It achieves the following results on the evaluation set:
• Eval loss: 3.0162
• Perplexity (PPL): 42.7959
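Since this is a masked language model, it can be queried directly with the transformers fill-mask pipeline. A minimal usage sketch, assuming the repo id from this card; the example sentence is a hypothetical IMDB-style prompt:

```python
from transformers import pipeline

# Load the fine-tuned MLM from the Hub (repo id taken from this card).
fill_mask = pipeline(
    "fill-mask",
    model="Keyurjotaniya007/bert-large-cased-imdb-mlm-2.0",
)

# BERT uses [MASK] as its mask token; print the top predictions.
for pred in fill_mask("This movie was absolutely [MASK]."):
    print(pred["token_str"], pred["score"])
```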
Training hyperparameters
The following hyperparameters were used during training:
• learning_rate: 2e-5
• train_batch_size: 16
• gradient_accumulation_steps: 2
• seed: 42
• weight_decay: 0.01
• lr_scheduler_type: cosine
• warmup_ratio: 0.03
• num_epochs: 2
• fp16: True
• max_grad_norm: 1.0
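These hyperparameters map directly onto transformers TrainingArguments. A minimal sketch of the equivalent configuration; output_dir is a hypothetical placeholder, and the Trainer, dataset, and data-collator setup are omitted:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-large-cased-imdb-mlm",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,  # effective batch size of 32
    seed=42,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2,
    fp16=True,
    max_grad_norm=1.0,
)
```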