jerryzh168 committed
Commit d2d3073 · verified · 1 Parent(s): 72d7c17

Update README.md

Files changed (1):
  1. README.md (+19 -4)
README.md CHANGED
@@ -185,6 +185,7 @@ and use a token with write access, from https://huggingface.co/settings/tokens
 
 # Model Quality
 We rely on [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) to evaluate the quality of the quantized model. Here we only run mmlu as a sanity check.
+We also rely on [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval/) for multi-modal quality evaluation. We only tested chartqa as a sanity check.
 
 | Benchmark | | |
 |----------------------------------|----------------|---------------------------|
@@ -196,18 +197,32 @@ We rely on [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-h
 <details>
 <summary> Reproduce Model Quality Results </summary>
 
+## language eval
 Need to install lm-eval from source:
 https://github.com/EleutherAI/lm-evaluation-harness#install
 
-## baseline
 ```Shell
-lm_eval --model hf --model_args pretrained=google/gemma-3-12b-it --tasks mmlu --device cuda:0 --batch_size 8
+export MODEL=google/gemma-3-12b-it # or pytorch/gemma-3-12b-it-FP8
+lm_eval --model hf --model_args pretrained=$MODEL --tasks mmlu --device cuda:0 --batch_size 8
 ```
 
-## FP8
+## multi-modal eval
+Need to install lmms-eval from source:
+`pip install git+https://github.com/EvolvingLMMs-Lab/lmms-eval.git`
+
 ```Shell
-export MODEL=pytorch/gemma-3-12b-it-FP8
-lm_eval --model hf --model_args pretrained=$MODEL --tasks mmlu --device cuda:0 --batch_size 8
+NUM_PROCESSES=8
+MAIN_PORT=12345
+MODEL_ID=google/gemma-3-12b-it # or pytorch/gemma-3-12b-it-FP8
+TASKS=chartqa # or other tasks from https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/main/lmms_eval/tasks
+BATCH_SIZE=32
+OUTPUT_PATH=./logs/
+
+accelerate launch --num_processes "${NUM_PROCESSES}" --main_process_port "${MAIN_PORT}" -m lmms_eval \
+    --model gemma3 \
+    --model_args "pretrained=${MODEL_ID}" \
+    --tasks "${TASKS}" \
+    --batch_size "${BATCH_SIZE}" --output_path "${OUTPUT_PATH}"
 ```
 </details>
 
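
For the language eval, a minimal end-to-end sketch of reproducing the mmlu number on the quantized checkpoint, assuming the source install documented in the lm-evaluation-harness README and the `pytorch/gemma-3-12b-it-FP8` checkpoint named in this README:

```Shell
# Install lm-eval from source, per the lm-evaluation-harness README
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .

# Score the FP8 checkpoint on mmlu; use google/gemma-3-12b-it for the baseline row
export MODEL=pytorch/gemma-3-12b-it-FP8
lm_eval --model hf --model_args pretrained=$MODEL --tasks mmlu --device cuda:0 --batch_size 8
```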
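Likewise, the multi-modal eval can be pointed at the quantized checkpoint by swapping the model id; a sketch with the README's values filled in, assuming lmms-eval's `gemma3` model type also loads the FP8 checkpoint (not verified here):

```Shell
# Install lmms-eval from source
pip install git+https://github.com/EvolvingLMMs-Lab/lmms-eval.git

# chartqa sanity check on the FP8 checkpoint, 8 processes via accelerate
accelerate launch --num_processes 8 --main_process_port 12345 -m lmms_eval \
    --model gemma3 \
    --model_args "pretrained=pytorch/gemma-3-12b-it-FP8" \
    --tasks chartqa \
    --batch_size 32 --output_path ./logs/
```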