Advancing Vehicle Plate Recognition: Multitasking Visual Language Models with VehiclePaliGemma
This model is a fine-tuned version of google/paligemma-3b-pt-224 on the Malaysian license plate dataset.
from PIL import Image
import torch
from transformers import PaliGemmaProcessor, PaliGemmaForConditionalGeneration

# Load the fine-tuned model and the processor of the base model.
model = PaliGemmaForConditionalGeneration.from_pretrained('NYUAD-ComNets/VehiclePaliGemma', torch_dtype=torch.bfloat16)
processor = PaliGemmaProcessor.from_pretrained("google/paligemma-3b-pt-224")
input_text = "extract the text from the image"

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# image_path must be set to the path of the plate image before running.
input_image = Image.open(image_path)
inputs = processor(text=input_text, images=input_image, padding="longest", do_convert_rgb=True, return_tensors="pt").to(device)
# Cast the floating-point inputs (pixel values) to the model's dtype.
inputs = inputs.to(dtype=model.dtype)

with torch.no_grad():
    output = model.generate(**inputs, max_length=500)

# The decoded output starts with the prompt; slice it off to keep only the plate text.
result = processor.decode(output[0], skip_special_tokens=True)[len(input_text):].strip()
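The snippet above removes the echoed prompt by slicing off `len(input_text)` characters, which assumes the decoded string always begins with the prompt. A slightly more defensive version is sketched below. The helper names `strip_prompt` and `normalize_plate` are hypothetical and not part of the model card or of `transformers`. The uppercase and whitespace normalization is an assumed post-processing step, not something the authors describe.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return the generated text with the echoed prompt removed, if present.

    Hypothetical helper: only slices when the decoded string really
    starts with the prompt, otherwise returns the text unchanged.
    """
    text = decoded.strip()
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


def normalize_plate(text: str) -> str:
    """Uppercase the text and collapse runs of whitespace to single spaces.

    Assumed post-processing for comparing predictions; adjust or drop it
    if your plate format is case- or spacing-sensitive.
    """
    return " ".join(text.upper().split())
```

With the variables from the snippet above, this could be used as `result = normalize_plate(strip_prompt(processor.decode(output[0], skip_special_tokens=True), input_text))`.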
The following hyperparameters were used during training:
@article{aldahoul2025multitasking,
title={Multitasking vision language models for vehicle plate recognition with VehiclePaliGemma},
author={AlDahoul, Nouar and Tan, Myles Joshua Toledo and Tera, Raghava Reddy and Abdul Karim, Hezerul and Lim, Chee How and Mishra, Manish Kumar and Zaki, Yasir},
journal={Scientific Reports},
volume={15},
number={1},
pages={1--15},
year={2025},
publisher={Nature Publishing Group}
}
@misc{aldahoul2024advancingvehicleplaterecognition,
title={Advancing Vehicle Plate Recognition: Multitasking Visual Language Models with VehiclePaliGemma},
author={Nouar AlDahoul and Myles Joshua Toledo Tan and Raghava Reddy Tera and Hezerul Abdul Karim and Chee How Lim and Manish Kumar Mishra and Yasir Zaki},
year={2024},
eprint={2412.14197},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.14197},
}