English

Qwen2-VL-7B-Instruct-Traffic

Qwen2-VL-7B-Instruct-Traffic is a multimodal model fine-tuned on the MITS (Multimodal Intelligent Traffic Surveillance) dataset for intelligent traffic surveillance scenarios.

  • Tasks: recognition, counting, localization, background awareness, reasoning
  • Data: 170,400 images + ~5M instruction-following VQA pairs from MITS
  • Modality: Image + Text β†’ Text
  • Domain: traffic scenes (congestion, accidents, construction, smoke/fireworks, unusual weather, spills, etc.)

Quick Links

Intended Use

  • Urban traffic monitoring, incident analysis, visual question answering for transportation management
  • Research on ITS-specific multimodal reasoning and instruction following

Model Inputs/Outputs

  • Input: an image (traffic scene) + a natural language instruction/question
  • Output: a natural language response (e.g., description, count, event reasoning)

Training Summary

  • Objective: instruction tuning on MITS traffic QA
  • Backbone family: Qwen2-VL 7B Instruct
  • Notes: align vision-language features to traffic-centric concepts and events

Limitations & Notes

  • The model may make mistakes on rare objects or extreme weather/night scenes not well represented in training.
  • Not a safety-critical system; human verification is required for real-world decisions.

License

  • Follow the licenses of this model and the MITS dataset as stated on their ModelScope pages.

Citation

If you use this model or dataset, please cite:

@article{zhao2025mits,
  title   = {MITS: A large-scale multimodal benchmark dataset for Intelligent Traffic Surveillance},
  author  = {Zhao, Kaikai and Liu, Zhaoxiang and Wang, Peng and Wang, Xin and Ma, Zhicheng and Xu, Yajun and Zhang, Wenjing and Nan, Yibing and Wang, Kai and Lian, Shiguo},
  journal = {Image and Vision Computing},
  pages   = {105736},
  year    = {2025},
  publisher = {Elsevier}
}

Contact

Unicom AI

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for LifeIsSoSolong/Qwen2-VL-7B-Instruct-Traffic

Base model

Qwen/Qwen2-VL-7B
Finetuned
(482)
this model

Dataset used to train LifeIsSoSolong/Qwen2-VL-7B-Instruct-Traffic