Tuwhy/Llama-3.2V-11B-Sherlock-iter1

Tuwhy committed on May 29

Commit 33c63bc · verified · 1 Parent(s): 48a2ea3

Update README.md

Files changed (1)

README.md +13 -1

@@ -8,4 +8,16 @@ datasets:
 - Xkev/LLaVA-CoT-100k
 pipeline_tag: image-text-to-text
 library_name: transformers
----
+---
+# Sherlock: Self-Correcting Reasoning in Vision-Language Models
+## Introduction
+**Sherlock is a training framework focused on improving the reasoning and self-correction capabilities of Vision-Language Models.**
+GitHub repo: [https://github.com/DripNowhy/Sherlock](https://github.com/DripNowhy/Sherlock)
+Project Page: [https://dripnowhy.github.io/Sherlock/](https://dripnowhy.github.io/Sherlock/)
+arXiv: [https://arxiv.org/abs/2505.22651](https://arxiv.org/abs/2505.22651)