Update README.md
README.md
CHANGED
@@ -61,7 +61,7 @@ base_model:
- **Model:** `CulturalPangea-7B` is an open-source Multilingual Multimodal LLM fine-tuned to interpret and reason about long-tail cultural entities and concepts. It is designed to bridge the cultural gap often present in MLLMs.
- **Date:** `CulturalPangea-7B` was trained in 2025.
- **Training Dataset:** The model was fine-tuned on the [CulturalGround](https://huggingface.co/datasets/neulab/CulturalGround) dataset, using 14 million open-ended and 6 million multiple-choice culturally grounded VQA pairs sampled from the 30M total pairs (22M open-ended, 8M multiple-choice). This was interleaved with a substantial portion of the original Pangea instruction data to maintain general abilities.
- **Architecture:** `CulturalPangea-7B` is a fine-tuned version of [Pangea-7B](https://huggingface.co/neulab/Pangea-7B). It uses a frozen [CLIP-ViT](https://huggingface.co/openai/clip-vit-large-patch14) vision encoder with a [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) LLM backbone. During training, only the connector and the language model were fine-tuned.
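
Since only the connector and language model are tuned on top of the frozen CLIP-ViT encoder, the checkpoint should behave like its Pangea-7B base at inference time. Below is a minimal loading sketch, assuming a Transformers-compatible (LLaVA-NeXT-style) export of the weights; the repo id, processor/model classes, and prompt template are assumptions, not an official recipe.

```python
# Minimal sketch, not an official recipe: assumes a Transformers-compatible,
# LLaVA-NeXT-style export of the checkpoint. The repo id, processor/model
# classes, and prompt template below are assumptions.
import torch
from PIL import Image
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

model_id = "neulab/CulturalPangea-7B"  # assumed Hub id

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # frozen CLIP-ViT encoder + Qwen2-7B-Instruct backbone
    device_map="auto",
)

image = Image.open("festival.jpg")
# Qwen2-style chat markup; the exact template may differ for this model.
prompt = (
    "<|im_start|>user\n<image>\n"
    "Which cultural festival is shown in this photo?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0], skip_special_tokens=True))
```

If the released weights instead require the original Pangea/LLaVA-NeXT codebase, loading would go through that repository's scripts rather than the classes above.
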
## Uses