Excited to announce the release of InfiMM-WebMath-40B, the largest open-source multimodal pretraining dataset designed to advance mathematical reasoning in AI!
With 40 billion tokens, this dataset aims to enhance the reasoning capabilities of multimodal large language models in mathematics.
If you're interested in MLLMs, AI, and math reasoning, check out our work and dataset:
HF: InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning (2409.12568)
Dataset: Infi-MM/InfiMM-WebMath-40B