Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models Paper • 2505.14071 • Published May 20 • 1
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning Paper • 2507.16746 • Published Jul 22 • 34
AION-1: Omnimodal Foundation Model for Astronomical Sciences Paper • 2510.17960 • Published 14 days ago • 27
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation Paper • 2507.02608 • Published Jul 3 • 21
meta-llama/Llama-4-Maverick-17B-128E-Instruct Image-Text-to-Text • 402B • Updated May 22 • 17.1k • 418
meta-llama/Llama-4-Scout-17B-16E-Instruct Image-Text-to-Text • 109B • Updated May 22 • 206k • 1.13k
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 300