What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024 • 103