OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview Image-Text-to-Text • 0.4B • Updated Aug 29 • 43.7k • 82
Article • Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) • Jan 19 • 38