deepseek-ai/DeepSeek-V3.2-Speciale (Text Generation, 685B parameters)
Article: Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What's Really Changing in Transformers? (Apr 4)
Article: Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) (Jan 19)
LLM Hallucination Leaderboard (Space): view and filter the LLM hallucination leaderboard
intfloat/multilingual-e5-large-instruct (Feature Extraction, 0.6B parameters)