Toward Cognitive Supersensing in Multimodal Large Language Model Paper • 2602.01541 • Published 13 days ago • 16
PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling Paper • 2506.20936 • Published Jun 26, 2025 • 12
Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation Paper • 2509.10687 • Published Sep 12, 2025 • 7
RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes Paper • 2509.15123 • Published Sep 18, 2025 • 5
RigMo: Unifying Rig and Motion Learning for Generative Animation Paper • 2601.06378 • Published Jan 10 • 12
Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles Paper • 2309.10228 • Published Sep 19, 2023
On-Board Vision-Language Models for Personalized Autonomous Vehicle Motion Control: System Design and Real-World Validation Paper • 2411.11913 • Published Nov 17, 2024
MedSAM3: Delving into Segment Anything with Medical Concepts Paper • 2511.19046 • Published Nov 24, 2025 • 53
Empowering Multi-Turn Tool-Integrated Reasoning with Group Turn Policy Optimization Paper • 2511.14846 • Published Nov 18, 2025
The Unanticipated Asymmetry Between Perceptual Optimization and Assessment Paper • 2509.20878 • Published Sep 25, 2025 • 4
SocialGesture: Delving into Multi-person Gesture Understanding Paper • 2504.02244 • Published Apr 3, 2025
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs Paper • 2506.21656 • Published Jun 26, 2025 • 16
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing Paper • 2503.13434 • Published Mar 17, 2025 • 27
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25, 2025 • 75
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation Paper • 2412.09349 • Published Dec 12, 2024 • 8