RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation • Paper • 2601.05241 • Published about 1 month ago • 24
Act2Goal: From World Model To General Goal-conditioned Policy • Paper • 2512.23541 • Published Dec 29, 2025 • 22
LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry • Paper • 2512.19629 • Published Dec 22, 2025 • 26
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference • Paper • 2512.01031 • Published Nov 30, 2025 • 25
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions • Paper • 2509.06951 • Published Sep 8, 2025 • 32
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control • Paper • 2508.21112 • Published Aug 28, 2025 • 77
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling • Paper • 2507.05240 • Published Jul 7, 2025 • 48