Sleep & Wellness Guide
SeeTraceAct: Visibility-Aware Latent Planning from Cross-Embodiment Demonstration Videos
Key Takeaway
A robotics research paper on SeeTraceAct: Visibility-Aware Latent Planning from Cross-Embodiment Demonstration Videos.
Practical Tips
Practical tips and how-to guidance will be added by our editorial team.
中文解读
中文解读待补充:本站将优先为睡眠改善、失眠治疗、助眠方法等高价值文章补充中文说明。
Article Summary
Vision-language-action models (VLAs) are promising general-purpose robot policies, but adapting them to new tasks typically requires costly task-specific teleoperation data. As an alternative, we study one-shot demo-conditioned VLAs, where a robot policy is conditioned on a single demonstration video of an unseen task. We find that existing end-to-end approaches often struggle when successful execution requires precisely localizing small target regions. To address this limitation, we propose SeeTraceAct, a demo-conditioned VLA framework that encourages precise spatial grounding through visibility-aware prediction of future end-effector traces. To enable reproducible evaluation with cross-embodiment demonstrations, we introduce and release RoboCasa-DC, a demo-conditioned extension of RoboCasa with episode-paired humanoid videos. Experiments on RoboCasa-DC and a real-world benchmark, where a Franka Panda arm is conditioned on human demonstrations, show that SeeTraceAct outperforms baselines, achieving the best success rate across all four RoboCasa-DC settings and improving real-world average success by 12.5 percentage points.
Sources & References
Need to track a shipment?
Use our free logistics tracking tool to check real-time delivery status for USPS, FedEx, UPS, DHL, Amazon and 1000+ carriers worldwide.
Track a Package Now
Comments