Sleep & Wellness Guide
UNIEGO: Proxies as Mediators for Unified Egocentric Video Representation Learning
Key Takeaway
A robotics research paper on UNIEGO: Proxies as Mediators for Unified Egocentric Video Representation Learning.
Practical Tips
Practical tips and how-to guidance will be added by our editorial team.
中文解读
中文解读待补充:本站将优先为睡眠改善、失眠治疗、助眠方法等高价值文章补充中文说明。
Article Summary
Egocentric video understanding is inherently limited by the narrow perspective of wearable cameras: a single viewpoint, a single modality, a single model cannot capture the full richness of human action. We argue that a truly expressive egocentric representation must subsume complementary knowledge across viewpoints, modalities, and foundation model representations, yet remain deployable from egocentric video alone. To this end, we introduce a hierarchical multi-teacher distillation framework that produces UNIEGO, a unified egocentric encoder trained with nine teachers spanning ego-exo viewpoints, RGB, depth, and skeleton modalities, and four foundation models. Rather than distilling directly from heterogeneous teachers whose incompatible architectures and feature geometries induce conflicting gradients, our framework interposes a layer of representation-specific Proxy models that translate diverse teacher knowledge into a homogeneous egocentric space. A second distillation stage, Selective Proxy Distillation (SPD), then adaptively selects, for each training sample, the subset of proxies that are both correct and confident, distilling exclusively from reliable supervision and suppressing erroneous signals. SPD is further stabilized by initializing UNIEGO as a learned convex combination of proxy parameters, placing the unified model in a well-conditioned region of the loss landscape before distillation begins. UNIEGO achieves state-of-the-art performance across three egocentric video understanding tasks - action recognition, video retrieval, and action segmentation on three challenging ego-exo benchmarks, outperforming naive multi-teacher distillation baselines and demonstrating that structured, proxy-mediated knowledge transfer yields richer and more discriminative egocentric representations.
Sources & References
Need to track a shipment?
Use our free logistics tracking tool to check real-time delivery status for USPS, FedEx, UPS, DHL, Amazon and 1000+ carriers worldwide.
Track a Package Now
Comments