Representation Learning for Dynamic 3D Scenes

Yunzhu Li, Jiajun Wu

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter provides an in-depth exploration of methodologies for modelling and learning representations of dynamic 3D scenes. The discussion draws on the intrinsic human ability to interact with and predict changes in the 3D physical world, with the aim of advancing embodied AI systems to match and eventually exceed these capabilities.

The focus lies on examining diverse 3D representation methods, including keypoints, particles, and neural fields, among others. Their respective strengths, limitations, and applications within embodied AI tasks are thoroughly investigated. Highlights include the self-supervised learning of 3D object keypoints for model-based prediction, the use of particles in conjunction with graph neural networks (GNNs) to model complex dynamics, and the recent surge in the use of 3D neural fields in world models, where the field representation serves as an inductive bias.
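To make the particle-plus-GNN approach mentioned above concrete, the sketch below shows one round of message passing over a particle system: particles are graph nodes, nearby particles are connected by edges, and the network predicts per-particle accelerations. This is a minimal illustration, not the chapter's implementation; the class name ParticleGNN, the radius-based edge construction, the feature dimensions, and the Euler integration step are all assumptions chosen for brevity.

```python
# Minimal sketch (assumed, not the chapter's code) of GNN-based particle dynamics.
import torch
import torch.nn as nn

def mlp(in_dim, hidden, out_dim):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

class ParticleGNN(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Edge model: relative position + relative velocity -> message.
        self.edge_mlp = mlp(6, hidden, hidden)
        # Node model: own velocity + aggregated messages -> acceleration.
        self.node_mlp = mlp(3 + hidden, hidden, 3)

    def forward(self, pos, vel, radius=0.1):
        # pos, vel: (N, 3) particle positions and velocities.
        dist = torch.cdist(pos, pos)                        # (N, N) pairwise distances
        src, dst = torch.nonzero(
            (dist < radius) & (dist > 0), as_tuple=True)    # connect nearby particles
        edge_feat = torch.cat(
            [pos[src] - pos[dst], vel[src] - vel[dst]], dim=-1)
        msg = self.edge_mlp(edge_feat)                      # per-edge messages
        agg = torch.zeros(pos.size(0), msg.size(-1))
        agg.index_add_(0, dst, msg)                         # sum messages per receiver
        return self.node_mlp(torch.cat([vel, agg], dim=-1))  # predicted acceleration

# Rolling the learned dynamics forward with simple Euler integration:
model = ParticleGNN()
pos, vel = torch.rand(256, 3), torch.zeros(256, 3)
acc = model(pos, vel)
vel = vel + 0.01 * acc
pos = pos + 0.01 * vel
```

In practice, models of this kind are typically trained by regressing predicted accelerations (or next-step positions) against ground-truth trajectories, then rolled out autoregressively for prediction, control, and planning.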

This chapter demonstrates how these representation learning methods have been instrumental in accomplishing complex control and planning tasks in the 3D world, significantly advancing AI capabilities in both simulated and real-world settings. Concluding with a look towards the future, the chapter outlines potential research directions for further enhancing the performance and capabilities of embodied intelligence systems.
Original language: English (US)
Title of host publication: Deep Learning for 3D Vision
Subtitle of host publication: Algorithms and Applications
Editors: Xiaoli Li, Xulei Yang, Hao Su
Publisher: World Scientific
Pages: 91-158
Number of pages: 68
ISBN (Electronic): 9789811286490
ISBN (Print): 9789811286483
DOIs
State: Published - Sep 2024
