Abstract
Rendering scenes observed in a monocular video from novel viewpoints is a challenging problem. For static scenes the community has studied both scene-specific optimization techniques, which optimize on every test scene, and generalized techniques, which only run a deep net forward pass on a test scene. In contrast, for dynamic scenes, scene-specific optimization techniques exist, but, to our best knowledge, there is currently no generalized method for dynamic novel view synthesis from a given monocular video. To explore whether generalized dynamic novel view synthesis from monocular videos is possible today, we establish an analysis framework based on existing techniques and work toward the generalized approach. We find a pseudo-generalized process without scene-specific appearance optimization is possible, but geometrically and temporally consistent depth estimates are needed. Despite no scene-specific appearance optimization, the pseudo-generalized approach improves upon some scene-specific methods. For more information see project page at https://xiaoming-zhao.github.io/projects/pgdvs.
Original language | English (US) |
---|---|
State | Published - 2024 |
Externally published | Yes |
Event | 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria Duration: May 7 2024 → May 11 2024 |
Conference
Conference | 12th International Conference on Learning Representations, ICLR 2024 |
---|---|
Country/Territory | Austria |
City | Hybrid, Vienna |
Period | 5/7/24 → 5/11/24 |
ASJC Scopus subject areas
- Language and Linguistics
- Computer Science Applications
- Education
- Linguistics and Language