TY - GEN
T1 - Pseudo Dataset Generation for Out-of-domain Multi-Camera View Recommendation
AU - Lee, Kuan Ying
AU - Zhou, Qian
AU - Nahrstedt, Klara
N1 - This work was funded by National Science Foundation grants NSF CNS 19-00875, NSF CCF 22-17144, and NSF CNS 21-06592. Any results and opinions are our own and do not represent the views of the National Science Foundation.
PY - 2024
Y1 - 2024
N2 - Multi-camera systems are indispensable in movies, TV shows, and other media. Selecting the appropriate camera at every timestamp has a decisive impact on production quality and audience preferences. Learning-based view recommendation frameworks can assist professionals in decision-making. However, they often struggle outside of their training domains. The scarcity of labeled multi-camera view recommendation datasets exacerbates the issue. Based on the insight that many videos are edited from the original multi-camera videos, we propose transforming regular videos into pseudo-labeled multi-camera view recommendation datasets. Promisingly, by training the model on pseudo-labeled datasets stemming from videos in the target domain, we achieve a 68% relative improvement in the model's accuracy in the target domain and bridge the accuracy gap between in-domain and never-before-seen domains.
AB - Multi-camera systems are indispensable in movies, TV shows, and other media. Selecting the appropriate camera at every timestamp has a decisive impact on production quality and audience preferences. Learning-based view recommendation frameworks can assist professionals in decision-making. However, they often struggle outside of their training domains. The scarcity of labeled multi-camera view recommendation datasets exacerbates the issue. Based on the insight that many videos are edited from the original multi-camera videos, we propose transforming regular videos into pseudo-labeled multi-camera view recommendation datasets. Promisingly, by training the model on pseudo-labeled datasets stemming from videos in the target domain, we achieve a 68% relative improvement in the model's accuracy in the target domain and bridge the accuracy gap between in-domain and never-before-seen domains.
KW - cinematography
KW - semi-supervised learning
UR - https://www.scopus.com/pages/publications/85218199169
UR - https://www.scopus.com/pages/publications/85218199169#tab=citedBy
U2 - 10.1109/VCIP63160.2024.10849905
DO - 10.1109/VCIP63160.2024.10849905
M3 - Conference contribution
AN - SCOPUS:85218199169
T3 - 2024 IEEE International Conference on Visual Communications and Image Processing, VCIP 2024
BT - 2024 IEEE International Conference on Visual Communications and Image Processing, VCIP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Visual Communications and Image Processing, VCIP 2024
Y2 - 8 December 2024 through 11 December 2024
ER -