TY - GEN
T1 - YouTubePD: A Multimodal Benchmark for Parkinson’s Disease Analysis
AU - Zhou, Andy
AU - Li, Samuel
AU - Sriram, Pranav
AU - Li, Xiang
AU - Dong, Jiahua
AU - Sharma, Ansh
AU - Zhong, Yuanyi
AU - Luo, Shirui
AU - Jaromin, Maria
AU - Kindratenko, Volodymyr
AU - Heintz, George
AU - Zallek, Christopher
AU - Wang, Yu-Xiong
PY - 2023
Y1 - 2023
N2 - The healthcare and AI communities have witnessed a growing interest in the development of AI-assisted systems for automated diagnosis of Parkinson’s Disease (PD), one of the most prevalent neurodegenerative disorders. However, the progress in this area has been significantly impeded by the absence of a unified, publicly available benchmark, which prevents comprehensive evaluation of existing PD analysis methods and the development of advanced models. This work overcomes these challenges by introducing YouTubePD – the first publicly available multimodal benchmark designed for PD analysis. We crowd-source existing videos featured with PD from YouTube, exploit multimodal information including in-the-wild videos, audios, and facial landmarks across 200+ subject videos, and provide dense and diverse annotations from a clinical expert. Based on our benchmark, we propose three challenging and complementary tasks encompassing both discriminative and generative tasks, along with a comprehensive set of corresponding baselines. Experimental evaluation showcases the potential of modern deep learning and computer vision techniques, in particular the generalizability of the models developed on our YouTubePD to real-world clinical settings, while revealing their limitations. We hope that our work paves the way for future research in this direction.
AB - The healthcare and AI communities have witnessed a growing interest in the development of AI-assisted systems for automated diagnosis of Parkinson’s Disease (PD), one of the most prevalent neurodegenerative disorders. However, the progress in this area has been significantly impeded by the absence of a unified, publicly available benchmark, which prevents comprehensive evaluation of existing PD analysis methods and the development of advanced models. This work overcomes these challenges by introducing YouTubePD – the first publicly available multimodal benchmark designed for PD analysis. We crowd-source existing videos featured with PD from YouTube, exploit multimodal information including in-the-wild videos, audios, and facial landmarks across 200+ subject videos, and provide dense and diverse annotations from a clinical expert. Based on our benchmark, we propose three challenging and complementary tasks encompassing both discriminative and generative tasks, along with a comprehensive set of corresponding baselines. Experimental evaluation showcases the potential of modern deep learning and computer vision techniques, in particular the generalizability of the models developed on our YouTubePD to real-world clinical settings, while revealing their limitations. We hope that our work paves the way for future research in this direction.
UR - https://www.scopus.com/pages/publications/85205666269
UR - https://www.scopus.com/pages/publications/85205666269#tab=citedBy
M3 - Conference contribution
VL - 36
T3 - Advances in Neural Information Processing Systems
SP - 55140
EP - 55159
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Neumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Curran Associates Inc.
ER -