TY - GEN
T1 - Real-time implementation and performance optimization of 3D sound localization on GPUs
AU - Liang, Yun
AU - Cui, Zheng
AU - Zhao, Shengkui
AU - Rupnow, Kyle
AU - Zhang, Yihao
AU - Jones, Douglas L.
AU - Chen, Deming
PY - 2012
Y1 - 2012
N2 - Real-time 3D sound localization is an important technology for various applications such as camera steering systems, robotics audition, and gunshot direction. 3D sound localization adds a new dimension, but also significantly increases the computational requirements. Real-time 3D sound localization continuously processes large volumes of data for each possible 3D direction and acoustic frequency range. Such highly demanding compute requirements outpace current CPU compute abilities. This paper develops a real-time implementation of 3D sound localization on Graphical Processing Units (GPUs). Massively parallel GPU architectures are shown to be well suited for 3D sound localization. We optimize various aspects of GPU implementation, such as number of threads per thread block, register allocation per thread, and memory data layout for performance improvement. Experiments indicate that our GPU implementation achieves 501X and 130X speedup compared to a single-thread and a multi-thread CPU implementation respectively, thus enabling real-time operation of 3D sound localization.
UR - http://www.scopus.com/inward/record.url?scp=84862069040&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862069040&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84862069040
SN - 9783981080186
T3 - Proceedings - Design, Automation and Test in Europe, DATE
SP - 832
EP - 835
BT - Proceedings - Design, Automation and Test in Europe Conference and Exhibition, DATE 2012
T2 - 15th Design, Automation and Test in Europe Conference and Exhibition, DATE 2012
Y2 - 12 March 2012 through 16 March 2012
ER -