TY - GEN
T1 - RoSS
T2 - 2023 IEEE International Conference on Robotics and Automation, ICRA 2023
AU - Seo, Hyungjoo
AU - Karnoor, Sahil Bhandary
AU - Choudhury, Romit Roy
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - This paper considers the problem of audio source separation, where the goal is to isolate a target audio signal (say Alice's speech) from a mixture of multiple interfering signals (e.g., when many people are talking). This problem has gained renewed interest mainly due to the significant growth in voice-controlled devices, including robots in homes, offices, and other public facilities. Although a rich body of work exists on the core topic of source separation, we find that rotational motion of the microphones (e.g., a swiveling robot-head) offers complementary gains. We show that rotating the microphone array to the optimal orientation can produce desirable 'delay aliasing' between two interferers, causing the two interferers to appear as one. In general, a mixture of K signals becomes a mixture of (K - 1) signals, a mathematically concrete gain. We show that the gain translates well to practice, provided two rotation-related challenges can be mitigated. This paper is focused on mitigating these challenges and demonstrating the end-to-end performance on a fully functional prototype. We believe that our Rotational Source Separation (RoSS) module could be plugged into actual robot heads or into other devices (like Amazon Show) that are also capable of rotation.
UR - http://www.scopus.com/inward/record.url?scp=85168672741&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85168672741&partnerID=8YFLogxK
U2 - 10.1109/ICRA48891.2023.10161106
DO - 10.1109/ICRA48891.2023.10161106
M3 - Conference contribution
AN - SCOPUS:85168672741
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 4026
EP - 4032
BT - Proceedings - ICRA 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 29 May 2023 through 2 June 2023
ER -