TY - JOUR
T1 - Saliency-maximized audio visualization and efficient audio-visual browsing for faster-than-real-time human acoustic event detection
AU - Lin, Kai Hsiang
AU - Zhuang, Xiaodan
AU - Goudeseune, Camille
AU - King, Sarah
AU - Hasegawa-Johnson, Mark
AU - Huang, Thomas S.
PY - 2013/10
Y1 - 2013/10
AB - Browsing large audio archives is challenging because of the limitations of human audition and attention. However, this task becomes easier with a suitable visualization of the audio signal, such as a spectrogram transformed to make unusual audio events salient. This transformation maximizes the mutual information between an isolated event's spectrogram and an estimate of how salient the event appears in its surrounding context. When such spectrograms are computed and displayed with fluid zooming over many temporal orders of magnitude, sparse events in long audio recordings can be detected more quickly and more easily. In particular, in a 1/10-real-time acoustic event detection task, subjects who were shown saliency-maximized rather than conventional spectrograms performed significantly better. Saliency maximization also improves the mutual information between the ground truth of nonbackground sounds and visual saliency, more than other common enhancements to visualization.
KW - Acoustic event detection
KW - Audio visualization
KW - Visual salience/saliency
UR - http://www.scopus.com/inward/record.url?scp=84891803843&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84891803843&partnerID=8YFLogxK
U2 - 10.1145/2536764.2536773
DO - 10.1145/2536764.2536773
M3 - Article
AN - SCOPUS:84891803843
SN - 1544-3558
VL - 10
JO - ACM Transactions on Applied Perception
JF - ACM Transactions on Applied Perception
IS - 4
M1 - 26
ER -