Semantic learning for audio applications: A computer vision approach

Rahul Sukthankar, Yan Ke, Derek Hoiem

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent work in machine learning has significantly benefited semantic extraction tasks in computer vision, particularly for object recognition and image retrieval. We argue that the computer vision techniques that have been successfully applied in those settings can effectively be translated to other domains, such as audio. This claim is supported by recent results in music vs. speech classification, structure from sound, robust music identification and sound object recognition. This paper focuses on two such audio applications and demonstrates how ideas from computer vision map naturally to these problems.

Original language: English (US)
Title of host publication: 2006 Conference on Computer Vision and Pattern Recognition Workshop
DOIs: https://doi.org/10.1109/CVPRW.2006.191
State: Published - 2006
Event: 2006 Conference on Computer Vision and Pattern Recognition Workshops - New York, NY, United States
Duration: Jun 17 2006 to Jun 22 2006

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2006
ISSN (Print): 1063-6919

Other

Other: 2006 Conference on Computer Vision and Pattern Recognition Workshops
Country: United States
City: New York, NY
Period: 6/17/06 to 6/22/06

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition


Cite this

Sukthankar, R., Ke, Y., & Hoiem, D. (2006). Semantic learning for audio applications: A computer vision approach. In 2006 Conference on Computer Vision and Pattern Recognition Workshop [1640555] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2006). https://doi.org/10.1109/CVPRW.2006.191