Abstract
Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interference), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that, on a wildlife detection application with few training samples, the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art template-based approach.
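For readers unfamiliar with the template-based approach mentioned in the abstract, the following is a minimal sketch of classic dynamic time warping (DTW) between two sequences of feature frames, followed by nearest-template classification. It is not the tensorial variant proposed in the paper, and it uses plain lists of numbers rather than articulation index-based time-frequency representations. The names `frame_distance`, `dtw_distance`, and `classify` are hypothetical and chosen only for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic DTW alignment cost between two sequences of frames.

    Assumption: this is the textbook algorithm, not the paper's
    tensorial dynamic time warping.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_b[j-1] also matched an earlier frame of seq_a
                cost[i][j - 1],      # seq_a[i-1] also matched an earlier frame of seq_b
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost to query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Illustrative usage with made-up one-dimensional frames.
templates = {
    "call": [[0.0], [1.0], [2.0]],
    "background": [[5.0], [5.0], [5.0]],
}
stretched_call = [[0.0], [0.0], [1.0], [2.0], [2.0]]
predicted = classify(stretched_call, templates)
```

Because DTW lets one frame align with several frames of the other sequence, a time-stretched version of a template still aligns with it at low cost, which is why a single clean template can represent calls of varying duration.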
| Original language | English (US) |
|---|---|
| Pages (from-to) | 1548-1558 |
| Number of pages | 11 |
| Journal | Journal of the Acoustical Society of America |
| Volume | 143 |
| Issue number | 3 |
| State | Published - Mar 1 2018 |
ASJC Scopus subject areas
- Arts and Humanities (miscellaneous)
- Acoustics and Ultrasonics
Fingerprint
Dive into the research topics of 'Tensorial dynamic time warping with articulation index representation for efficient audio-template learning'. Together they form a unique fingerprint.