Tensorial dynamic time warping with articulation index representation for efficient audio-template learning

Long N. Le, Douglas L. Jones

Research output: Contribution to journal (Article)

Abstract

Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interferences), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art in the template-based approach on a wildlife detection application with few training samples.

Original language: English (US)
Pages (from-to): 1548-1558
Number of pages: 11
Journal: Journal of the Acoustical Society of America
Volume: 143
Issue number: 3
DOI: 10.1121/1.5027245
State: Published - Mar 1 2018

Fingerprint

learning
templates
wildlife
education
audio data
availability
interference
Template
Articulation
Wildlife

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Cite this

Tensorial dynamic time warping with articulation index representation for efficient audio-template learning. / Le, Long N.; Jones, Douglas L.

In: Journal of the Acoustical Society of America, Vol. 143, No. 3, 01.03.2018, p. 1548-1558.


@article{a2499021cf024f0e8a4549461cb7bb19,
title = "Tensorial dynamic time warping with articulation index representation for efficient audio-template learning",
abstract = "Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interferences), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art in the template-based approach on a wildlife detection application with few training samples.",
author = "Le, {Long N.} and Jones, {Douglas L.}",
year = "2018",
month = "3",
day = "1",
doi = "10.1121/1.5027245",
language = "English (US)",
volume = "143",
pages = "1548--1558",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "3",

}

TY - JOUR

T1 - Tensorial dynamic time warping with articulation index representation for efficient audio-template learning

AU - Le, Long N.

AU - Jones, Douglas L.

PY - 2018/3/1

Y1 - 2018/3/1

N2 - Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interferences), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art in the template-based approach on a wildlife detection application with few training samples.


UR - http://www.scopus.com/inward/record.url?scp=85044268376&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044268376&partnerID=8YFLogxK

U2 - 10.1121/1.5027245

DO - 10.1121/1.5027245

M3 - Article

VL - 143

SP - 1548

EP - 1558

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 3

ER -