Multi-version music search using acoustic feature union and exact soft mapping

Yi Yu, Kazuki Joe, Vincent Oria, Fabian Moerchen, J Stephen Downie, L. E.I. Chen

Research output: Contribution to journal › Article

Abstract

Research on audio-based music retrieval has primarily concentrated on refining audio features to improve search quality. However, much less work has been done on improving the time efficiency of music audio searches. Representing music audio documents in an indexable format provides a mechanism for achieving efficiency. To address this issue, in this work Exact Locality Sensitive Mapping (ELSM) is suggested to join the concatenated feature sets and soft hash values. On this basis we propose audio-based music indexing techniques, ELSM and Soft Locality Sensitive Hash (SoftLSH) using an optimized Feature Union (FU) set of extracted audio features. Two contributions are made here. First, the principle of similarity-invariance is applied in summarizing audio feature sequences and utilized in training semantic audio representations based on regression. Second, soft hash values are pre-calculated to help locate the searching range more accurately and improve collision probability among features similar to each other. Our algorithms are implemented in a demonstration system to show how to retrieve and evaluate multi-version audio documents. Experimental evaluation over a real "multi-version" audio dataset confirms the practicality of ELSM and SoftLSH with FU and proves that our algorithms are effective for both multi-version detection (online query, one-query vs. multi-object) and same content detection (batch queries, multi-queries vs. one-object).
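The abstract does not give the details of ELSM or SoftLSH, so the following is only a minimal Python sketch of the general idea it refers to: locality-sensitive hashing of a fixed-length feature vector, with neighbouring buckets also probed at query time so that similar vectors are more likely to collide. The class `ToyLSHIndex`, the function `make_hash`, the parameters `num_hashes` and `width`, and the example vectors are all hypothetical and are not taken from the paper.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """Build one Gaussian random-projection hash: floor((a . v + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(v):
        return math.floor((sum(x * y for x, y in zip(a, v)) + b) / width)

    return h


class ToyLSHIndex:
    """Illustrative LSH index; NOT the authors' ELSM or SoftLSH."""

    def __init__(self, dim, num_hashes=4, width=4.0, seed=0):
        rng = random.Random(seed)
        self.hashes = [make_hash(dim, width, rng) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)

    def key(self, v):
        # The bucket key is the tuple of all hash values for this vector.
        return tuple(h(v) for h in self.hashes)

    def add(self, name, v):
        self.buckets[self.key(v)].append((name, v))

    def query(self, v, probe_neighbors=True):
        base = self.key(v)
        keys = {base}
        if probe_neighbors:
            # Also look in buckets whose key differs by +/-1 in one position,
            # which raises the chance of finding near neighbours.
            for i in range(len(base)):
                for d in (-1, 1):
                    keys.add(base[:i] + (base[i] + d,) + base[i + 1:])
        candidates = [item for k in keys for item in self.buckets.get(k, [])]
        # Rank the candidates by exact Euclidean distance to the query.
        return sorted(candidates, key=lambda item: math.dist(item[1], v))


# Hypothetical summarized feature vectors, one per recording.
index = ToyLSHIndex(dim=3)
index.add("song_a_studio", [0.9, 0.1, 0.4])
index.add("song_a_live", [0.8, 0.2, 0.5])
index.add("song_b", [9.0, -7.0, 3.0])
results = index.query([0.9, 0.1, 0.4])
```

Only the buckets selected by the hash keys are scanned, which is what makes an indexable representation faster than comparing the query against every recording. Probing adjacent buckets trades a few extra candidate checks for a lower chance of missing a similar item.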

Original language: English (US)
Pages (from-to): 209-234
Number of pages: 26
Journal: International Journal of Semantic Computing
Volume: 3
Issue number: 2
DOI: 10.1142/S1793351X09000732
State: Published - Jun 1 2009

Fingerprint

acoustics
music
Invariance
efficiency
Refining
Demonstrations
indexing
Semantics
Values
regression
evaluation

Keywords

  • Query-by-audio
  • exact locality sensitive mapping/hashing
  • feature union
  • music information retrieval
  • musical audio sequence summarization

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Linguistics and Language
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

Multi-version music search using acoustic feature union and exact soft mapping. / Yu, Yi; Joe, Kazuki; Oria, Vincent; Moerchen, Fabian; Downie, J Stephen; Chen, L. E.I.

In: International Journal of Semantic Computing, Vol. 3, No. 2, 01.06.2009, p. 209-234.

@article{449bed0a1f344a0895c1ee5b02ac9018,
title = "Multi-version music search using acoustic feature union and exact soft mapping",
abstract = "Research on audio-based music retrieval has primarily concentrated on refining audio features to improve search quality. However, much less work has been done on improving the time efficiency of music audio searches. Representing music audio documents in an indexable format provides a mechanism for achieving efficiency. To address this issue, in this work Exact Locality Sensitive Mapping (ELSM) is suggested to join the concatenated feature sets and soft hash values. On this basis we propose audio-based music indexing techniques, ELSM and Soft Locality Sensitive Hash (SoftLSH) using an optimized Feature Union (FU) set of extracted audio features. Two contributions are made here. First, the principle of similarity-invariance is applied in summarizing audio feature sequences and utilized in training semantic audio representations based on regression. Second, soft hash values are pre-calculated to help locate the searching range more accurately and improve collision probability among features similar to each other. Our algorithms are implemented in a demonstration system to show how to retrieve and evaluate multi-version audio documents. Experimental evaluation over a real {"}multi-version{"} audio dataset confirms the practicality of ELSM and SoftLSH with FU and proves that our algorithms are effective for both multi-version detection (online query, one-query vs. multi-object) and same content detection (batch queries, multi-queries vs. one-object).",
keywords = "Query-by-audio, exact locality sensitive mapping/hashing, feature union, music information retrieval, musical audio sequence summarization",
author = "Yi Yu and Kazuki Joe and Vincent Oria and Fabian Moerchen and Downie, {J Stephen} and Chen, {L. E.I.}",
year = "2009",
month = "6",
day = "1",
doi = "10.1142/S1793351X09000732",
language = "English (US)",
volume = "3",
pages = "209--234",
journal = "International Journal of Semantic Computing",
issn = "1793-351X",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "2",
}

TY - JOUR
T1 - Multi-version music search using acoustic feature union and exact soft mapping
AU - Yu, Yi
AU - Joe, Kazuki
AU - Oria, Vincent
AU - Moerchen, Fabian
AU - Downie, J Stephen
AU - Chen, L. E.I.
PY - 2009/6/1
Y1 - 2009/6/1
N2 - Research on audio-based music retrieval has primarily concentrated on refining audio features to improve search quality. However, much less work has been done on improving the time efficiency of music audio searches. Representing music audio documents in an indexable format provides a mechanism for achieving efficiency. To address this issue, in this work Exact Locality Sensitive Mapping (ELSM) is suggested to join the concatenated feature sets and soft hash values. On this basis we propose audio-based music indexing techniques, ELSM and Soft Locality Sensitive Hash (SoftLSH) using an optimized Feature Union (FU) set of extracted audio features. Two contributions are made here. First, the principle of similarity-invariance is applied in summarizing audio feature sequences and utilized in training semantic audio representations based on regression. Second, soft hash values are pre-calculated to help locate the searching range more accurately and improve collision probability among features similar to each other. Our algorithms are implemented in a demonstration system to show how to retrieve and evaluate multi-version audio documents. Experimental evaluation over a real "multi-version" audio dataset confirms the practicality of ELSM and SoftLSH with FU and proves that our algorithms are effective for both multi-version detection (online query, one-query vs. multi-object) and same content detection (batch queries, multi-queries vs. one-object).

KW - Query-by-audio
KW - exact locality sensitive mapping/hashing
KW - feature union
KW - music information retrieval
KW - musical audio sequence summarization
UR - http://www.scopus.com/inward/record.url?scp=78650993995&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650993995&partnerID=8YFLogxK
U2 - 10.1142/S1793351X09000732
DO - 10.1142/S1793351X09000732
M3 - Article
AN - SCOPUS:78650993995
VL - 3
SP - 209
EP - 234
JO - International Journal of Semantic Computing
JF - International Journal of Semantic Computing
SN - 1793-351X
IS - 2
ER -