Structured audio content analysis and metadata in a digital library

David Bainbridge, John Stephen Downie, Andreas F. Ehmann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This work illustrates how audio content analysis of music and manually assigned structural temporal metadata can be used to form a digital library designed for musicological exploration. In addition to text-based searching and browsing, the document view is enriched with an interactive structured audio time-line that shows ground-truth data representing the logical segments of the song, alongside an automatically generated version for comparison. A self-similarity "heat" map is also displayed, and is interactive: clicking within the map at a co-ordinate (x, y) plays the audio simultaneously at time offsets x and y, panned left and right, respectively, to make it easier for the listener to pick out the differences. The musicologist can also initiate an audio content-based query starting at any point in the song. This produces a ranked result set whose items can be studied further through their respective document views. Alternatively, they can perform a musical structure search (for example, for songs that contain the structure b, b, c, b, c).
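Two of the interactions described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a song's structure is stored as a list of literal segment labels, that "contains the structure" means a contiguous run of those labels, and that audio is a plain list of mono samples. The names `contains_structure` and `dual_offset_stereo` are hypothetical.

```python
from typing import List, Sequence, Tuple


def contains_structure(labels: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs as a contiguous run inside `labels`.

    Assumption: segment labels are compared literally (a "b" only matches "b").
    """
    n, m = len(labels), len(pattern)
    if m == 0:
        return True
    return any(list(labels[i:i + m]) == list(pattern) for i in range(n - m + 1))


def dual_offset_stereo(
    mono: Sequence[float],
    sample_rate: int,
    x_seconds: float,
    y_seconds: float,
    duration_seconds: float,
) -> List[Tuple[float, float]]:
    """Build stereo frames: audio from offset x on the left, offset y on the right.

    Assumption: hard panning, i.e. each excerpt goes entirely to one channel.
    The result is truncated to the shorter excerpt if one runs past the end.
    """
    start_x = int(x_seconds * sample_rate)
    start_y = int(y_seconds * sample_rate)
    length = int(duration_seconds * sample_rate)
    left = mono[start_x:start_x + length]
    right = mono[start_y:start_y + length]
    frames = min(len(left), len(right))
    return [(left[i], right[i]) for i in range(frames)]


# Hypothetical structure annotations for two songs.
songs = {
    "song_a": ["a", "b", "b", "c", "b", "c", "d"],
    "song_b": ["a", "b", "c", "b", "c"],
}
query = ["b", "b", "c", "b", "c"]
matches = [name for name, labels in songs.items() if contains_structure(labels, query)]

# Toy "audio": ten samples at 1 Hz, clicked at map co-ordinate (x, y) = (2, 5).
stereo = dual_offset_stereo(list(range(10)), 1, 2, 5, 3)
```

A real system would likely also need to decide whether labels should match up to renaming (so that "a, a, b" matches "b, b, c") and whether partial panning is preferable to hard left/right separation. The abstract does not settle either point.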

Original language: English (US)
Title of host publication: JCDL '12 - Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries
Number of pages: 2
State: Published - 2012
Event: 12th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '12 - Washington, DC, United States
Duration: Jun 10 2012 to Jun 14 2012

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996


Other: 12th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '12
Country/Territory: United States
City: Washington, DC


  • audio content analysis
  • digital libraries
  • structured metadata

ASJC Scopus subject areas

  • General Engineering


