Topic Modeling Users' Interpretations of Songs to Inform Subject Access in Music Digital Libraries

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The assignment of subject metadata to music is useful for organizing and accessing digital music collections. Since manual subject annotation of large-scale music collections is labor-intensive, automatic methods are preferred. Topic modeling algorithms can be used to automatically identify latent topics from appropriate text sources. Candidate text sources such as song lyrics are often too poetic, resulting in lower-quality topics. Users' interpretations of song lyrics provide an alternative source. In this paper, we propose an automatic topic discovery system from web-mined user-generated interpretations of songs to provide subject access to a music digital library. We also propose and evaluate filtering techniques to identify high-quality topics. In our experiments, we use 24,436 popular songs that exist in both the Million Song Dataset and songmeanings.com. Topic models are generated using Latent Dirichlet Allocation (LDA). To evaluate the coherence of learned topics, we calculate the Normalized Pointwise Mutual Information (NPMI) of the top ten words in each topic based on occurrences in Wikipedia. Finally, we evaluate the resulting topics using a subset of 422 songs that have been manually assigned to six subjects. Using this system, 71% of the manually assigned subjects were correctly identified. These results demonstrate that topic modeling of song interpretations is a promising method for subject metadata enrichment in music digital libraries. It also has implications for affording similar access to collections of poetry and fiction.
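The NPMI coherence measure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `npmi_coherence`, the use of document-level co-occurrence (rather than, say, a sliding window over Wikipedia text), the conventions for the edge cases, and the toy corpus are all assumptions made for illustration. For two words, NPMI is log(p(a,b) / (p(a) p(b))) divided by -log p(a,b), and a topic's score is taken here as the mean over all pairs of its top words.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean pairwise NPMI of a topic's top words.

    Probabilities are estimated from document-level co-occurrence in a
    reference corpus (an assumption; the paper uses Wikipedia, and its
    exact counting scheme is not stated in the abstract).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            # Assumed convention: words that never co-occur score -1.
            scores.append(-1.0)
        elif p12 == 1:
            # Assumed convention: avoid 0/0 when both words are in every document.
            scores.append(1.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Toy reference corpus standing in for Wikipedia (hypothetical data).
corpus = [
    ["love", "heart"],
    ["love", "heart"],
    ["war", "death"],
    ["war", "death"],
]

coherent = npmi_coherence(["love", "heart"], corpus)
mixed = npmi_coherence(["love", "heart", "war"], corpus)
```

In this toy corpus, "love" and "heart" always appear together, so their topic scores higher than one that mixes in "war", which never co-occurs with either. A filtering step of the kind the abstract describes could then keep only topics whose score exceeds a chosen threshold.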

Original language: English (US)
Title of host publication: JCDL 2015 - Proceedings of the 15th ACM/IEEE-CE Joint Conference on Digital Libraries
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 183-186
Number of pages: 4
ISBN (Electronic): 9781450335942
State: Published - Jun 21 2015
Event: 15th ACM/IEEE-CE Joint Conference on Digital Libraries, JCDL 2015 - Knoxville, United States
Duration: Jun 21 2015 to Jun 25 2015

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
Volume: 2015-June
ISSN (Print): 1552-5996

Other

Other: 15th ACM/IEEE-CE Joint Conference on Digital Libraries, JCDL 2015
Country/Territory: United States
City: Knoxville
Period: 6/21/15 to 6/25/15

Keywords

  • interpretations of lyrics
  • music digital library
  • topic models

ASJC Scopus subject areas

  • General Engineering
