Music subject classification based on lyrics and user interpretations

Kahyun Choi, Jin Ha Lee, Xiao Hu, J. Stephen Downie

Research output: Contribution to journal › Article › peer-review


Prior published research has noted that music seekers consider song subject metadata helpful in their searching and browsing experience. In an effort to develop a subject-based tagging system, we explored the automatic generation of song subject classifications. Our classifications were derived from two different sources of song-related text: 1) lyrics; and 2) user interpretations of lyrics collected from [...]. While both sources contain subject-related information, we found that user-generated interpretations always outperformed lyrics in terms of classification accuracy. This suggests that user interpretations are more useful than lyrics in the subject classification task because the semantically ambiguous, poetic nature of lyrics tends to confuse classifiers. An examination of top-ranked terms and confusion matrices supported our contention that users' interpretations capture the meaning of songs better than the lyrics themselves.

Original language: English (US)
Pages (from-to): 1-10
Number of pages: 10
Journal: Proceedings of the Association for Information Science and Technology
Issue number: 1
State: Published - 2016


Keywords

  • Interpretations of Lyrics
  • Music Digital Library
  • Music Subject Classification
  • Subject Metadata

ASJC Scopus subject areas

  • General Computer Science
  • Library and Information Sciences

