A framework for evaluating multimodal music mood classification

Xiao Hu, Kahyun Choi, J. Stephen Downie

Research output: Contribution to journal › Article

Abstract

This research proposes a framework for music mood classification that uses multiple complementary information sources, namely music audio, lyric text, and social tags associated with music pieces. This article presents the framework and a thorough evaluation of each of its components. Experimental results on a large data set of 18 mood categories show that systems combining lyrics and audio significantly outperformed systems using audio-only features. Automatic feature selection techniques were further shown to reduce the feature space. In addition, an examination of learning curves shows that the hybrid systems using lyrics and audio needed fewer training samples and shorter audio clips to achieve the same or better classification accuracy than systems using lyrics or audio alone. Finally, performance comparisons reveal the relative importance of audio and lyric features across mood categories.

Original language: English (US)
Pages (from-to): 273-285
Number of pages: 13
Journal: Journal of the Association for Information Science and Technology
Volume: 68
Issue number: 2
State: Published - Feb 1 2017

Keywords

  • automatic categorization
  • music
  • text processing

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
  • Library and Information Sciences
