TY - GEN
T1 - When lyrics outperform audio for music mood classification
T2 - 11th International Society for Music Information Retrieval Conference, ISMIR 2010
AU - Hu, Xiao
AU - Downie, J. Stephen
PY - 2010
Y1 - 2010
AB - This paper builds upon and extends previous work on multi-modal mood classification (i.e., combining audio and lyrics) by analyzing in depth those feature types that have been shown to provide statistically significant improvements in the classification of individual mood categories. The dataset used in this study comprises 5,296 songs (with lyrics and audio for each) divided into 18 mood categories derived from user-generated tags taken from last.fm. These 18 categories show remarkable consistency with Russell's popular mood model. In seven categories, lyric features significantly outperformed audio spectral features. In only one category did audio outperform all lyric feature types. A fine-grained analysis of the significant lyric feature types indicates a strong and obvious semantic association between extracted terms and the categories. No such obvious semantic linkages were evident in the case where audio spectral features proved superior.
UR - http://www.scopus.com/inward/record.url?scp=84555199091&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84555199091&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84555199091
SN - 9789039353813
T3 - Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR 2010
SP - 619
EP - 624
BT - Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR 2010
Y2 - 9 August 2010 through 13 August 2010
ER -