The MIREX grand challenge: A framework of holistic user-experience evaluation in music information retrieval

Xiao Hu, Jin Ha Lee, David Bainbridge, Kahyun Choi, Peter Organisciak, J. Stephen Downie

Research output: Contribution to journal (Article)

Abstract

Music Information Retrieval (MIR) evaluation has traditionally focused on system-centered approaches where components of MIR systems are evaluated against predefined data sets and golden answers (i.e., ground truth). There are two major limitations of such system-centered evaluation approaches: (a) The evaluation focuses on subtasks in music information retrieval, but not on entire systems and (b) users and their interactions with MIR systems are largely excluded. This article describes the first implementation of a holistic user-experience evaluation in MIR, the MIREX Grand Challenge, where complete MIR systems are evaluated, with user experience being the single overarching goal. It is the first time that complete MIR systems have been evaluated with end users in a realistic scenario. We present the design of the evaluation task, the evaluation criteria and a novel evaluation interface, and the data-collection platform. This is followed by an analysis of the results, reflection on the experience and lessons learned, and plans for future directions.

Original language: English (US)
Pages (from-to): 97-112
Number of pages: 16
Journal: Journal of the Association for Information Science and Technology
Volume: 68
Issue number: 1
DOI: 10.1002/asi.23618
State: Published - Jan 1 2017

Fingerprint

Information retrieval systems
Information retrieval
experience
Computer systems
Evaluation
Music
User experience
scenario
interaction

Keywords

  • evaluation
  • information retrieval
  • music

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
  • Library and Information Sciences

Cite this

The MIREX grand challenge: A framework of holistic user-experience evaluation in music information retrieval. / Hu, Xiao; Lee, Jin Ha; Bainbridge, David; Choi, Kahyun; Organisciak, Peter; Downie, J. Stephen.

In: Journal of the Association for Information Science and Technology, Vol. 68, No. 1, 01.01.2017, p. 97-112.

@article{79dedfd4ef9f4c9e80beb8060df20167,
title = "The MIREX grand challenge: A framework of holistic user-experience evaluation in music information retrieval",
abstract = "Music Information Retrieval (MIR) evaluation has traditionally focused on system-centered approaches where components of MIR systems are evaluated against predefined data sets and golden answers (i.e., ground truth). There are two major limitations of such system-centered evaluation approaches: (a) The evaluation focuses on subtasks in music information retrieval, but not on entire systems and (b) users and their interactions with MIR systems are largely excluded. This article describes the first implementation of a holistic user-experience evaluation in MIR, the MIREX Grand Challenge, where complete MIR systems are evaluated, with user experience being the single overarching goal. It is the first time that complete MIR systems have been evaluated with end users in a realistic scenario. We present the design of the evaluation task, the evaluation criteria and a novel evaluation interface, and the data-collection platform. This is followed by an analysis of the results, reflection on the experience and lessons learned, and plans for future directions.",
keywords = "evaluation, information retrieval, music",
author = "Xiao Hu and Lee, {Jin Ha} and David Bainbridge and Kahyun Choi and Peter Organisciak and Downie, {J. Stephen}",
year = "2017",
month = "1",
day = "1",
doi = "10.1002/asi.23618",
language = "English (US)",
volume = "68",
pages = "97--112",
journal = "Journal of the Association for Information Science and Technology",
issn = "2330-1635",
publisher = "John Wiley and Sons Ltd",
number = "1",
}

TY  - JOUR
T1  - The MIREX grand challenge
T2  - A framework of holistic user-experience evaluation in music information retrieval
AU  - Hu, Xiao
AU  - Lee, Jin Ha
AU  - Bainbridge, David
AU  - Choi, Kahyun
AU  - Organisciak, Peter
AU  - Downie, J. Stephen
PY  - 2017/1/1
Y1  - 2017/1/1
N2  - Music Information Retrieval (MIR) evaluation has traditionally focused on system-centered approaches where components of MIR systems are evaluated against predefined data sets and golden answers (i.e., ground truth). There are two major limitations of such system-centered evaluation approaches: (a) The evaluation focuses on subtasks in music information retrieval, but not on entire systems and (b) users and their interactions with MIR systems are largely excluded. This article describes the first implementation of a holistic user-experience evaluation in MIR, the MIREX Grand Challenge, where complete MIR systems are evaluated, with user experience being the single overarching goal. It is the first time that complete MIR systems have been evaluated with end users in a realistic scenario. We present the design of the evaluation task, the evaluation criteria and a novel evaluation interface, and the data-collection platform. This is followed by an analysis of the results, reflection on the experience and lessons learned, and plans for future directions.
AB  - Music Information Retrieval (MIR) evaluation has traditionally focused on system-centered approaches where components of MIR systems are evaluated against predefined data sets and golden answers (i.e., ground truth). There are two major limitations of such system-centered evaluation approaches: (a) The evaluation focuses on subtasks in music information retrieval, but not on entire systems and (b) users and their interactions with MIR systems are largely excluded. This article describes the first implementation of a holistic user-experience evaluation in MIR, the MIREX Grand Challenge, where complete MIR systems are evaluated, with user experience being the single overarching goal. It is the first time that complete MIR systems have been evaluated with end users in a realistic scenario. We present the design of the evaluation task, the evaluation criteria and a novel evaluation interface, and the data-collection platform. This is followed by an analysis of the results, reflection on the experience and lessons learned, and plans for future directions.
KW  - evaluation
KW  - information retrieval
KW  - music
UR  - http://www.scopus.com/inward/record.url?scp=85006836529&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85006836529&partnerID=8YFLogxK
U2  - 10.1002/asi.23618
DO  - 10.1002/asi.23618
M3  - Article
AN  - SCOPUS:85006836529
VL  - 68
SP  - 97
EP  - 112
JO  - Journal of the Association for Information Science and Technology
JF  - Journal of the Association for Information Science and Technology
SN  - 2330-1635
IS  - 1
ER  - 