Aggregation of multiple judgments for evaluating ordered lists

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many tasks (e.g., search and summarization) produce an ordered list of items. To evaluate such a list, we need to compare it with an ideal ordered list created by a human expert for the same set of items. To reduce bias, multiple human experts are often asked to create multiple ideal ordered lists. An interesting challenge in this evaluation setting is thus how to aggregate these different ideal lists to compute a single score for the ordered list being evaluated. In this paper, we propose three new methods for aggregating multiple order judgments to evaluate ordered lists: weighted correlation aggregation, rank-based aggregation, and frequent sequential pattern-based aggregation. Experimental results on ordering sentences for text summarization show that all three new methods outperform the state-of-the-art average correlation methods in terms of discriminativeness and robustness against noise. Among the three proposed methods, the frequent sequential pattern-based method performs best because it flexibly models agreements and disagreements among human experts at various levels of granularity.
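As a rough illustration of the average correlation baseline mentioned in the abstract, the sketch below scores a candidate ordering by taking the unweighted mean of its rank correlation with each expert's ideal ordering. The abstract does not say which correlation measure is used, so Kendall's tau is an assumption here, and the function names `kendall_tau` and `average_correlation` are hypothetical. This is not one of the three proposed methods, only a minimal picture of the kind of baseline they are compared against.

```python
from itertools import combinations
from typing import Hashable, Sequence


def kendall_tau(candidate: Sequence[Hashable], reference: Sequence[Hashable]) -> float:
    """Kendall's tau between two orderings of the same distinct items.

    Assumption: tau stands in for the (unspecified) correlation measure.
    """
    if len(set(candidate)) != len(candidate) or len(set(reference)) != len(reference):
        raise ValueError("orderings must not contain duplicate items")
    if set(candidate) != set(reference):
        raise ValueError("both lists must order the same set of items")
    if len(candidate) < 2:
        raise ValueError("need at least two items to compare orderings")

    # Position of every item in the reference ordering.
    ref_pos = {item: i for i, item in enumerate(reference)}

    concordant = discordant = 0
    # combinations() yields pairs (a, b) where a precedes b in the candidate.
    for a, b in combinations(candidate, 2):
        if ref_pos[a] < ref_pos[b]:
            concordant += 1  # the reference agrees on this pair's order
        else:
            discordant += 1  # the reference reverses this pair
    return (concordant - discordant) / (concordant + discordant)


def average_correlation(
    candidate: Sequence[Hashable], references: Sequence[Sequence[Hashable]]
) -> float:
    """Unweighted mean of tau against each expert's ideal ordering."""
    if not references:
        raise ValueError("need at least one reference ordering")
    return sum(kendall_tau(candidate, ref) for ref in references) / len(references)


# Hypothetical data: three experts order the same four sentences.
experts = [
    ["s1", "s2", "s3", "s4"],
    ["s1", "s3", "s2", "s4"],
    ["s2", "s1", "s3", "s4"],
]
system_output = ["s1", "s2", "s4", "s3"]
score = average_correlation(system_output, experts)
```

Because every expert contributes equally, a plain average of this kind treats points of strong expert agreement and points of disagreement the same way, which is the limitation the aggregation methods in the abstract aim to address.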

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 32nd European Conference on IR Research, ECIR 2010, Proceedings
Publisher: Springer
Pages: 166-178
Number of pages: 13
ISBN (Print): 3642122744, 9783642122743
State: Published - 2010
Event: 32nd European Conference on Information Retrieval, ECIR 2010 - Milton Keynes, United Kingdom
Duration: Mar 28 2010 to Mar 31 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5993 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 32nd European Conference on Information Retrieval, ECIR 2010
Country/Territory: United Kingdom
City: Milton Keynes
Period: 3/28/10 to 3/31/10

Keywords

  • Evaluation
  • Frequent sequential pattern mining
  • Judgment aggregation
  • Sentence ordering

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
