C-PMI: Conditional Pointwise Mutual Information for Turn-level Dialogue Evaluation

Liliang Ren, Mankeerat Sidhu, Qi Zeng, Revanth Gangi Reddy, Heng Ji, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Existing reference-free turn-level evaluation metrics for chatbots inadequately capture the interaction between the user and the system. Consequently, they often correlate poorly with human evaluations. To address this issue, we propose a novel model-agnostic approach that leverages Conditional Pointwise Mutual Information (C-PMI) to measure the turn-level interaction between the system and the user based on a given evaluation dimension. Experimental results on the widely used FED dialogue evaluation dataset demonstrate that our approach significantly improves the correlation with human judgment compared with existing evaluation systems. By replacing the negative log-likelihood-based scorer with our proposed C-PMI scorer, we achieve a relative 60.5% higher Spearman correlation on average for the FED evaluation metric. Our code is publicly available at https://github.com/renll/C-PMI.
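
The abstract names conditional pointwise mutual information but does not state a formula. As a minimal sketch, the standard definition is PMI(x; y | z) = log p(x, y | z) - log p(x | z) - log p(y | z), which equals log p(x | y, z) - log p(x | z). The Python below illustrates only that textbook quantity. The function names are hypothetical, and how the paper maps x, y, and z onto the system response, the user turn, and the evaluation dimension is an assumption not confirmed by this record; the authors' repository linked above is the reference for the actual scorer.

```python
import math


def conditional_pmi(p_xy_given_z: float, p_x_given_z: float, p_y_given_z: float) -> float:
    """Textbook PMI(x; y | z) in nats, from three conditional probabilities.

    Hypothetical helper for illustration; not taken from the C-PMI codebase.
    """
    for p in (p_xy_given_z, p_x_given_z, p_y_given_z):
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
    return math.log(p_xy_given_z) - math.log(p_x_given_z) - math.log(p_y_given_z)


def conditional_pmi_from_logprobs(logp_x_given_yz: float, logp_x_given_z: float) -> float:
    """Equivalent form: log p(x | y, z) - log p(x | z).

    This form is convenient when a language model supplies log-likelihoods
    (an assumption about usage, not a description of the paper's pipeline).
    """
    return logp_x_given_yz - logp_x_given_z


if __name__ == "__main__":
    # x and y independent given z: the score is zero.
    print(conditional_pmi(0.25, 0.5, 0.5))
    # x becomes more likely once y is known: the score is positive.
    print(conditional_pmi_from_logprobs(math.log(0.8), math.log(0.5)))
```

A positive value means that conditioning on y makes x more likely than it is given z alone, zero means the two are conditionally independent, and a negative value means y makes x less likely.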

Original language: English (US)
Title of host publication: DialDoc 2023 - Proceedings of the 3rd DialDoc Workshop on Document-Grounded Dialogue and Conversational Question Answering, Proceedings of the Workshop
Editors: Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes
Publisher: Association for Computational Linguistics (ACL)
Pages: 80-85
Number of pages: 6
ISBN (Electronic): 9781959429982
State: Published - 2023
Event: 3rd Workshop on Document-grounded Dialogue and Conversational Question Answering, DialDoc 2023, co-located with ACL 2023 - Toronto, Canada
Duration: Jul 13 2023 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 3rd Workshop on Document-grounded Dialogue and Conversational Question Answering, DialDoc 2023, co-located with ACL 2023
Country/Territory: Canada
City: Toronto
Period: 7/13/23 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
