Quick Dense Retrievers Consume KALE: Post Training Kullback-Leibler Alignment of Embeddings for Asymmetrical dual encoders

Daniel Campos, Alessandro Magnani, ChengXiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we consider the problem of improving the inference latency of language-model-based dense retrieval systems by introducing structural compression and model-size asymmetry between the context and query encoders. First, we investigate the impact of pre- and post-training compression on the MSMARCO, Natural Questions, TriviaQA, SQuAD, and SciFact datasets, finding that asymmetry in the dual encoders of dense retrieval can lead to improved inference efficiency. Knowing this, we introduce Kullback-Leibler Alignment of Embeddings (KALE), an efficient and accurate method for increasing the inference efficiency of dense retrieval methods by pruning and aligning the query encoder after training. Specifically, KALE extends traditional knowledge distillation after bi-encoder training, allowing for effective query-encoder compression without full retraining or index regeneration. Using KALE and asymmetric training, we can generate models that exceed the performance of DistilBERT despite having 3x faster inference.
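The alignment objective sketched in the abstract can be illustrated with a toy example: the pruned query encoder's embedding is trained to match the frozen full-size query encoder's embedding under a KL-divergence loss over softmax-normalized vectors. This is a minimal pure-Python sketch; the paper's exact normalization, temperature, and direction of the divergence are assumptions here, not taken from the source.

```python
import math

def softmax(v):
    # Numerically stable softmax over an embedding vector.
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]

def kl_alignment_loss(student_emb, teacher_emb):
    """KL(teacher || student) over softmax-normalized embeddings.

    Toy stand-in for the KALE alignment objective: the teacher is the
    frozen, full-size query encoder; the student is the pruned query
    encoder being aligned post-training.
    """
    p = softmax(teacher_emb)
    q = softmax(student_emb)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [0.2, 1.5, -0.3, 0.8]
aligned = kl_alignment_loss(teacher, teacher)            # identical embeddings
drifted = kl_alignment_loss([0.0, 0.5, 0.1, 0.2], teacher)  # mismatched embeddings
```

Minimizing this loss pulls the compressed encoder's embedding distribution toward the original's without touching the context encoder or the document index, which is what makes post-training compression cheap: no re-indexing is needed.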

Original language: English (US)
Title of host publication: 4th Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023 - Proceedings of the Workshop
Editors: Nafise Sadat Moosavi, Iryna Gurevych, Yufang Hou, Gyuwan Kim, Young Jin Kim, Tal Schuster, Ameeta Agrawal
Publisher: Association for Computational Linguistics (ACL)
Pages: 59-77
Number of pages: 19
ISBN (Electronic): 9781959429791
State: Published - 2023
Externally published: Yes
Event: 4th Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023 - Toronto, Canada
Duration: Jul 13 2023 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 4th Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023
Country/Territory: Canada
City: Toronto
Period: 7/13/23 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
