
Kernel Multimodal Continuous Attention

  • Alexander Moreno
  • Zhenke Wu
  • Supriya Nagesh
  • Walter Dempsey
  • James M Rehg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Attention mechanisms average a data representation with respect to probability weights. Recently, [23-25] proposed continuous attention, focusing on unimodal exponential and deformed exponential family attention densities: the latter can have sparse support. [8] extended to multimodality via Gaussian mixture attention densities. In this paper, we propose using kernel exponential families [4] and our new sparse counterpart, kernel deformed exponential families. Theoretically, we show new existence results for both families, and approximation capabilities for the deformed case. Lacking closed form expressions for the context vector, we use numerical integration: we prove exponential convergence for both families. Experiments show that kernel continuous attention often outperforms unimodal continuous attention, and the sparse variant tends to highlight time series peaks.
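As a rough illustration of the numerical-integration step mentioned in the abstract, the sketch below computes a context vector under a kernel exponential family attention density. It is not the authors' implementation. The following are assumptions made only for illustration: a density of the form p(t) ∝ exp(f(t)) on [0, 1] with a uniform base measure, f written as a finite sum of Gaussian kernels, a composite trapezoidal rule as the quadrature, and the hypothetical names `gaussian_kernel`, `trapezoid`, `kernel_attention_context`, `alphas`, `centers`, and `value_fn`.

```python
import numpy as np


def gaussian_kernel(t, centers, bandwidth):
    """Gaussian kernel matrix k(t_i, c_j), shape (len(t), len(centers))."""
    diff = t[:, None] - centers[None, :]
    return np.exp(-0.5 * (diff / bandwidth) ** 2)


def trapezoid(y, x):
    """Composite trapezoidal rule along the first axis of y."""
    dx = np.diff(x)
    return np.tensordot(dx, 0.5 * (y[1:] + y[:-1]), axes=(0, 0))


def kernel_attention_context(value_fn, alphas, centers, bandwidth=0.1, n_grid=513):
    """Approximate c = integral of p(t) V(t) dt over [0, 1].

    Assumed (illustrative) density: p(t) proportional to exp(f(t)),
    with f(t) = sum_j alphas[j] * k(t, centers[j]).
    """
    t = np.linspace(0.0, 1.0, n_grid)
    f = gaussian_kernel(t, centers, bandwidth) @ alphas   # log-density up to a constant
    unnorm = np.exp(f - f.max())                          # subtract max for stability
    density = unnorm / trapezoid(unnorm, t)               # numerical normalization
    values = value_fn(t)                                  # shape (n_grid, d)
    context = trapezoid(density[:, None] * values, t)     # shape (d,)
    return t, density, context
```

With two equally weighted kernel centers, for example `alphas = np.array([2.0, 2.0])` and `centers = np.array([0.25, 0.75])`, the resulting density has two modes, which a single unimodal attention density cannot represent. The sparse (deformed) variant described in the abstract is not covered by this sketch.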

Original language: English (US)
Title of host publication: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Curran Associates Inc.
Pages: 18046-18059
Number of pages: 14
ISBN (Electronic): 9781713871088
State: Published - 2022
Externally published: Yes

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258
