Visual attention model for name tagging in multimodal social media

Di Lu, Leonardo Neves, Vitor Carvalho, Ning Zhang, Heng Ji

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Everyday billions of multimodal posts containing both images and text are shared in social media sites such as Snapchat, Twitter or Instagram. This combination of image and text in a single message allows for more creative and expressive forms of communication, and has become increasingly common in such sites. This new paradigm brings new challenges for natural language understanding, as the textual component tends to be shorter, more informal, and often is only understood if combined with the visual context. In this paper, we explore the task of name tagging in multimodal social media posts. We start by creating two new multimodal datasets: one based on Twitter posts1 and the other based on Snapchat captions (exclusively submitted to public and crowd-sourced stories). We then propose a novel model based on Visual Attention that not only provides deeper visual understanding on the decisions of the model, but also significantly outperforms other state-of-the-art baseline methods for this task2

Original languageEnglish (US)
Title of host publicationACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
PublisherAssociation for Computational Linguistics (ACL)
Pages1990-1999
Number of pages10
ISBN (Electronic)9781948087322
DOIs
StatePublished - 2018
Externally publishedYes
Event56th Annual Meeting of the Association for Computational Linguistics, ACL 2018 - Melbourne, Australia
Duration: Jul 15 2018Jul 20 2018

Publication series

NameACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Volume1

Conference

Conference56th Annual Meeting of the Association for Computational Linguistics, ACL 2018
CountryAustralia
CityMelbourne
Period7/15/187/20/18

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics

Fingerprint Dive into the research topics of 'Visual attention model for name tagging in multimodal social media'. Together they form a unique fingerprint.

Cite this