Geometry of polysemy

Research output: Contribution to conference › Paper

Abstract

Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec. However, a single vector does not suffice to model the polysemous nature of many (frequent) words, i.e., words with multiple meanings. In this paper, we propose a three-fold approach for unsupervised polysemy modeling: (a) context representations, (b) sense induction and disambiguation, and (c) lexeme (a word and sense pair) representations. A key feature of our work is the finding that a sentence containing a target word is well represented by a low-rank subspace, instead of a point in a vector space. We then show that the subspaces associated with a particular sense of the target word tend to intersect over a line (a one-dimensional subspace), which we use to disambiguate senses via a clustering algorithm that harnesses the Grassmannian geometry of the representations. The disambiguation algorithm, which we call K-Grassmeans, leads to a procedure to label the different senses of the target word in the corpus, yielding lexeme vector representations, all in an unsupervised manner starting from a large (Wikipedia) corpus in English. Apart from several prototypical target (word, sense) examples and a host of empirical studies to build intuition for and justify the various geometric representations, we validate our algorithms on standard sense induction and disambiguation datasets and present new state-of-the-art results.
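To make the abstract's two geometric ingredients concrete, below is a minimal numpy sketch, not the authors' released implementation: it represents a sentence by the span of the top principal directions of its word vectors (the low-rank context subspace) and then runs a K-Grassmeans-style loop that clusters those subspaces around K unit "sense" directions. The function names (context_subspace, k_grassmeans), the rank of 3, the iteration count, and the random toy data are all illustrative assumptions.

import numpy as np

# Minimal sketch (not the authors' implementation) of the two geometric
# steps described in the abstract: (1) a sentence as a low-rank subspace
# of its word vectors, and (2) K-Grassmeans clustering of those subspaces
# around one-dimensional "sense" directions.

def context_subspace(word_vectors, rank=3):
    """Orthonormal basis (d x rank) spanning the top principal
    directions of a sentence's word vectors; rank=3 is an
    illustrative choice."""
    X = np.asarray(word_vectors, dtype=float)      # (num_words, d)
    # Left singular vectors of X^T are the principal directions.
    U, _, _ = np.linalg.svd(X.T, full_matrices=False)
    return U[:, :rank]

def k_grassmeans(bases, k, iters=20, seed=0):
    """Cluster subspaces (given as orthonormal bases) around k unit
    vectors.  For unit u and basis B, d(u, span(B))^2 = 1 - ||B^T u||^2,
    so the nearest sense direction is the one maximizing ||B^T u||."""
    rng = np.random.default_rng(seed)
    d = bases[0].shape[0]
    centers = rng.normal(size=(k, d))
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)

    def assign():
        return np.array([np.argmax([np.linalg.norm(B.T @ u) for u in centers])
                         for B in bases])

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [B for B, lab in zip(bases, labels) if lab == j]
            if not members:
                continue                           # empty cluster: keep old center
            # The unit u minimizing the summed squared distances to the
            # cluster's subspaces is the leading eigenvector of the sum
            # of their projection matrices B B^T.
            P = sum(B @ B.T for B in members)
            _, vecs = np.linalg.eigh(P)            # eigenvalues in ascending order
            centers[j] = vecs[:, -1]
    return centers, assign()

# Toy usage with random stand-ins for word vectors (illustrative only):
rng = np.random.default_rng(1)
sentences = [rng.normal(size=(12, 50)) for _ in range(200)]   # 200 contexts in R^50
bases = [context_subspace(s) for s in sentences]
directions, sense_labels = k_grassmeans(bases, k=2)

As the abstract indicates, once every occurrence of the target word carries a sense label, lexeme vectors can then be obtained by treating each (word, sense) pair as a distinct token in an otherwise standard word-embedding pipeline.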

Original language: English (US)
State: Published - Jan 1 2019
Event: 5th International Conference on Learning Representations, ICLR 2017 - Toulon, France
Duration: Apr 24 2017 – Apr 26 2017

Conference

Conference: 5th International Conference on Learning Representations, ICLR 2017
Country: France
City: Toulon
Period: 4/24/17 – 4/26/17

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Cite this

Mu, J., Bhat, S. P., & Viswanath, P. (2019). Geometry of polysemy. Paper presented at 5th International Conference on Learning Representations, ICLR 2017, Toulon, France.
