Discovering hypernymy in text-rich heterogeneous information network by exploiting context granularity

Yu Shi, Jiaming Shen, Yuchen Li, Naijing Zhang, Xinwei He, Zhengzhi Lou, Qi Zhu, Matthew Walker, Myunghwan Kim, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Text-rich heterogeneous information networks (text-rich HINs) are ubiquitous in real-world applications. Hypernymy, also known as is-a relation or subclass-of relation, lies at the core of many knowledge graphs and benefits many downstream applications. Existing methods of hypernymy discovery either leverage textual patterns to extract explicitly mentioned hypernym-hyponym pairs, or learn a distributional representation for each term of interest based on its context. These approaches rely on statistical signals from the textual corpus, and their effectiveness would therefore be hindered when the signals from the corpus are not sufficient for all terms of interest. In this work, we propose to discover hypernymy in text-rich HINs, which can introduce additional high-quality signals. We develop a new framework, named HyperMine, that exploits multi-granular contexts and combines signals from both text and network without human labeled data. HyperMine extends the definition of “context” to the scenario of text-rich HIN. For example, we can define typed nodes and communities as contexts. These contexts encode signals of different granularities and we feed them into a hypernymy inference model. HyperMine learns this model using weak supervision acquired based on high-precision textual patterns. Extensive experiments on two large real-world datasets demonstrate the effectiveness of HyperMine and the utility of modeling context granularity. We further show a case study that a high-quality taxonomy can be generated solely based on the hypernymy discovered by HyperMine.

Original language: English (US)
Title of host publication: CIKM 2019 - Proceedings of the 28th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 599-608
Number of pages: 10
ISBN (Electronic): 9781450369763
State: Published - Nov 3 2019
Event: 28th ACM International Conference on Information and Knowledge Management, CIKM 2019 - Beijing, China
Duration: Nov 3 2019 to Nov 7 2019

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 28th ACM International Conference on Information and Knowledge Management, CIKM 2019
Country/Territory: China
City: Beijing
Period: 11/3/19 to 11/7/19

Keywords

  • Distributional Inclusion Hypothesis
  • Heterogeneous Information Network
  • Hypernymy Discovery
  • Text-rich Network

ASJC Scopus subject areas

  • General Business, Management and Accounting
  • General Decision Sciences
