Hypergraph of Text: A Mathematical Structure for Organizing and Analyzing Big Text Data

Dean E. Alvarez, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Since the collective knowledge of our world is primarily encoded in massive amounts of text data, people rely on text data to access all kinds of useful knowledge. However, how to organize, navigate, and analyze large amounts of text data remains a difficult open challenge. To address this challenge, we propose the Hypergraph of Text (HoT), a mathematical structure for organizing and analyzing big text data. We discuss how to create a HoT from large text collections and describe various applications of HoT. Experimentally, we show the promise of HoT by creating a HoT on a subset of Wikipedia pages covering topics in philosophy. Experimental results show that the structure created by HoT has many uses, such as facilitating information access by enabling flexible corpus navigation and discovering interesting topical structures.
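The abstract does not spell out how a HoT is constructed, so the following is only a generic illustration of what a hypergraph over text can look like, not the authors' method. It assumes, purely for the sake of example, that nodes are documents and that each term shared by at least two documents forms one hyperedge. The function names `build_hypergraph` and `neighbors`, the `min_size` parameter, and the toy documents are all hypothetical.

```python
from collections import defaultdict


def build_hypergraph(docs, min_size=2):
    """Return {term: set of doc ids}, one hyperedge per shared term.

    Assumption for illustration only: a hyperedge is the set of all
    documents containing a given term, kept if it has >= min_size members.
    """
    edges = defaultdict(set)
    for doc_id, text in docs.items():
        # Use the set of lowercase whitespace tokens as the document's terms.
        for term in set(text.lower().split()):
            edges[term].add(doc_id)
    return {term: ids for term, ids in edges.items() if len(ids) >= min_size}


def neighbors(hypergraph, doc_id):
    """Documents that share at least one hyperedge with doc_id."""
    found = set()
    for ids in hypergraph.values():
        if doc_id in ids:
            found |= ids
    found.discard(doc_id)
    return found


# Toy corpus invented for this sketch (not data from the paper).
docs = {
    "plato": "forms knowledge virtue",
    "aristotle": "virtue logic substance",
    "kant": "knowledge duty reason",
}

hg = build_hypergraph(docs)
print(hg)
print(neighbors(hg, "plato"))
```

Unlike an ordinary graph edge, a hyperedge can connect any number of documents at once, which is what makes this kind of structure a natural fit for navigating a corpus by shared topics.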

Original language: English (US)
Title of host publication: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
Editors: Wei Ding, Chang-Tien Lu, Fusheng Wang, Liping Di, Kesheng Wu, Jun Huan, Raghu Nambiar, Jundong Li, Filip Ilievski, Ricardo Baeza-Yates, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8605-8607
Number of pages: 3
ISBN (Electronic): 9798350362480
State: Published - 2024
Event: 2024 IEEE International Conference on Big Data, BigData 2024 - Washington, United States
Duration: Dec 15 2024 to Dec 18 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024

Conference

Conference: 2024 IEEE International Conference on Big Data, BigData 2024
Country/Territory: United States
City: Washington
Period: 12/15/24 to 12/18/24

Keywords

  • Big Text Data
  • Hypergraph
  • Text Organization
  • Topic Analysis

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Modeling and Simulation
