TY - GEN
T1 - Multi-view Graph-Based Text Representations for Imbalanced Classification
AU - Karajeh, Ola
AU - Lourentzou, Ismini
AU - Fox, Edward A.
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Text classification is a fundamental task in natural language processing, notably in the context of digital libraries, where it is essential for organizing and retrieving large numbers of documents in diverse collections, especially when tackling issues with inherent class imbalance. Sequence-based models can successfully capture semantics in local consecutive text sequences. On the other hand, graph-based models can preserve global co-occurrences that capture non-consecutive and long-distance semantics. A text representation approach that combines local and global information can enhance performance in practical class imbalance text classification scenarios. Yet, multi-view graph-based text representations have received limited attention. In this work, we introduce Multi-view Minority Class Text Graph Convolutional Network (MMCT-GCN), a transductive multi-view text classification model that captures textual graph representations for the minority class, along with sequence-based text representations. Experiments show that MMCT-GCN variants outperform baseline models on multiple text collections.
KW - Graph Convolutional Networks
KW - Imbalanced Data
KW - Text Classification
UR - http://www.scopus.com/inward/record.url?scp=85174595082&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174595082&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-43849-3_22
DO - 10.1007/978-3-031-43849-3_22
M3 - Conference contribution
AN - SCOPUS:85174595082
SN - 9783031438486
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 249
EP - 264
BT - Linking Theory and Practice of Digital Libraries - 27th International Conference on Theory and Practice of Digital Libraries, TPDL 2023, Proceedings
A2 - Alonso, Omar
A2 - Cousijn, Helena
A2 - Silvello, Gianmaria
A2 - Marchesin, Stefano
A2 - Marrero, Mónica
A2 - Teixeira Lopes, Carla
PB - Springer
T2 - 27th International Conference on Theory and Practice of Digital Libraries, TPDL 2023
Y2 - 26 September 2023 through 29 September 2023
ER -