TY - GEN
T1 - Patton: Language Model Pretraining on Text-Rich Networks
T2 - 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
AU - Jin, Bowen
AU - Zhang, Wentao
AU - Zhang, Yu
AU - Meng, Yu
AU - Zhang, Xinyang
AU - Zhu, Qi
AU - Han, Jiawei
N1 - In this work, we introduce PATTON, a method to pretrain language models on text-rich networks. PATTON consists of two objectives: (1) a network-contextualized MLM pretraining objective and (2) a masked node prediction objective, to capture the rich semantic information hidden inside the complex network structure. We conduct experiments on four downstream tasks and five datasets from two different domains, where PATTON outperforms baselines significantly and consistently. Acknowledgments: We thank anonymous reviewers for their valuable and insightful feedback. Research was supported in part by US DARPA KAIROS Program No. FA8750-19-2-1004 and INCAS Program No. HR001121C0165, National Science Foundation IIS-19-56151, IIS-17-41317, and IIS-17-04532, and the Molecule Maker Lab Institute: An AI Research Institutes program supported by NSF under Award No. 2019897, and the Institute for Geospatial Understanding through an Integrative Discovery Environment (I-GUIDE) by NSF under Award No. 2118329. Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily represent the views, either expressed or implied, of DARPA or the U.S. Government. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing any funding agencies.
PY - 2023
Y1 - 2023
N2 - A real-world text corpus sometimes comprises not only text documents, but also semantic links between them (e.g., academic papers in a bibliographic network are linked by citations and co-authorships). Text documents and semantic connections form a text-rich network, which empowers a wide range of downstream tasks such as classification and retrieval. However, pretraining methods for such structures are still lacking, making it difficult to build one generic model that can be adapted to various tasks on text-rich networks. Current pretraining objectives, such as masked language modeling, purely model texts and do not take inter-document structure information into consideration. To this end, we propose our PretrAining on TexT-Rich NetwOrk framework PATTON. PATTON includes two pretraining strategies: network-contextualized masked language modeling and masked node prediction, to capture the inherent dependency between textual attributes and network structure. We conduct experiments on four downstream tasks in five datasets from both academic and e-commerce domains, where PATTON outperforms baselines significantly and consistently.
AB - A real-world text corpus sometimes comprises not only text documents, but also semantic links between them (e.g., academic papers in a bibliographic network are linked by citations and co-authorships). Text documents and semantic connections form a text-rich network, which empowers a wide range of downstream tasks such as classification and retrieval. However, pretraining methods for such structures are still lacking, making it difficult to build one generic model that can be adapted to various tasks on text-rich networks. Current pretraining objectives, such as masked language modeling, purely model texts and do not take inter-document structure information into consideration. To this end, we propose our PretrAining on TexT-Rich NetwOrk framework PATTON. PATTON includes two pretraining strategies: network-contextualized masked language modeling and masked node prediction, to capture the inherent dependency between textual attributes and network structure. We conduct experiments on four downstream tasks in five datasets from both academic and e-commerce domains, where PATTON outperforms baselines significantly and consistently.
UR - https://www.scopus.com/pages/publications/85165692330
U2 - 10.18653/v1/2023.acl-long.387
DO - 10.18653/v1/2023.acl-long.387
M3 - Conference contribution
AN - SCOPUS:85165692330
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 7005
EP - 7020
BT - Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
PB - Association for Computational Linguistics (ACL)
Y2 - 9 July 2023 through 14 July 2023
ER -
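
Note (editor's addition): the N1 note and abstract name PATTON's two pretraining objectives, network-contextualized masked language modeling and masked node prediction, but the record contains no implementation detail. Below is a minimal, hypothetical PyTorch sketch of how such a pair of losses could be combined; the toy model, the mean-pooled neighbor fusion, and every name in it (ToyNetworkLM, encode, the tensor sizes) are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN = 1000, 64  # toy sizes, not from the paper

class ToyNetworkLM(nn.Module):
    """Toy encoder that fuses a node's tokens with pooled neighbor text."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.mlm_head = nn.Linear(HIDDEN, VOCAB)  # predicts masked tokens

    def encode(self, tokens, neighbor_tokens):
        # tokens: (B, S) token ids; neighbor_tokens: (B, N, S) token ids
        h = self.embed(tokens)                          # (B, S, H)
        ctx = self.embed(neighbor_tokens).mean(dim=2)   # (B, N, H): pool over tokens
        ctx = ctx.mean(dim=1, keepdim=True)             # (B, 1, H): pool over neighbors
        return h + ctx                                  # network-contextualized states

    def forward(self, tokens, neighbor_tokens, mlm_labels):
        h = self.encode(tokens, neighbor_tokens)
        # (1) network-contextualized MLM: cross-entropy on masked positions only
        mlm_loss = F.cross_entropy(
            self.mlm_head(h).view(-1, VOCAB), mlm_labels.view(-1), ignore_index=-100)
        # (2) masked node prediction, rendered here as an in-batch contrastive task:
        # match each node's pooled representation against candidate neighbor texts
        node_repr = h.mean(dim=1)                                  # (B, H)
        cand_repr = self.embed(neighbor_tokens[:, 0]).mean(dim=1)  # (B, H)
        scores = node_repr @ cand_repr.t()                         # (B, B)
        node_loss = F.cross_entropy(scores, torch.arange(scores.size(0)))
        return mlm_loss + node_loss

# Usage on random toy data:
model = ToyNetworkLM()
tokens = torch.randint(0, VOCAB, (4, 16))        # 4 nodes, 16 tokens each
neighbors = torch.randint(0, VOCAB, (4, 3, 16))  # 3 neighbors per node
labels = tokens.clone()
labels[torch.rand(4, 16) >= 0.15] = -100         # supervise ~15% of positions
loss = model(tokens, neighbors, labels)
loss.backward()

The design choice worth noting in this sketch is that both objectives share one encoder, so gradients from the node-level contrastive loss shape the same representations used for token-level MLM; this mirrors the abstract's claim that the two strategies jointly capture the dependency between textual attributes and network structure.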