TY - CONF
T1 - Unsupervised Multi-Granularity Summarization
AU - Zhong, Ming
AU - Liu, Yang
AU - Ge, Suyu
AU - Mao, Yuning
AU - Jiao, Yizhu
AU - Zhang, Xingxing
AU - Xu, Yichong
AU - Zhu, Chenguang
AU - Zeng, Michael
AU - Han, Jiawei
N1 - We thank Wen Xiao for providing the output of PRIMERA. We would also like to thank anonymous reviewers for valuable comments and suggestions. Research was supported in part by US DARPA KAIROS Program No. FA8750-19-2-1004 and INCAS Program No. HR001121C0165, National Science Foundation IIS-19-56151, IIS-17-41317, and IIS 17-04532, and the Molecule Maker Lab Institute: An AI Research Institutes program supported by NSF under Award No. 2019897, and the Institute for Geospatial Understanding through an Integrative Discovery Environment (I-GUIDE) by NSF under Award No. 2118329. Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily represent the views, either expressed or implied, of DARPA or the U.S. Government. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing any funding agencies.
PY - 2022
Y1 - 2022
N2 - Text summarization is a user-preference-based task: for a single document, users often have different priorities for the summary. As a key aspect of customization in summarization, granularity is used to measure the semantic coverage between the summary and the source document. However, developing systems that can generate summaries with customizable semantic coverage is still an under-explored topic. In this paper, we propose the first unsupervised multi-granularity summarization framework, GRANUSUM. We take events as the basic semantic units of the source documents and propose to rank these events by their salience. We also develop a model to summarize input documents with given events as anchors and hints. By inputting different numbers of events, GRANUSUM is capable of producing multi-granular summaries in an unsupervised manner. Meanwhile, we annotate a new benchmark, GranuDUC, that contains multiple summaries at different granularities for each document cluster. Experimental results confirm the substantial superiority of GRANUSUM on multi-granularity summarization over strong baselines. Furthermore, by exploiting the event information, GRANUSUM also exhibits state-of-the-art performance under the conventional unsupervised abstractive setting.
UR - http://www.scopus.com/inward/record.url?scp=85149835289&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149835289&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85149835289
SP - 5009
EP - 5024
T2 - 2022 Findings of the Association for Computational Linguistics: EMNLP 2022
Y2 - 7 December 2022 through 11 December 2022
ER -