TY - GEN
T1 - To Asymmetry and Beyond
T2 - 4th Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023
AU - Campos, Daniel
AU - Zhai, ChengXiang
N1 - Publisher Copyright:
© 2023 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Sequence-to-sequence language models can be used to produce abstractive summaries which are coherent, relevant, and concise. Still, model sizes can make deployment in latency-sensitive or web-scale implementations difficult. This paper studies the relationship between model size, structured pruning, inference efficiency, and summarization accuracy on widely used summarization datasets. We show that model accuracy is tied to the encoder size while inference efficiency is connected to the decoder. Using asymmetric pruning can lead to nearly 3x improvement in inference latency with a 1 point loss in ROUGE-2. Moreover, we find both the average degradation and the role of asymmetry to be consistent across model sizes and variations in datasets. We release our code, training regimes, and associated models to encourage broad usage and experimentation.
AB - Sequence-to-sequence language models can be used to produce abstractive summaries which are coherent, relevant, and concise. Still, model sizes can make deployment in latency-sensitive or web-scale implementations difficult. This paper studies the relationship between model size, structured pruning, inference efficiency, and summarization accuracy on widely used summarization datasets. We show that model accuracy is tied to the encoder size while inference efficiency is connected to the decoder. Using asymmetric pruning can lead to nearly 3x improvement in inference latency with a 1 point loss in ROUGE-2. Moreover, we find both the average degradation and the role of asymmetry to be consistent across model sizes and variations in datasets. We release our code, training regimes, and associated models to encourage broad usage and experimentation.
UR - http://www.scopus.com/inward/record.url?scp=85175789213&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85175789213&partnerID=8YFLogxK
U2 - 10.18653/v1/2023.sustainlp-1.6
DO - 10.18653/v1/2023.sustainlp-1.6
M3 - Conference contribution
AN - SCOPUS:85175789213
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 91
EP - 109
BT - 4th Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023 - Proceedings of the Workshop
A2 - Moosavi, Nafise Sadat
A2 - Gurevych, Iryna
A2 - Hou, Yufang
A2 - Kim, Gyuwan
A2 - Kim, Young Jin
A2 - Schuster, Tal
A2 - Agrawal, Ameeta
PB - Association for Computational Linguistics (ACL)
Y2 - 13 July 2023
ER -