TY - GEN
T1 - “Slow Service” → “Great Food”
T2 - 15th International Natural Language Generation Conference, INLG 2022
AU - Zhu, Wanzheng
AU - Bhat, Suma
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
AB - Text style transfer aims to change the style (e.g., sentiment, politeness) of a sentence while preserving its content. A common solution is the prototype editing approach, where stylistic tokens are deleted in the “mask” stage and then the masked sentences are infilled with the target style tokens in the “infill” stage. Despite their success, these approaches still suffer from the content preservation problem. By closely inspecting the results of existing approaches, we identify two common types of errors: 1) many content-related tokens are masked and 2) irrelevant words associated with the target style are infilled. Our paper aims to enhance content preservation by tackling each of them. In the “mask” stage, we utilize a BERT-based keyword extraction model that incorporates syntactic information to prevent content-related tokens from being masked. In the “infill” stage, we create a pseudo-parallel dataset and train a T5 model to infill the masked sentences without introducing irrelevant content. Empirical results show that our method outperforms the state-of-the-art baselines in terms of content preservation, while maintaining comparable transfer effectiveness and language quality.
UR - http://www.scopus.com/inward/record.url?scp=85180362992&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85180362992&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85180362992
T3 - 15th International Natural Language Generation Conference, INLG 2022
SP - 29
EP - 39
BT - 15th International Natural Language Generation Conference, INLG 2022
A2 - Shaikh, Samira
A2 - Ferreira, Thiago Castro
A2 - Stent, Amanda
PB - Association for Computational Linguistics (ACL)
Y2 - 18 July 2022 through 22 July 2022
ER -