TY - GEN
T1 - Analogy Generation by Prompting Large Language Models
T2 - 15th International Natural Language Generation Conference, INLG 2022
AU - Bhavya, Bhavya
AU - Xiong, Jinjun
AU - Zhai, Cheng Xiang
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - We propose a novel application of prompting Pre-trained Language Models (PLMs) to generate analogies and study how to design effective prompts for two task settings: generating a source concept analogous to a given target concept (aka Analogous Concept Generation or ACG), and generating an explanation of the similarity between a given pair of target and source concepts (aka Analogous Explanation Generation or AEG). We found that it is feasible to prompt InstructGPT to generate meaningful analogies, and the best prompts tend to be precise imperative statements, especially with a low temperature setting. We also systematically analyzed the sensitivity of the InstructGPT model to prompt design, temperature, and injected spelling errors, and found that the model is particularly sensitive to certain variations (e.g., questions vs. imperative statements). Further, we conducted human evaluation on 1.4k of the generated analogies and found that the quality of generations varies substantially by model size. The largest InstructGPT model can achieve human-level performance at generating meaningful analogies for a given target, while there is still room for improvement on the AEG task.
AB - We propose a novel application of prompting Pre-trained Language Models (PLMs) to generate analogies and study how to design effective prompts for two task settings: generating a source concept analogous to a given target concept (aka Analogous Concept Generation or ACG), and generating an explanation of the similarity between a given pair of target and source concepts (aka Analogous Explanation Generation or AEG). We found that it is feasible to prompt InstructGPT to generate meaningful analogies, and the best prompts tend to be precise imperative statements, especially with a low temperature setting. We also systematically analyzed the sensitivity of the InstructGPT model to prompt design, temperature, and injected spelling errors, and found that the model is particularly sensitive to certain variations (e.g., questions vs. imperative statements). Further, we conducted human evaluation on 1.4k of the generated analogies and found that the quality of generations varies substantially by model size. The largest InstructGPT model can achieve human-level performance at generating meaningful analogies for a given target, while there is still room for improvement on the AEG task.
UR - http://www.scopus.com/inward/record.url?scp=85151290089&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85151290089&partnerID=8YFLogxK
U2 - 10.18653/v1/2022.inlg-main.25
DO - 10.18653/v1/2022.inlg-main.25
M3 - Conference contribution
AN - SCOPUS:85151290089
T3 - 15th International Natural Language Generation Conference, INLG 2022
SP - 298
EP - 312
BT - 15th International Natural Language Generation Conference, INLG 2022
A2 - Shaikh, Samira
A2 - Ferreira, Thiago Castro
A2 - Stent, Amanda
PB - Association for Computational Linguistics (ACL)
Y2 - 18 July 2022 through 22 July 2022
ER -