TY - GEN
T1 - Instruct, Not Assist
T2 - 2024 Findings of the Association for Computational Linguistics, EMNLP 2024
AU - Kargupta, Priyanka
AU - Agarwal, Ishika
AU - Hakkani-Tur, Dilek
AU - Han, Jiawei
N1 - This research project has benefited from the Microsoft Accelerate Foundation Models Research (AFMR) grant program, through which leading foundation models hosted by Microsoft Azure and access to Azure credits were provided to conduct the research. Furthermore, we would like to thank Mihir Kavishwar, Krish Agarwal, Sonia Agarwal, Nirav Diwan, and Shradha Sehgal for their help and feedback on our work.
PY - 2024
Y1 - 2024
N2 - Socratic questioning is an effective teaching strategy, encouraging critical thinking and problem-solving. The conversational capabilities of large language models (LLMs) show great potential for providing scalable, real-time student guidance. However, current LLMs often give away solutions directly, making them ineffective instructors. We tackle this issue in the code debugging domain with TreeInstruct, an Instructor agent guided by a novel state space-based planning algorithm. TreeInstruct asks probing questions to help students independently identify and resolve errors. It estimates a student's conceptual and syntactical knowledge to dynamically construct a question tree based on their responses and current knowledge state, effectively addressing both independent and dependent mistakes concurrently in a multi-turn interaction setting. In addition to using an existing single-bug debugging benchmark, we construct a more challenging multi-bug dataset of 150 coding problems, incorrect solutions, and bug fixes, all carefully constructed and annotated by experts. Extensive evaluation shows TreeInstruct's state-of-the-art performance on both datasets, proving it to be a more effective instructor than baselines. Furthermore, a real-world case study with five students of varying skill levels further demonstrates TreeInstruct's ability to guide students to debug their code efficiently with minimal turns and highly Socratic questioning.
AB - Socratic questioning is an effective teaching strategy, encouraging critical thinking and problem-solving. The conversational capabilities of large language models (LLMs) show great potential for providing scalable, real-time student guidance. However, current LLMs often give away solutions directly, making them ineffective instructors. We tackle this issue in the code debugging domain with TreeInstruct, an Instructor agent guided by a novel state space-based planning algorithm. TreeInstruct asks probing questions to help students independently identify and resolve errors. It estimates a student's conceptual and syntactical knowledge to dynamically construct a question tree based on their responses and current knowledge state, effectively addressing both independent and dependent mistakes concurrently in a multi-turn interaction setting. In addition to using an existing single-bug debugging benchmark, we construct a more challenging multi-bug dataset of 150 coding problems, incorrect solutions, and bug fixes, all carefully constructed and annotated by experts. Extensive evaluation shows TreeInstruct's state-of-the-art performance on both datasets, proving it to be a more effective instructor than baselines. Furthermore, a real-world case study with five students of varying skill levels further demonstrates TreeInstruct's ability to guide students to debug their code efficiently with minimal turns and highly Socratic questioning.
UR - http://www.scopus.com/inward/record.url?scp=85217621967&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85217621967&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.findings-emnlp.553
DO - 10.18653/v1/2024.findings-emnlp.553
M3 - Conference contribution
AN - SCOPUS:85217621967
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
SP - 9475
EP - 9495
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
Y2 - 12 November 2024 through 16 November 2024
ER -