TY - JOUR
T1 - Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
AU - Zhou, Andy
AU - Yan, Kai
AU - Shlapentokh-Rothman, Michal
AU - Wang, Haohan
AU - Wang, Yu-Xiong
N1 - We thank Daniel Campos for useful feedback on earlier versions of this paper. This work was supported in part by NSF Grant 2106825, NIFA Award 2020-67021-32799, the Jump ARCHES endowment through the Health Care Engineering Systems Center at Illinois and the OSF Foundation, and the IBM-Illinois Discovery Accelerator Institute. This work used NVIDIA GPUs at NCSA Delta through allocations CIS220014, CIS230012, and CIS230218 from the ACCESS program.
PY - 2024
Y1 - 2024
AB - While language models (LMs) have shown potential across a range of decision-making tasks, their reliance on simple acting processes limits their broad deployment as autonomous agents. In this paper, we introduce Language Agent Tree Search (LATS) - the first general framework that synergizes the capabilities of LMs in reasoning, acting, and planning. By leveraging the in-context learning ability of LMs, we integrate Monte Carlo Tree Search into LATS to enable LMs as agents, along with LM-powered value functions and self-reflections for proficient exploration and enhanced decision-making. A key feature of our approach is the incorporation of an environment for external feedback, which offers a more deliberate and adaptive problem-solving mechanism that surpasses the constraints of existing techniques. Our experimental evaluation across diverse domains, including programming, interactive question-answering (QA), web navigation, and math, validates the effectiveness and generality of LATS in decision-making while maintaining competitive or improved reasoning performance. Notably, LATS achieves state-of-the-art pass@1 accuracy (92.7%) for programming on HumanEval with GPT-4 and demonstrates gradient-free performance (average score of 75.9) comparable to gradient-based fine-tuning for web navigation on WebShop with GPT-3.5. Code can be found at https://github.com/lapisrocks/LanguageAgentTreeSearch.
UR - http://www.scopus.com/inward/record.url?scp=85203841699&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85203841699&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85203841699
SN - 2640-3498
VL - 235
SP - 62138
EP - 62160
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 41st International Conference on Machine Learning, ICML 2024
Y2 - 21 July 2024 through 27 July 2024
ER -