TY - GEN
T1 - WebWISE
T2 - 2024 Findings of the Association for Computational Linguistics: NAACL 2024
AU - Tao, Heyi
AU - Sethuraman, T. V.
AU - Shlapentokh-Rothman, Michal
AU - Gupta, Tanmay
AU - Ji, Heng
AU - Hoiem, Derek
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - This paper investigates using Large Language Models (LLMs) to automatically perform web software tasks using click, scroll, and text input operations. Previous approaches, such as reinforcement learning (RL) or imitation learning, are inefficient to train and task-specific. Our method uses filtered Document Object Model (DOM) elements as observations and performs tasks step-by-step, sequentially generating small programs based on the current observations. We use in-context learning, either benefiting from a single manually provided example, or an automatically generated example based on a successful zero-shot trial. We evaluate our proposed method on the MiniWob++ benchmark. With only one in-context example, our WebWISE method using gpt-3.5-turbo achieves similar or better performance than other methods that require many demonstrations or trials.
UR - http://www.scopus.com/inward/record.url?scp=85197938126&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85197938126&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85197938126
T3 - Findings of the Association for Computational Linguistics: NAACL 2024 - Findings
SP - 3693
EP - 3711
BT - Findings of the Association for Computational Linguistics
A2 - Duh, Kevin
A2 - Gomez, Helena
A2 - Bethard, Steven
PB - Association for Computational Linguistics (ACL)
Y2 - 16 June 2024 through 21 June 2024
ER -