TY - GEN
T1 - MArBLE
T2 - 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023
AU - Wahed, Muntasir
AU - Gruhl, Daniel
AU - Lourentzou, Ismini
N1 - This material is based upon work supported by the National Science Foundation under Grant No. 2208864.
PY - 2023/10/21
Y1 - 2023/10/21
N2 - The modern-day research community has an embarrassment of riches regarding pre-trained AI models. Even for a simple task such as lexicon set expansion, where an AI model suggests new entities to add to a predefined seed set of entities, thousands of models are available. However, deciding which model to use for a given set expansion task is non-trivial. In hindsight, some models can be 'off topic' for specific set expansion tasks, while others might work well initially but quickly exhaust what they have to offer. Additionally, certain models may require more careful priming in the form of samples or feedback before being fine-tuned to the task at hand. In this work, we frame this model selection as a sequential non-stationary problem, where there exist a large number of diverse pre-trained models that may or may not fit a task at hand, and an expert is shown one suggestion at a time to include in the set or not, i.e., accept or reject the suggestion. The goal is to expand the list with the most entities as quickly as possible. We introduce MArBLE, a hierarchical multi-armed bandit method for this task, and two strategies designed to address cold-start problems. Experimental results on three set expansion tasks demonstrate MArBLE's effectiveness compared to baselines.
AB - The modern-day research community has an embarrassment of riches regarding pre-trained AI models. Even for a simple task such as lexicon set expansion, where an AI model suggests new entities to add to a predefined seed set of entities, thousands of models are available. However, deciding which model to use for a given set expansion task is non-trivial. In hindsight, some models can be 'off topic' for specific set expansion tasks, while others might work well initially but quickly exhaust what they have to offer. Additionally, certain models may require more careful priming in the form of samples or feedback before being fine-tuned to the task at hand. In this work, we frame this model selection as a sequential non-stationary problem, where there exist a large number of diverse pre-trained models that may or may not fit a task at hand, and an expert is shown one suggestion at a time to include in the set or not, i.e., accept or reject the suggestion. The goal is to expand the list with the most entities as quickly as possible. We introduce MArBLE, a hierarchical multi-armed bandit method for this task, and two strategies designed to address cold-start problems. Experimental results on three set expansion tasks demonstrate MArBLE's effectiveness compared to baselines.
KW - Cold-start Problems
KW - Entity Set Expansion
KW - Hierarchical Multi-Armed Bandits
KW - Human-in-the-Loop Set Expansion
KW - Model Selection
UR - http://www.scopus.com/inward/record.url?scp=85178123678&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85178123678&partnerID=8YFLogxK
U2 - 10.1145/3583780.3615485
DO - 10.1145/3583780.3615485
M3 - Conference contribution
AN - SCOPUS:85178123678
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 4857
EP - 4863
BT - CIKM 2023 - Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 21 October 2023 through 25 October 2023
ER -