Modeling task relationships in multi-task learning with multi-gate mixture-of-experts

Jiaqi Ma, Zhe Zhao, Xinyang Yi, Jilin Chen, Lichan Hong, Ed H. Chi

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Neural-based multi-task learning has been successfully used in many real-world large-scale applications such as recommendation systems. For example, in movie recommendations, beyond providing users movies which they tend to purchase and watch, the system might also optimize for users liking the movies afterwards. With multi-task learning, we aim to build a single model that learns these multiple goals and tasks simultaneously. However, the prediction quality of commonly used multi-task models is often sensitive to the relationships between tasks. It is therefore important to study the modeling tradeos between task-specic objectives and inter-task relationships. In this work, we propose a novel multi-task learning approach, Multi-gate Mixture-of-Experts (MMoE), which explicitly learns to model task relationships from data. We adapt the Mixture-of-Experts (MoE) structure to multi-task learning by sharing the expert submodels across all tasks, while also having a gating network trained to optimize each task. To validate our approach on data with dierent levels of task relatedness, we rst apply it to a synthetic dataset where we control the task relatedness. We show that the proposed approach performs better than baseline methods when the tasks are less related. We also show that the MMoE structure results in an additional trainability benet, depending on dierent levels of randomness in the training data and model initialization. Furthermore, we demonstrate the performance improvements by MMoE on real tasks including a binary classication benchmark, and a large-scale content recommendation system at Google.

Original languageEnglish (US)
Title of host publicationKDD 2018 - Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages1930-1939
Number of pages10
ISBN (Print)9781450355520
DOIs
StatePublished - Jul 19 2018
Externally publishedYes
Event24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018 - London, United Kingdom
Duration: Aug 19 2018Aug 23 2018

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018
Country/TerritoryUnited Kingdom
CityLondon
Period8/19/188/23/18

Keywords

  • Mixture of experts
  • Multi-task learning
  • Neural network
  • Recommendation system

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'Modeling task relationships in multi-task learning with multi-gate mixture-of-experts'. Together they form a unique fingerprint.

Cite this