On hybrid tree-based methods for short-term insurance claims

Zhiyu Quan, Zhiguo Wang, Guojun Gan, Emiliano A. Valdez

Research output: Contribution to journal › Article › peer-review


The two-part framework and the Tweedie generalized linear model (GLM) have traditionally been used to model loss costs for short-term insurance contracts. Most portfolios of insurance claims contain a large proportion of zero claims, and the resulting imbalance lowers the prediction accuracy of these traditional approaches. In this article, we propose, as an alternative, tree-based methods with a hybrid structure built on a two-step algorithm. The first step constructs a classification tree to build the probability model for claim frequency. The second step applies elastic net regression models at each terminal node of the classification tree to build the distribution models for claim severity. This hybrid structure captures the benefits of tuning hyperparameters at each step of the algorithm: tuning allows for improved prediction accuracy and can be performed to meet specific business objectives. A major advantage of this hybrid structure is improved model interpretability. We examine and compare the predictive performance of this hybrid structure relative to the traditional Tweedie GLM using both simulated and real datasets. Our empirical results show that these hybrid tree-based methods produce more accurate and informative predictions.
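The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's DecisionTreeClassifier and ElasticNet as stand-ins for the two steps, uses synthetic data, and combines the steps as claim probability times node-level severity, which is an assumption made here for illustration. The function names fit_hybrid and predict_pure_premium, and all hyperparameter values, are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeClassifier


def fit_hybrid(X, y, max_depth=3, min_samples_leaf=50, alpha=0.1, l1_ratio=0.5):
    """Hypothetical helper: fit a tree on claim occurrence, then one
    elastic net per terminal node on that node's positive claims."""
    has_claim = (y > 0).astype(int)

    # Step 1: classification tree for the probability of a non-zero claim.
    tree = DecisionTreeClassifier(
        max_depth=max_depth, min_samples_leaf=min_samples_leaf, random_state=0
    )
    tree.fit(X, has_claim)

    # Step 2: elastic net severity model inside each terminal node.
    leaf_ids = tree.apply(X)
    severity_models = {}
    for leaf in np.unique(leaf_ids):
        mask = (leaf_ids == leaf) & (y > 0)
        if mask.sum() >= 2:
            model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio)
            severity_models[leaf] = model.fit(X[mask], y[mask])
        else:
            # Too few positive claims in this node: no severity model.
            severity_models[leaf] = None
    return tree, severity_models


def predict_pure_premium(tree, severity_models, X):
    """Assumed combination rule: P(claim) times predicted severity."""
    claim_col = list(tree.classes_).index(1)
    p_claim = tree.predict_proba(X)[:, claim_col]

    leaf_ids = tree.apply(X)
    severity = np.zeros(X.shape[0])
    for leaf in np.unique(leaf_ids):
        rows = leaf_ids == leaf
        model = severity_models.get(leaf)
        if model is not None:
            # Clip at zero because a linear model can predict negative costs.
            severity[rows] = np.maximum(model.predict(X[rows]), 0.0)
    return p_claim * severity


# Synthetic portfolio (illustrative only): many zero claims, gamma severities.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))
p = 1.0 / (1.0 + np.exp(-(-1.5 + X[:, 0])))
claim = rng.random(n) < p
amount = rng.gamma(shape=2.0, scale=500.0 * np.exp(0.3 * X[:, 1]))
y = np.where(claim, amount, 0.0)

tree, severity_models = fit_hybrid(X, y)
pure_premium = predict_pure_premium(tree, severity_models, X)
```

In this sketch the tree hyperparameters (max_depth, min_samples_leaf) and the elastic net hyperparameters (alpha, l1_ratio) are exposed separately, which mirrors the idea of tuning each step on its own.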
Original language: English (US)
Pages (from-to): 597-620
Journal: Probability in the Engineering and Informational Sciences
Issue number: 2
State: Published - Apr 2023


  • Hyperparameter tuning
  • Tweedie generalized linear model
  • Tree-based models
  • Regularized regression
  • Pure premium

