Abstract
Alternative lending is a vital source of credit for consumers underserved by traditional banks. This study examines how integrating additional data and advanced machine learning enhances default prediction in this sector. We merge loan records with credit bureau data and compare four variable sets: credit scores alone; loan-specific variables alone; a combination of credit scores and loan variables; and an integration of credit scores, loan variables and more than 300 credit bureau variables selected via least absolute shrinkage and selection operator (Lasso) regression. Our findings show that credit scores alone yield limited accuracy (with an area under the curve (AUC) of 0.6), while incorporating loan-specific features significantly improves performance. Further including selected credit bureau variables and tuning hyperparameters boosts predictive power, with a random forest model achieving an AUC of 0.854. Key predictors include credit scores, the loan amount, loan duration, months since the oldest trade, and recent credit inquiries. These results underscore the importance of comprehensive credit bureau data and rigorous model validation in alternative lending, offering practical insights for lenders and policy makers seeking to refine credit risk assessment.
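The AUC figures quoted above can be read as the probability that a randomly chosen defaulted loan receives a higher risk score than a randomly chosen repaid loan. The following minimal sketch illustrates that pairwise definition only; it is not the authors' code, and the function name `auc` and the toy labels and scores are hypothetical values invented for illustration, not data from the study.

```python
def auc(labels, scores):
    """Pairwise (rank-based) AUC: the share of (default, non-default) pairs
    in which the defaulted loan has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = default, 0 = repaid (not from the study).
defaults = [1, 0, 1, 0, 0, 1, 0, 0]
# A weakly informative score, standing in for a sparse variable set.
weak_scores = [0.6, 0.5, 0.4, 0.7, 0.3, 0.5, 0.2, 0.6]
# A more informative score, standing in for a richer variable set.
strong_scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8, 0.3, 0.6]

auc_weak = auc(defaults, weak_scores)
auc_strong = auc(defaults, strong_scores)
print(f"weak model AUC:   {auc_weak:.3f}")
print(f"strong model AUC: {auc_strong:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.6 and 0.854 should be compared.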
| Original language | English (US) |
|---|---|
| Pages (from-to) | 33-59 |
| Number of pages | 27 |
| Journal | Journal of Risk Model Validation |
| Volume | 19 |
| Issue number | 1 |
| Early online date | 2025 |
| State | Published - 2025 |
Keywords
- alternative lending
- credit risk model
- default prediction
- machine learning methods
- payday loans
- risk model validation
ASJC Scopus subject areas
- Modeling and Simulation
- Finance
- Economics and Econometrics
- Applied Mathematics