TY - JOUR
T1 - Improving the performance of machine learning algorithms for health outcomes predictions in multicentric cohorts
AU - IACOV-BR Network
AU - Wichmann, Roberta Moreira
AU - Fernandes, Fernando Timoteo
AU - Chiavegatto Filho, Alexandre Dias Porto
AU - Ciconelle, Ana Claudia Martins
AU - de Brito, Ana Maria Espírito Santo
AU - Nunes, Bruno Pereira
AU - Silva, Dárcia Lima e
AU - Anschau, Fernando
AU - de Castro Rodrigues, Henrique
AU - Rocha, Hermano Alexandre Lima
AU - dos Reis, João Conrado Bueno
AU - de Oliveira Cavalcante, Liane
AU - de Oliveira, Liszt Palmeira
AU - dos Santos Andrade, Lorena Sofia
AU - Nasi, Luiz Antonio
AU - de Maria Felix, Marcelo
AU - Mimica, Marcelo Jenne
AU - de Almeida Araujo, Maria Elizete
AU - Arnoni, Mariana Volpe
AU - Vianna, Rebeca Baiocchi
AU - Junior, Renan Magalhães Montenegro
AU - da Penha, Renata Vicente
AU - Vicente, Rogério Nadin
AU - de Lima, Ruchelli França
AU - Batista, Sandro Rodrigues
AU - Nunes, Silvia Ferreira
AU - de Macedo, Tássia Teles Santana
AU - Nuno, Valesca Lôbo e Sant’ana
N1 - This work was supported by the National Council for Scientific and Technological Development (CNPq) under Grant Number 402626/2020-6, and by Microsoft (Microsoft AI for Health COVID-19 Grant).
PY - 2023
Y1 - 2023
AB - Machine learning algorithms are being increasingly used in healthcare settings but their generalizability between different regions is still unknown. This study aims to identify the strategy that maximizes the predictive performance of identifying the risk of death by COVID-19 in different regions of a large and unequal country. This is a multicenter cohort study with data collected from patients with a positive RT-PCR test for COVID-19 from March to August 2020 (n = 8477) in 18 hospitals, covering all five Brazilian regions. Of all patients with a positive RT-PCR test during the period, 2356 (28%) died. Eight different strategies were used for training and evaluating the performance of three popular machine learning algorithms (extreme gradient boosting, lightGBM, and catboost). The strategies ranged from only using training data from a single hospital, up to aggregating patients by their geographic regions. The predictive performance of the algorithms was evaluated by the area under the ROC curve (AUROC) on the test set of each hospital. We found that the best overall predictive performances were obtained when using training data from the same hospital, which was the winning strategy for 11 (61%) of the 18 participating hospitals. In this study, the use of more patient data from other regions slightly decreased predictive performance. However, models trained in other hospitals still had acceptable performances and could be a solution while data for a specific hospital is being collected.
UR - http://www.scopus.com/inward/record.url?scp=85146569069&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146569069&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-26467-6
DO - 10.1038/s41598-022-26467-6
M3 - Article
C2 - 36658181
AN - SCOPUS:85146569069
SN - 2045-2322
VL - 13
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 1022
ER -