Identifying Patients at Risk of Delay in Starting Cancer Treatment Using Multi-Level Machine Learning Models
Key Highlights:
Cancer is a leading cause of death worldwide, and timely treatment initiation is essential for improving clinical outcomes. However, socioeconomically disadvantaged patients and those living in low-resource neighborhoods often experience delays in starting treatment after diagnosis, which can worsen their clinical outcomes. To address this issue, researchers have developed multi-level machine learning models that combine clinical and demographic data of cancer patients with neighborhood-level data on social determinants of health to identify patients at increased risk of delay in starting cancer treatment.
A study published in JAMA Network Open compared the predictive performance of four machine learning models: group least absolute shrinkage and selection operator, Bayesian additive regression trees, gradient boosting, and random forest. The study included adult patients with breast, lung, colorectal, bladder, or kidney cancer who were diagnosed between 2013 and 2019 and subsequently treated at Fox Chase Cancer Center in Philadelphia. The models incorporated the interval between cancer diagnosis and first treatment; health and demographic characteristics, including race, ethnicity, laboratory findings, and comorbidities; and neighborhood-level health variables.
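The outcome these models predict can be derived from two dates per patient. The sketch below is a minimal illustration, not the study's code: it computes the interval between diagnosis and first treatment and flags a delay when that interval exceeds 60 days, the threshold the article reports. The function names and the two example patients are hypothetical.

```python
from datetime import date

# Threshold reported in the article: a delay is more than 60 days.
DELAY_THRESHOLD_DAYS = 60


def treatment_interval_days(diagnosed: date, first_treated: date) -> int:
    """Number of days from diagnosis to first treatment."""
    return (first_treated - diagnosed).days


def is_delayed(diagnosed: date, first_treated: date) -> bool:
    """True when treatment started more than 60 days after diagnosis."""
    return treatment_interval_days(diagnosed, first_treated) > DELAY_THRESHOLD_DAYS


# Hypothetical patients with made-up dates, for illustration only.
patients = [
    ("A", date(2015, 3, 1), date(2015, 3, 20)),
    ("B", date(2016, 1, 10), date(2016, 4, 1)),
]

# Binary label per patient: the target a classifier would be trained on.
labels = {pid: is_delayed(dx, tx) for pid, dx, tx in patients}
```

A patient treated exactly 60 days after diagnosis is not counted as delayed under this reading of "more than 60 days".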
The group least absolute shrinkage and selection operator (LASSO) model was selected as the optimal machine learning model based on its discrimination, calibration, and interpretability. The model predicted that patients were less likely to experience a delay if they were diagnosed at the treating center, had the index cancer as their first malignant neoplasm, were Asian or Pacific Islander or White, had private insurance, or had late-stage disease. In contrast, patients with certain comorbidities or elevated creatinine levels were more likely to experience a delay. Regarding neighborhood-level social determinants, the model predicted that patients living in the most socioeconomically deprived areas were more likely to experience a delay than those living in the least socioeconomically deprived areas.
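Discrimination and calibration can each be summarized with a single number. The article does not say which metrics the study used, so the sketch below assumes two common choices: the area under the ROC curve (AUC) for discrimination and the Brier score as a calibration-sensitive measure of how close predicted probabilities are to observed outcomes. The labels and probabilities are made up for illustration.

```python
def auc(y_true, y_prob):
    """Discrimination: chance that a randomly chosen delayed patient gets a
    higher predicted risk than a randomly chosen non-delayed patient.
    Ties count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and observed outcome.
    Lower is better; 0 means perfect probabilities."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up outcomes (1 = delayed) and predicted risks, for illustration only.
y_true = [1, 0, 1, 0, 0]
y_prob = [0.8, 0.3, 0.4, 0.5, 0.1]

model_auc = auc(y_true, y_prob)            # 5 of 6 pairs ranked correctly
model_brier = brier_score(y_true, y_prob)  # 0.75 / 5
```

When comparing candidate models, a higher AUC and a lower Brier score are preferable. Interpretability, the third criterion named above, is a qualitative judgment and is not captured by either number.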
The study results demonstrate that multi-level machine learning models can effectively identify cancer patients at greater risk of waiting more than 60 days after their initial cancer diagnosis to start treatment. However, the model showed lower predictive performance in vulnerable populations, such as Black patients and those residing in the most deprived areas. Future studies should include a higher proportion of patients from vulnerable populations and more relevant social variables to improve model performance.
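A performance gap between populations only becomes visible when a metric is computed separately for each subgroup rather than once for the whole cohort. The sketch below shows one way to do that, assuming AUC as the metric (the article does not name one). The group labels, outcomes, and predicted risks are synthetic and do not reproduce any result from the study.

```python
from collections import defaultdict


def pairwise_auc(y_true, y_prob):
    """Share of (delayed, non-delayed) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metric_by_group(groups, y_true, y_prob, metric):
    """Apply `metric` separately to the patients in each subgroup."""
    labels = defaultdict(list)
    probs = defaultdict(list)
    for g, y, p in zip(groups, y_true, y_prob):
        labels[g].append(y)
        probs[g].append(p)
    return {g: metric(labels[g], probs[g]) for g in labels}


# Synthetic data: two hypothetical subgroups with four patients each.
groups = ["group_a"] * 4 + ["group_b"] * 4
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.3, 0.7]

auc_per_group = metric_by_group(groups, y_true, y_prob, pairwise_auc)
```

In this toy data the model ranks every pair correctly in `group_a` but only one of four pairs in `group_b`, a gap that a single pooled AUC would hide. Small subgroups give unstable estimates, which is one reason a higher proportion of patients from underrepresented populations helps.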