ML-AIM

Machine Learning and Artificial Intelligence for Medicine

Prognostication and Risk Factors for Cystic Fibrosis via Automated Machine Learning
Ahmed M. Alaa and Mihaela van der Schaar

An AutoML approach to lung transplant referral!

Accurate prediction of survival for cystic fibrosis (CF) patients is instrumental in establishing the optimal timing for referring patients with terminal respiratory failure for lung transplantation (LT). Current practice considers referring patients for LT evaluation once the forced expiratory volume in one second (FEV1) drops below 30% of its predicted nominal value. While FEV1 is indeed a strong predictor of CF-related mortality, we hypothesized that the survival behavior of CF patients is considerably more heterogeneous than a single FEV1 threshold can capture. To this end, we developed an algorithmic framework, which we call AutoPrognosis, that uses machine learning to automate the construction of clinical prognostic models, and used it to build a prognostic model for CF using data from a contemporary cohort that involved 99% of the CF population in the UK. AutoPrognosis uses Bayesian optimization techniques to automate the process of configuring ensembles of machine learning pipelines, which involve imputation, feature processing, classification and calibration algorithms. Because it is automated, clinical researchers can use it to build prognostic models without in-depth knowledge of machine learning. Our experiments revealed that the accuracy of the model learned by AutoPrognosis is superior to that of existing guidelines and other competing models.
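For reference, the FEV1-based referral criterion described above can be written as a one-line rule. The following is a minimal sketch in Python; the function name and argument names are ours and purely illustrative, and only the 30% threshold comes from the text.

```python
def meets_fev1_referral_criterion(fev1_percent_predicted: float,
                                  threshold: float = 30.0) -> bool:
    """Return True when FEV1, expressed as a percentage of its predicted
    nominal value, is below the referral threshold (30% in the text)."""
    if fev1_percent_predicted < 0:
        raise ValueError("FEV1 % predicted cannot be negative")
    return fev1_percent_predicted < threshold


# Illustrative inputs only, not patient data.
print(meets_fev1_referral_criterion(25.0))  # below the threshold
print(meets_fev1_referral_criterion(45.0))  # above the threshold
```

A rule of this form uses a single variable, which is why it cannot reflect interactions between FEV1 and other patient characteristics.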

Read our paper!

The AutoPrognosis system

The core component of AutoPrognosis is an algorithm that automatically configures machine learning (ML) pipelines, where every pipeline comprises algorithms for missing data imputation, feature preprocessing, prediction, and calibration. In total, AutoPrognosis has 106 hyperparameters. We use Bayesian optimization (BO) to configure the ML pipelines.
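To make the idea of a pipeline configuration space concrete, here is a minimal Python sketch. It is not the AutoPrognosis implementation: the four stage names come from the description above, but the algorithm lists, the function names, and the toy scoring function are hypothetical placeholders. The search loop uses plain random search as a simple stand-in for Bayesian optimization, which would instead use a surrogate model of past scores to choose the next configuration. Algorithm-level hyperparameters (the 106 mentioned above) are omitted.

```python
import random
from math import prod

# Stage names follow the text; the algorithm lists are placeholders (assumption).
SEARCH_SPACE = {
    "imputation": ["mean", "median", "most_frequent"],
    "feature_processing": ["none", "pca", "feature_selection"],
    "prediction": ["logistic_regression", "random_forest", "xgboost"],
    "calibration": ["none", "sigmoid", "isotonic"],
}


def count_pipelines(space):
    """Number of distinct pipelines: one algorithm choice per stage."""
    return prod(len(options) for options in space.values())


def sample_pipeline(space, rng):
    """Draw one pipeline by picking an algorithm uniformly for each stage."""
    return {stage: rng.choice(options) for stage, options in space.items()}


def toy_score(config):
    """Placeholder objective so the sketch runs (assumption). In practice this
    would be a cross-validated performance estimate of the fitted pipeline."""
    return 1.0 if config["prediction"] == "random_forest" else 0.0


def random_search(space, score_fn, n_trials, seed=0):
    """Random-search stand-in for BO: keep the best-scoring sampled pipeline."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_pipeline(space, rng)
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


print(count_pipelines(SEARCH_SPACE))
print(random_search(SEARCH_SPACE, toy_score, n_trials=20))
```

Even this small placeholder space contains 3 x 3 x 3 x 3 = 81 pipelines before any hyperparameters are tuned, which illustrates why an automated search is preferable to manual configuration.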


Improved performance and new risk factors!

We applied our general framework to the problem of predicting short-term survival of cystic fibrosis patients using data from the UK CF registry. AutoPrognosis was capable of learning an ensemble of machine learning models (including the well-known random forest and XGBoost algorithms) that outperformed existing risk scores developed in the clinical literature, mainstream practice guidelines, and naive implementations of vanilla machine learning models. We demonstrated the clinical utility of the prognostic model learned by AutoPrognosis by examining its potential impact on lung transplant referral decisions. Our analysis showed that the model learned by AutoPrognosis achieves significant gains across a wide variety of diagnostic accuracy metrics. Most notably, it achieves significant gains in positive predictive value, which implies a remarkable improvement in the precision of lung transplant referral decisions. The AutoPrognosis interpreter module revealed that the model achieves these gains because it recognizes the importance of variables that reflect disorders in pulmonary gas exchange (such as Oxygenation), and learns their interactions with spirometric biomarkers reflecting airway obstruction (such as FEV1). This gave rise to a precise survival prediction rule which separates patients who are truly at risk from those who do not necessarily need a transplant in the short term.
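The positive predictive value (PPV) mentioned above is the fraction of patients flagged as high risk who actually experience the outcome, PPV = TP / (TP + FP). A minimal Python sketch of the computation follows; the function name and the toy labels are ours, chosen only to show the arithmetic, and are not results from the study.

```python
def positive_predictive_value(y_true, y_pred):
    """PPV = TP / (TP + FP), where a positive prediction means
    'flagged as high risk' and a positive label means the outcome occurred."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if (not t) and p)
    if tp + fp == 0:
        raise ValueError("PPV is undefined when nothing is predicted positive")
    return tp / (tp + fp)


# Toy labels for illustration only (not study data):
# three patients flagged, two of whom had the outcome.
y_true = [1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0]
print(positive_predictive_value(y_true, y_pred))  # 2 / 3
```

In the referral setting, a higher PPV means that fewer patients who would have survived without a transplant in the short term are flagged for referral.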

Check out our online CF mortality risk calculator!