Title: Developing Machine Learning Models to Predict Multi-Class Functional Outcomes and Death 3 months after Stroke in Sweden
Importance: Globally, stroke is the third-leading cause of mortality and disability combined. Because post-stroke disability causes personal suffering and imposes an economic burden on the community, accurate prediction of outcomes could guide continued care and rehabilitation planning. Machine learning algorithms have recently shown potential in predicting outcomes after stroke.
Objective: To develop three supervised machine learning algorithms and compare their performance with that of the traditional logistic regression model in predicting disability and death 3 months after stroke, based on the modified Rankin Scale (mRS) and using routinely collected data. A secondary aim was to explore the explainability of these algorithms by identifying the most important variables and how they contribute to the prediction.
Design, Setting, Participants: This prognostic study used data from the Swedish Stroke Register (Riksstroke), which contains information on the entire chain of acute care for patients admitted to all 72 hospitals caring for stroke patients in Sweden. Patient-reported outcomes, including functional status, are collected by questionnaire 3 months after stroke. Data on 102,135 adult patients, recorded between January 2015 and December 2020, were included in the analyses.
Exposures: Prognostic factors (features) comprised, among others, age, sex, cardiovascular risk factors, medications, mRS prior to stroke, National Institutes of Health Stroke Scale (NIHSS) score, and type of stroke. Missing NIHSS values were imputed using multiple imputation by chained equations (MICE), and a separate category was created for missing values in other features. To improve the models' predictive performance, numeric features were scaled with min-max normalization and categorical features were one-hot encoded.
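The two simpler preprocessing steps described above (min-max scaling, and one-hot encoding with a separate category for missing values) can be sketched as follows. This is a minimal illustration in plain Python, not the study's pipeline: the records, field names, and helper functions are hypothetical, and the MICE imputation of NIHSS is not shown.

```python
# Toy records with hypothetical field names; these are NOT Riksstroke data.
records = [
    {"age": 62, "smoker": "yes"},
    {"age": 80, "smoker": None},   # missing value in a categorical feature
    {"age": 98, "smoker": "no"},
]


def min_max_scale(values):
    """Rescale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values, missing_label="missing"):
    """One-hot encode categorical values, treating None as its own category."""
    filled = [missing_label if v is None else v for v in values]
    categories = sorted(set(filled))
    rows = [[1 if v == c else 0 for c in categories] for v in filled]
    return categories, rows


age_scaled = min_max_scale([r["age"] for r in records])
smoker_categories, smoker_encoded = one_hot([r["smoker"] for r in records])

print(age_scaled)          # scaled ages in [0, 1]
print(smoker_categories)   # column order of the one-hot matrix
print(smoker_encoded)      # one indicator row per record
```

In practice the scaling range and category list would be learned from the training split only and then applied to the test split, so that no information leaks from the test data.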
Main Outcomes and Measures: The main outcome for prediction was mRS measured 3 months after stroke, categorized into 3 levels (0-2 independent, 3-5 dependent, and 6 dead). Classifiers included support vector machines (SVM), artificial neural networks (ANN), eXtreme Gradient Boosting (XGBoost), and logistic regression (LR). The classifiers were trained on 75% of the dataset and tested on the remaining 25%. Their predictive performance was assessed and compared using accuracy, Matthews correlation coefficient (MCC), Cohen's kappa coefficient, F1 score, and area under the receiver operating characteristic curve (AUC-ROC). Lastly, the predictions were explained using SHapley Additive exPlanations (SHAP) values.
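To make the outcome definition and several of the listed metrics concrete, the sketch below maps an mRS score to the three outcome levels and computes accuracy, Cohen's kappa, multi-class MCC, and F1 from a confusion matrix. It is an illustration under assumptions: the function names and the toy confusion matrix are invented, macro-averaging of F1 is assumed (the abstract does not state the averaging scheme), and AUC-ROC is omitted because it requires predicted probabilities rather than class labels.

```python
from math import sqrt


def mrs_to_class(mrs):
    """Map an mRS score (0-6) to the three outcome levels used for prediction."""
    if not 0 <= mrs <= 6:
        raise ValueError("mRS must be between 0 and 6")
    if mrs <= 2:
        return "independent"
    if mrs <= 5:
        return "dependent"
    return "dead"


def summarize(cm):
    """Compute metrics from a square confusion matrix (rows = true, columns = predicted)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]
    pred_counts = [sum(cm[i][k] for i in range(n)) for k in range(n)]
    cross = sum(t * p for t, p in zip(true_counts, pred_counts))

    accuracy = correct / total

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = cross / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    # Multi-class Matthews correlation coefficient.
    denom = sqrt((total ** 2 - sum(p * p for p in pred_counts))
                 * (total ** 2 - sum(t * t for t in true_counts)))
    mcc = (correct * total - cross) / denom if denom else 0.0

    # Per-class F1 = 2TP / (2TP + FP + FN), then an unweighted (macro) mean.
    f1_scores = []
    for k in range(n):
        d = true_counts[k] + pred_counts[k]
        f1_scores.append(2 * cm[k][k] / d if d else 0.0)
    macro_f1 = sum(f1_scores) / n

    return {"accuracy": accuracy, "kappa": kappa, "mcc": mcc, "macro_f1": macro_f1}


# Invented counts for illustration; order: independent, dependent, dead.
toy_cm = [
    [50, 8, 2],
    [10, 30, 5],
    [3, 4, 18],
]
toy_metrics = summarize(toy_cm)
print(toy_metrics)
```

Kappa and MCC both correct for agreement expected by chance, which matters here because the three outcome classes are unlikely to be equally frequent.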
Results: In total, 85.8% of patients had ischemic stroke and 53.3% were male. The mean [SD] age at admission was 75.8 [12.0] years, and the median [Q1-Q3] NIHSS score was 3 [1-8]. The ANN and XGBoost classifiers performed significantly better than the traditional LR in classifying the correct mRS levels, with accuracies of 0.698 (95% CI 0.693-0.704) and 0.694 (95% CI 0.688-0.699), respectively, compared with 0.681 (95% CI 0.675-0.686) for the LR model. The results also showed that death after stroke was most strongly associated with NIHSS score, higher age, hemorrhagic stroke, pre-stroke mRS, and being an inpatient at the time of stroke, whereas independence in functional outcome was related to male sex, stroke alerts, and lipid-lowering drugs.
Conclusions and Relevance: The study demonstrated that both the ANN and XGBoost classifiers perform significantly better than the traditional LR in predicting functional outcome and death. On average, for every 10,000 stroke patients, an additional 170 patients would be correctly classified into their mRS categories using machine learning algorithms instead of LR. This could be clinically important in acute stroke care and rehabilitation planning. Existing methods (e.g., SHAP) can be used to interpret these advanced algorithms. The models showed promising results; however, they need to be externally validated to establish generalizability.
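As a rough check on the "per 10,000 patients" figure, the accuracy differences reported in the Results can be converted into counts of additionally correctly classified patients. The helper below is hypothetical and uses only the rounded accuracies from the abstract; the abstract does not state exactly how the 170 figure was derived, though it coincides with the ANN versus LR difference computed this way.

```python
def extra_correct_per_cohort(acc_model, acc_reference, cohort=10_000):
    """Additional correctly classified patients per cohort, from two accuracies."""
    return round((acc_model - acc_reference) * cohort)


# Rounded point estimates as reported in the Results.
ACC_LR, ACC_ANN, ACC_XGB = 0.681, 0.698, 0.694

ann_gain = extra_correct_per_cohort(ACC_ANN, ACC_LR)
xgb_gain = extra_correct_per_cohort(ACC_XGB, ACC_LR)
print(ann_gain, xgb_gain)
```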