TY - JOUR
T1 - A machine learning and explainable artificial intelligence triage-prediction system for COVID-19
AU - Khanna, Varada Vivek
AU - Chadaga, Krishnaraj
AU - Sampathila, Niranjana
AU - Prabhu, Srikanth
AU - Rajagopala Chadaga, P.
N1 - Funding Information:
The authors did not receive funding from any sources.
Publisher Copyright:
© 2023 The Author(s)
PY - 2023/6
Y1 - 2023/6
N2 - COVID-19, a respiratory disease caused by the SARS-CoV-2 virus, has severely disrupted healthcare infrastructure. Various countries have developed COVID-19 vaccines that have, to a certain extent, effectively prevented the severe symptoms caused by the virus. However, a small section of people continues to perish. Advances in artificial intelligence have revolutionized healthcare diagnosis and prognosis. In this study, we predict the severity of COVID-19 using heterogeneous Machine Learning and Deep Learning algorithms by considering clinical markers, vital signs, and other critical factors. This study extensively reviews various classifier architectures for predicting COVID-19 severity. We built and evaluated multiple pipelines combining five state-of-the-art data-balancing techniques (Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic (ADASYN), Borderline SMOTE, SMOTE with Tomek links, and SMOTE with Edited Nearest Neighbor (ENN)) with twelve heterogeneous classifiers: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, Naïve Bayes, XGBoost, Extra Trees, AdaBoost, LightGBM, CatBoost, and a 1-D Convolutional Neural Network. The best-performing pipeline, a Random Forest trained on Borderline SMOTE-balanced data, produced the highest recall of 83%. We deployed Explainable Artificial Intelligence tools such as Shapley Additive Explanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), ELI5, QLattice, Anchor, and Feature Importance to demystify complex tree-based ensemble models. These tools provide valuable insights into the significance of critical features in predicting the severity of a COVID-19 patient. Changes in respiratory rate, blood pressure, lactate, and calcium values were observed to be the primary contributors to increased severity in a COVID-19 patient. This architecture aims to reduce fatalities by serving as an explainable decision-support triage system for medical professionals in countries lacking advanced medical technology and infrastructure.
AB - COVID-19, a respiratory disease caused by the SARS-CoV-2 virus, has severely disrupted healthcare infrastructure. Various countries have developed COVID-19 vaccines that have, to a certain extent, effectively prevented the severe symptoms caused by the virus. However, a small section of people continues to perish. Advances in artificial intelligence have revolutionized healthcare diagnosis and prognosis. In this study, we predict the severity of COVID-19 using heterogeneous Machine Learning and Deep Learning algorithms by considering clinical markers, vital signs, and other critical factors. This study extensively reviews various classifier architectures for predicting COVID-19 severity. We built and evaluated multiple pipelines combining five state-of-the-art data-balancing techniques (Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic (ADASYN), Borderline SMOTE, SMOTE with Tomek links, and SMOTE with Edited Nearest Neighbor (ENN)) with twelve heterogeneous classifiers: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, Naïve Bayes, XGBoost, Extra Trees, AdaBoost, LightGBM, CatBoost, and a 1-D Convolutional Neural Network. The best-performing pipeline, a Random Forest trained on Borderline SMOTE-balanced data, produced the highest recall of 83%. We deployed Explainable Artificial Intelligence tools such as Shapley Additive Explanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), ELI5, QLattice, Anchor, and Feature Importance to demystify complex tree-based ensemble models. These tools provide valuable insights into the significance of critical features in predicting the severity of a COVID-19 patient. Changes in respiratory rate, blood pressure, lactate, and calcium values were observed to be the primary contributors to increased severity in a COVID-19 patient. This architecture aims to reduce fatalities by serving as an explainable decision-support triage system for medical professionals in countries lacking advanced medical technology and infrastructure.
UR - http://www.scopus.com/inward/record.url?scp=85159411122&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85159411122&partnerID=8YFLogxK
U2 - 10.1016/j.dajour.2023.100246
DO - 10.1016/j.dajour.2023.100246
M3 - Article
AN - SCOPUS:85159411122
SN - 2772-6622
VL - 7
JO - Decision Analytics Journal
JF - Decision Analytics Journal
M1 - 100246
ER -