Coronavirus disease (COVID-19) is a dangerous disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It spread rapidly around the world and was declared a global pandemic on 11 March 2020. Vaccines have been developed to prevent the spread of the disease, and research into a cure is ongoing. Machine learning (ML) has proven useful in combating COVID-19, and various applications have been deployed to understand real-world events through careful analysis of data. In this study, we retrospectively analyze epidemiological parameters to predict mortality among SARS-CoV-2 patients. The goal is to identify the predictive parameters that indicate which patients are at the highest risk of death. Supervised ensemble machine learning models, namely random forest, CatBoost, AdaBoost, gradient boosting, extreme gradient boosting (XGBoost), and LightGBM (Light Gradient Boosting Machine), were developed on a COVID-19 epidemiology dataset obtained from Mexico. Prior to building the models, Pearson’s correlation and mutual information analysis were applied to the dependent and independent features to establish the strength of association between features in the dataset. Extreme gradient boosting achieved the best results, with an accuracy of 96%.
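The feature-screening step described above (Pearson's correlation and mutual information between each feature and the outcome) can be illustrated with a minimal sketch in plain Python. The column names `pneumonia` and `sex`, the outcome `died`, and all values are invented for illustration and are not taken from the Mexican dataset. The helper functions are assumptions about how such a screen might be written, not the study's actual pipeline.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete variables."""
    n = len(x)
    joint = Counter(zip(x, y))
    count_x, count_y = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi


# Hypothetical toy records: 1 = yes, 0 = no (invented values).
died = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "pneumonia": [1, 1, 1, 0, 0, 0, 0, 1],
    "sex": [1, 0, 1, 0, 1, 0, 1, 0],
}

# Score every feature against the outcome with both measures.
scores = {
    name: (pearson(values, died), mutual_information(values, died))
    for name, values in features.items()
}

# Rank features by mutual information, strongest association first.
ranking = sorted(scores, key=lambda name: scores[name][1], reverse=True)

for name in ranking:
    r, mi = scores[name]
    print(f"{name}: pearson={r:.3f}, mutual_information={mi:.3f}")
```

Pearson's coefficient captures only linear association, whereas mutual information also captures non-linear dependence, which is one reason to report both before fitting the ensemble models.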
All Science Journal Classification (ASJC) codes
- Physical and Theoretical Chemistry
- Chemistry (miscellaneous)
- Materials Science(all)
- Energy Engineering and Power Technology
- Artificial Intelligence
- Applied Mathematics