(C) PLOS One. This story was originally published by PLOS One and is unaltered.

Severity and mortality prediction models to triage Indian COVID-19 patients

Authors: Samarth Bhatia (Department of Chemical Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi), Yukti Makhija (Department of Biochemical Engineering and Biotechnology), Sneha Jayaswal, et al.

Date: 2022-08

Abstract

As the second wave in India mitigates, COVID-19 has now infected about 29 million patients countrywide and led to more than 350 thousand deaths. As infections surged, the strain on the country's medical infrastructure became apparent. While the country vaccinates its population, opening up the economy may lead to an increase in infection rates. In this scenario, it is essential to utilize the limited hospital resources effectively through an informed patient-triaging system based on clinical parameters. Here, we present two interpretable machine learning models that predict the clinical outcomes, severity and mortality, of patients based on routine non-invasive surveillance of blood parameters on the day of admission, drawn from one of the largest cohorts of Indian patients. The severity and mortality prediction models achieved 86.3% and 88.06% accuracy, respectively, with AUC-ROCs of 0.91 and 0.92. We have integrated both models into a user-friendly web-app calculator, https://triage-COVID-19.herokuapp.com/, to showcase the potential deployment of such efforts at scale.

Author summary

As the medical system in India struggles to cope with more than 1.5 million active cases, with the total number of patients crossing 30 million, it is essential to develop patient-triage models for the effective utilization of medical resources. Here, we built cross-validated machine learning models using data from one of the largest cohorts of COVID-19 patients from India to categorize patients based on the severity of infection and eventual mortality.
Using routine clinical parameters measured from patient blood, we were able to predict the progression of disease in an individual at the time of admission with about 90% accuracy. Our model is available as a web application at https://triage-covid-19.herokuapp.com/ and is easily accessible and deployable.

Citation: Bhatia S, Makhija Y, Jayaswal S, Singh S, Malik PS, Venigalla SK, et al. (2022) Severity and mortality prediction models to triage Indian COVID-19 patients. PLOS Digit Health 1(3): e0000020. https://doi.org/10.1371/journal.pdig.0000020

Editor: Martin G. Frasch, University of Washington, UNITED STATES

Received: June 24, 2021; Accepted: January 31, 2022; Published: March 9, 2022

Copyright: © 2022 Bhatia et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: We have uploaded the de-identified data to a GitHub repository: https://github.com/yuktimakhija/COVID-19-Patient-Triaging.

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Translational science is a rapidly growing field with immediate potential for direct clinical applications. This includes the development of computational models that analyze Electronic Health Records (EHRs) and aid us in interpreting the complex biological associations between clinical measurements and patient outcomes. Machine learning is an extremely powerful tool deployed by translational scientists to recognize patterns and identify features in medical data that correlate with clinical outcomes. The predictions obtained from such machine learning models may assist in clinical decision-making.
This can enable us to automate certain stages of diagnosis, especially during a scarcity of medical resources such as trained medical professionals or intensive care units (ICUs). Globally, several models have been proposed by researchers to tackle the problem of triaging COVID-19 patients in order to budget for, allocate, and effectively manage appropriate medical resources, such as those by D. Ellinghaus et al. [1] and Yuan Hou et al. [2]. These studies found that susceptibility and mortality rates vary widely across countries. However, only a handful of these models have been trained on datasets drawn from populations in the developing world, such as India. To the best of our knowledge, only two such models have been trained and tested on Indian populations: first, a Random Forest classifier predicting ICU admissions using features such as age, symptoms at diagnosis, number of comorbidities, chest X-ray, SpO2 concentration, ANC/ALC ratio, CRP, and serum ferritin concentrations [3]; second, a set of models forecasting the number of COVID-19 cases in India using simple SEIR mathematical models as well as LSTMs [4], [5].

3. Discussion and conclusion

We successfully built two machine learning models to triage COVID-19 patients in India into prediction categories, alive versus deceased and severe versus non-severe. This was done using XGBoost (12) models, which apply gradient boosting to decision trees (the gbtree booster of XGBoost was used internally). These models can be summarized through sample decision trees (Fig 5) to inform a clinical decision support system. Note that XGBoost is an ensemble model that creates multiple weak classifiers which, taken together, form a strong one. To make the results of our model available for testing, we made an online calculator web app that can be used to determine mortality and severity using our models at: https://triage-COVID-19.herokuapp.com/.
Fig 5. Sample decision trees of the mortality (left) and severity (right) prediction models to test clinical applicability. https://doi.org/10.1371/journal.pdig.0000020.g005

In the mortality model, we see that most of the evaluation parameters, such as test accuracy, F-score, recall, precision, and AUC-ROC, are lower than those of the severity model. This may be because, for the severity model, we used certain biomarkers that have been shown to be affected by COVID-19: high-sensitivity C-Reactive Protein (hs-CRP), ferritin, D-dimer, and Lactate Dehydrogenase (LDH). However, we could not use LDH because of data limitations, as described in the section on building the severity model. Even so, we found in the above section that ferritin does not contribute as much to the severity model's output as hs-CRP and D-dimer do. This could imply a more direct correlation of COVID-19 with hs-CRP and D-dimer than with ferritin. Additionally, in the ROC curves, the severity model can achieve around 60% sensitivity (TPR) at a very low FPR, whereas the mortality model can only achieve up to about 50% sensitivity at the same low FPR. In the precision-recall curves, we see that the mortality model can achieve a recall of 0.2 without predicting any false positives, whereas the severity model can achieve a recall of 0.4.

4. Materials and methods

4.1 Confidentiality statement and ethics

The study protocol was reviewed and approved by the Institute ethics committees at both participating centres (Institute ethics committee File No. IEC/320/325 to SS). A consent waiver was granted given the retrospective nature of the analysis and the emergent nature of the pandemic. Confidentiality was maintained by de-identification of the data, and all analysis was performed on de-identified data.
4.2 Data collection

The data for this study were collected from one of the largest dedicated COVID-19 facilities in Northern India, comprising the Sardar Vallabhbhai Patel COVID Hospital and the PM CARES COVID Care Hospitals. In these hospitals, a series of blood tests was performed for all patients with confirmed infection at the time of admission. Patients were diagnosed using a Rapid Antigen Test or RT-PCR testing of a nasal/throat swab sample. Data for 815 patients were collected between 13th July and 31st December 2020. The clinicians followed the guidelines provided by the MOHFW, Govt. of India [9], to classify 390, 160, 84, and 181 patients into the mild, moderate, severe, and deceased categories, respectively, based on symptoms on arrival. These labels were used to identify the primary biomarkers that control the risk of developing severe infection and the chances of survival.

4.3 Data preprocessing

Data preprocessing involved handling missing values by deleting patients, or deleting features entirely when a feature was available for significantly fewer patients. We avoided simple imputation, i.e., replacing missing values with the mean, because the dataset was not large enough to guess values accurately. Imputing the missing values would have lowered the variance, and decreasing our data's variance directly implies adding bias to our model. In addition, we converted all categorical variables into one-hot encoded form and created binary variables for the signs/symptoms and comorbidities of each patient. The comorbidities were classified into broad categories: cardiac disease, chronic liver disease, hypertension, diabetes, chronic kidney disease, lung disease, morbid obesity, and hypothyroidism. The dataset obtained after preprocessing contained 600 patients with 33 clinical parameters, of whom 170 were deceased, 250 had experienced mild symptoms, 99 moderate, and 81 severe.
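A minimal sketch of this preprocessing pipeline, assuming illustrative column names (`age`, `ferritin`, `ldh`, `sex`) rather than the actual dataset schema:

```python
import pandas as pd

def preprocess(df: pd.DataFrame, min_fraction: float = 0.5) -> pd.DataFrame:
    """Drop sparse features, then incomplete patients, then one-hot encode."""
    # Keep only features recorded for at least `min_fraction` of patients.
    df = df.loc[:, df.notna().mean() >= min_fraction]
    # Drop patients with any remaining missing values (no imputation).
    df = df.dropna()
    # One-hot encode categorical variables (here just "sex" for illustration).
    return pd.get_dummies(df, columns=[c for c in ["sex"] if c in df.columns])

# Toy example: "ldh" is mostly missing, so the whole column is dropped; the
# one patient missing ferritin is dropped; "sex" becomes two binary columns.
toy = pd.DataFrame({
    "age": [60, 45, 70, 30],
    "ferritin": [500.0, None, 300.0, 200.0],
    "ldh": [None, None, None, 400.0],
    "sex": ["M", "F", "M", "F"],
})
clean = preprocess(toy)
```

The `min_fraction` threshold is a stand-in for the paper's informal "available for significantly fewer patients" criterion; the actual cut-off used by the authors is not stated.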
4.4 Clustering

We performed clustering of patients based on their collected clinical parameters using the KPrototypes algorithm to check for the presence of any clinical bias. The results are summarized in the Supplementary Material (Fig A in S1 Text).

4.5 Modelling strategy

4.5.1 Mortality prediction model.

All the features used for training the mortality prediction model, including their gross statistics, can be found in the Supplementary Material (Table A in S1 Text). The number of deceased patients was smaller than the number who had recovered (170:430), forming a skewed class distribution. Standard machine learning algorithms perform poorly on such datasets because they tend to be biased towards the majority class, which decreases the prediction accuracy for the minority class [10]. For this reason, we decided to use ensemble models, which are multiple-classifier methods: learning machines that combine the decisions of multiple base classifiers to reach a final prediction. A widely used class of ensemble models is boosting algorithms, in which the final classifier is built through the sequential addition of multiple weak classifiers, with the overall performance of the model increasing as each classifier is added. We did an extensive survey of the different CART (Classification and Regression Trees) models and decided to implement AdaBoost (16), XGBoost (12), and CatBoost (15) because of their proven performance. For boosting algorithms, a 1:1 ratio of the classes in the data is considered ideal, so we used random undersampling of the majority class, i.e., recovered patients, to reduce the data imbalance. The new dataset contained the information of 170 deceased patients and 205 recovered patients. We ensured that mild, moderate, and severe patients were equally represented (Mild-55, Moderate-55, Severe-55) in the dataset to minimize bias in the model.
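The balancing step above can be sketched as follows; this is an illustrative re-implementation with hypothetical `outcome`/`severity` column names, not the authors' code, and it keeps every minority-class (deceased) patient while sampling a fixed number of survivors per severity category:

```python
import pandas as pd

def undersample(df: pd.DataFrame, n_per_class: int, seed: int = 0) -> pd.DataFrame:
    # Keep every deceased patient (the minority class).
    deceased = df[df["outcome"] == "deceased"]
    # Randomly sample `n_per_class` survivors from each severity category
    # so the surviving classes are equally represented.
    survivors = (
        df[df["outcome"] != "deceased"]
        .groupby("severity", group_keys=False)
        .apply(lambda g: g.sample(n=min(n_per_class, len(g)), random_state=seed))
    )
    return pd.concat([deceased, survivors])

# Toy cohort: 10 deceased patients plus 20 survivors in each severity class.
cohort = pd.DataFrame({
    "outcome": ["deceased"] * 10 + ["recovered"] * 60,
    "severity": ["severe"] * 10 + ["mild"] * 20 + ["moderate"] * 20 + ["severe"] * 20,
})
balanced = undersample(cohort, n_per_class=5)
```

With real data, `n_per_class` would be chosen so the resulting class ratio approaches the 1:1 target described above.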
Following this, we split the dataset into training and validation sets containing 302 and 73 patients, respectively. The split was performed using stratified sampling to maintain the ratio of mild, moderate, severe, and deceased patients in both sets. Once the training dataset had been finalized, we began training the model. We implemented repeated stratified 5-fold cross-validation using the GridSearchCV function from the sklearn (14) Python package. While iterating over a wide range of hyperparameters during cross-validation, we optimized the model for the best accuracy and re-fitted it for the best F-score. The hyperparameter values of the model obtained after cross-validation are listed in Table 3. Using colsample_bytree enables us to induce randomness in the training examples and make our model robust to noise, and varying max_depth, min_child_weight, gamma, and alpha enables us to reduce and remove any overfitting.

Table 3. List of hyperparameters used in the models. https://doi.org/10.1371/journal.pdig.0000020.t003

4.5.2 Severity prediction model.

All the features used for training the severity prediction model, including their gross statistics, can be found in the Supplementary Material (Table B in S1 Text). The objective of this model was to predict the risk of developing severe infection. We grouped mild and moderate patients as non-severe, while severe and deceased patients were grouped as severe. Many researchers have found that COVID-19 causes a change in four critical biomarkers: D-dimer, high-sensitivity C-Reactive Protein (hs-CRP), Lactate Dehydrogenase (LDH), and ferritin. However, these were not measured for all patients; we had 396, 390, 305, and 398 patients with values for those parameters, respectively.
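The tuning loop described for the mortality model (and reused below for the severity model) can be sketched as follows. To keep the example self-contained, sklearn's GradientBoostingClassifier stands in for XGBoost, the data are synthetic, and the parameter grid is illustrative rather than the grid reported in Table 3:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold

# Synthetic stand-in for the 302-patient training set.
X, y = make_classification(n_samples=200, weights=[0.55, 0.45], random_state=0)

# Repeated stratified 5-fold cross-validation, as in the paper.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [2, 3], "learning_rate": [0.1, 0.3]},
    scoring={"accuracy": "accuracy", "f1": "f1"},
    refit="f1",  # track accuracy too, but re-fit the final model on the best F-score
    cv=cv,
)
search.fit(X, y)
best = search.best_params_
```

Substituting `xgboost.XGBClassifier` for the estimator and the Table 3 ranges for `param_grid` would reproduce the paper's setup more faithfully.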
We calculated the number of unique patients with different combinations of three of these parameters and found the maximum to be 331 patients with values for D-dimer, hs-CRP, and ferritin, the optimal combination given that model performance depends on both the quantity of data and the number of parameters. Following this, we split these 331 patients into training and validation sets containing 264 and 67 patients, respectively. The choice of model was the same as in the mortality prediction model above. After hyperparameter tuning and cross-validation, as described for the mortality model, the hyperparameters of the best model are listed in Table 3.

4.5.3 Evaluation methodology.

We evaluated the models on several metrics, namely Accuracy, Recall, Precision, and F-score, along with the ROC and Precision-Recall curves. These are defined in terms of the number of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN) for a class as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall (TPR) = TP / (TP + FN)
Precision = TP / (TP + FP)
F-score = 2 × Precision × Recall / (Precision + Recall)

The ROC (Receiver Operating Characteristic) curve is made by plotting the TPR (Recall) against the FPR (1 − specificity, where specificity is the TNR). It measures how well our model can differentiate between the two classes, and the AUC-ROC is the area under the ROC curve. The methods accuracy_score(), recall_score(), precision_score(), and f1_score() of the sklearn.metrics module were used to evaluate the respective metrics, and plot_roc_curve() and precision_recall_curve() were used to plot the ROC and P-R curves and obtain the AUC value. For the mortality model, the positive class was chosen to be deceased (dead) patients and the negative class to be alive (mild + moderate + severe) patients. Following the same pattern, in the severity model, the positive class was chosen to be severe (severe + dead) patients, whereas the negative class was non-severe (mild + moderate) patients.
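These metrics can be computed with the sklearn.metrics functions named above; the labels here are made up purely for demonstration (1 marks the positive class, i.e., deceased in the mortality model or severe in the severity model):

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true  = [1, 1, 1, 0, 0, 0, 0, 0]   # ground-truth labels
y_pred  = [1, 1, 0, 0, 0, 0, 1, 0]   # hard predictions -> TP=2, FN=1, FP=1, TN=4
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.6, 0.35]  # predicted probabilities

acc  = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 6/8
rec  = recall_score(y_true, y_pred)      # TP / (TP + FN)    = 2/3
prec = precision_score(y_true, y_pred)   # TP / (TP + FP)    = 2/3
f1   = f1_score(y_true, y_pred)          # harmonic mean of precision and recall
auc  = roc_auc_score(y_true, y_score)    # area under the ROC curve (uses scores, not hard labels)
```

Note that the curve-based metrics (ROC, P-R, AUC) operate on the continuous scores, whereas accuracy, recall, precision, and F-score operate on thresholded predictions.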
The SHAP value plots are not part of the model itself; rather, SHAP uses an external explainer model to vary the value of every feature and observe how much the output is affected. The SHAP value plots tell us how these variations affect the model's output, and the mean |SHAP| value plot indicates which features are the most important in our model and have the most impact. Note that we can use the precision-recall curve here since we are more concerned with the positive class and with having fewer False Negatives than False Positives (a patient whose mortality/severity goes undetected by being classified as alive/non-severe is more harmful to society than a patient who was not severe being classified as severe/dead) [11].

Supporting information

S1 Text. Fig A: The points are colored according to the clustering labels (learned from data), and the point shape represents the severity status of the patients. Table A: Comparison of features in deceased and alive patients in the mortality prediction model. Table B: Comparison of features in severe and non-severe patients in the severity prediction model. Fig B: Distribution of features with high predictive power. Table C: Evaluation of the reduced mortality model. Table D: Evaluation of the reduced severity model. https://doi.org/10.1371/journal.pdig.0000020.s001 (DOCX)

Acknowledgments

We would like to acknowledge the nursing staff and medical professionals who have tirelessly worked to alleviate the suffering caused by this pandemic and facilitated the collection of good-quality data for this publication.

Url: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000020