(C) PLOS One. This story was originally published by PLOS One and is unaltered.

Artificial intelligence applications used in the clinical response to COVID-19: A scoping review [1]

Authors: Sean Mann (RAND Corporation, Santa Monica, California, United States of America); Carl T. Berdahl (Departments of Medicine and Emergency Medicine, Cedars-Sinai Medical Center, Los Angeles)

Date: 2022-11

Research into using artificial intelligence (AI) in health care is growing, and several observers predicted that AI would play a key role in the clinical response to COVID-19. Many AI models have been proposed, though previous reviews have identified only a few applications used in clinical practice. In this study, we aim to (1) identify and characterize AI applications used in the clinical response to COVID-19; (2) examine the timing, location, and extent of their use; (3) examine how they relate to pre-pandemic applications and the U.S. regulatory approval process; and (4) characterize the evidence that is available to support their use. We searched academic and grey literature sources to identify 66 AI applications that performed a wide range of diagnostic, prognostic, and triage functions in the clinical response to COVID-19. Many were deployed early in the pandemic, and most were used in the U.S., other high-income countries, or China. While some applications were used to care for hundreds of thousands of patients, others were used to an unknown or limited extent. We found studies supporting the use of 39 applications, though few of these were independent evaluations, and we found no clinical trials evaluating any application's impact on patient health. Due to limited evidence, it is impossible to determine the extent to which the clinical use of AI in the pandemic response has benefited patients overall. Further research is needed, particularly independent evaluations of AI application performance and health impacts in real-world care settings.
In this study we describe the use of artificial intelligence (AI) in the clinical response to COVID-19. AI has variously been predicted to play a key role during the pandemic or reported to have had little or no impact on patient care. Our findings support a balanced view. We identified 66 applications—specific AI products or tools—used in a variety of ways to diagnose, guide treatment, or prioritize patients during the pandemic response. Many were deployed early in 2020, and most were used in the U.S., other high-income countries, or China. Some were used to care for hundreds of thousands of patients, though most were adopted at smaller scales. We found evaluation studies that supported the use of 39 of these applications, though few of these evaluations were written by independent authors not affiliated with application developers. We found no clinical trials that evaluated the effect of using an AI application on patient health outcomes. Future research is needed to better understand the impact of using AI in clinical care.

Funding: This work was funded by the Patient-Centered Outcomes Research Institute (PCORI, https://www.pcori.org/) under Contract No. IDIQ-TO#22-RAND-ENG-AOSEPP-04-01-2020. All statements, findings, and conclusions in this publication are solely those of the authors and do not necessarily represent the views of PCORI. The funders advised on study design. The funders played no role in data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2022 Mann et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Research into artificial intelligence (AI) for health care has grown rapidly, accompanied by a substantial increase in the number of AI applications receiving U.S. Food and Drug Administration (FDA) clearance since 2016 [1, 2]. However, adoption has been limited, even in the field of radiology [3], which has been a particular focus of AI development [4]. The COVID-19 pandemic has been a proving ground for other health technologies, with an over 30-fold increase in U.S. telehealth usage in the first year of the pandemic [5] and the global deployment of novel mRNA vaccines [6]. Despite predictions that AI could play a key role in the clinical response to COVID-19 [7-9], there is little information on the range and extent of AI deployment during the pandemic.

Previous studies on the role of AI in the clinical response to COVID-19 have reviewed possible use cases [10, 11], challenges to deployment [12-14], potential impacts on health equity [15], and recommendations to improve AI model quality [16, 17]. Two separate reviews of academic literature published early in the pandemic identified hundreds of new AI models proposed to address COVID-19, though they did not attempt to identify which models, if any, had been deployed in patient care. These reviews found that all proposed models were flawed due to methodology, potential bias, or poor reporting, and the authors recommended that none be used in clinical practice [16, 18]. Nevertheless, some AI applications have been deployed in clinical care for COVID-19 [19-22]. Despite this, no published review has identified more than a handful of applications that have progressed beyond development and testing to be used in clinical practice.

This study seeks to comprehensively identify and characterize AI applications—that is, specific AI-based products or tools—used in the clinical response to COVID-19. To do this, we analyzed a wide range of sources to answer four key questions that have not been addressed by other reviews.

We extracted information on evaluation studies using a 14-item data collection tool. A primary reviewer extracted information on each study, and a secondary reviewer then reviewed the abstraction. Disagreements were rare and were resolved through discussion between the two reviewers. We provide the information extracted on each study, together with the verbatim text excerpts that served as the basis for extraction, in supporting information file S2 Table.

We determined an application's extent of use based on the number of patients an application was reportedly used to care for in the clinical response to COVID-19. We also counted other measures of reported usage if these were likely to be within the same order of magnitude as the number of patients cared for. The number of COVID-19 cases or number of CT scans analyzed, for example, were counted, though we did not count the number of data points or CT scan slices analyzed when those were the only measure reported. We categorized the location of application use according to the Organisation for Economic Co-operation and Development's high-income country (HIC) and low- or middle-income country (LMIC) categories [26].

While prior studies have categorized AI applications developed to address COVID-19 in a variety of ways [11, 16, 18, 21], no consensus framework exists. We developed our framework in an inductive and iterative process by identifying categories of applications that shared a common function. We proceeded in order of category size: we identified a large group of similar applications, assigned them a category label, and then considered the next largest group of applications. Applications that could belong to multiple groups were assigned to the largest category.
We identified five functional categories of applications in this way and assigned the remaining applications to a catch-all 'Other' category.

We extracted information on AI applications using a 65-item data collection form. All documents found on an application were reviewed during this data extraction process. For each application we also examined company websites and conducted targeted searches of Google Scholar, examining the first 30 results for each search, to obtain publicly available evaluation studies. We searched both peer-reviewed articles and a wide variety of grey literature documents, including academic pre-prints, conference abstracts, company web pages, and regulatory documents, to identify evaluation studies on applications. We only counted a document as an evaluation study if it reported information on evaluation results, study population, and outcome measurement methods.

To be included in our review, an application had to meet three inclusion criteria. Following identification of an AI application that was potentially used in the clinical response to COVID-19, we conducted targeted Google searches using the application name and/or developer to obtain additional information and confirm its use. A member of the study team read the full text of each document that passed initial screening to identify AI applications used in the clinical response to COVID-19. Additional documents were obtained by following chains of relevant citations and hyperlinks from the initial set of documents, as well as from the first 5 pages of results from two Google searches: 'FDA-approved artificial intelligence COVID' and 'artificial intelligence clinical adoption COVID'.

To ensure consistency in screening decisions, we used dual-review methods and assessed inter-reviewer reliability.
This began with three members of the project team each independently examining a random sample of approximately 10% of search results, followed by discussion of discrepancies and refinement of screening criteria. We then used these finalized criteria to conduct single-screening of the remaining search results, with random dual-review checks of 25% of the remaining documents to ensure continued consistency in the screening process. Disagreements in the two reviewers' screening decisions were resolved by a third project team member, with discussion of edge cases conducted on an as-needed basis. We screened academic articles based on title and abstract, and clinical trial records based on summary, description, and intervention fields. English-language documents that discussed AI applications used in the response to COVID-19 were screened as eligible for full-text review. We also screened in documents on AI and equity as part of the broader research effort.

We used structured search terms, developed under the guidance of a medical reference librarian, to identify academic review articles in PubMed, Web of Science, and the IEEE Xplore Digital Library. We then adapted the search terms to facilitate searches for grey literature documents, including news articles, clinical trials, and government documents, in ProQuest U.S. Newsstream, Academic Search Complete, ClinicalTrials.gov, and the U.S. Food and Drug Administration (FDA) document library. We searched for documents that contained both an AI-related and a COVID-19-related term. Additional searches were conducted to obtain documents relating to AI and health equity; while these searches were designed for use in a broader research effort, all results were screened for relevance to this study as well. Details on all searches are provided in supporting information file S1 Appendix.

We searched several databases in December 2021 to identify documents describing the use of AI in the COVID-19 response.
The search was limited to material that became available on or after January 1, 2020, the day after China first reported the possibility of a new virus to the World Health Organization [25].

Stakeholder consultation interviews were determined to be exempt by the RAND Corporation Human Subjects Protection Committee (HSPC ID# 2021-N0625), which serves as RAND's IRB. We obtained informed consent from stakeholder participants orally at the outset of all interviews. The interview protocol, which was provided to stakeholders and also covered topics related to AI and health equity as part of a broader research study, is available in S1 Appendix.

At the outset of the study, we engaged with a diverse set of health care stakeholders: a patient advocate, two clinicians, one health system representative, one insurer representative, one public policymaker, one public health official, one industry representative, and one researcher. We conducted separate semi-structured interviews with each of these stakeholders to elicit their recommendations on key questions, study design, and documents to review. We also asked stakeholders for information on any AI applications used in the COVID-19 response. Stakeholder inputs provided a preliminary set of documents and applications to consider for inclusion in our review.

For the purposes of this study, we considered an application to be clinical in nature if it was used in efforts to improve patient health at the individual level as part of patient evaluation, clinical decision-making, or treatment delivery. The topic of this article—the use of AI in the clinical response to COVID-19—was established by the sponsor prior to the start of the study as part of a broader project examining AI, COVID-19, and health equity. Stakeholder consultation, initial document searches, and document screening were undertaken as part of this broader project.
Our approach consisted of four steps: (1) consulting stakeholders; (2) searching and screening documents; (3) reviewing documents to identify AI applications; and (4) extracting information on AI applications and evaluation studies. We performed a scoping review of academic and grey literature to identify a comprehensive set of literature that would answer the four key questions listed above. According to published guidance, a scoping review approach is well suited to exploratory analysis of broad topics concerning the "extent, range, and nature of research activity" in a field [23]. We followed the reporting guidelines contained in the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) [24].

Results

Our searches of academic and grey literature databases yielded a total of 1,880 unique documents. After applying inclusion criteria at the level of the title and abstract, we retained 634 documents of potential interest for our review. The results of the document search and screening process are shown in Fig 1. We identified 66 AI applications used in the COVID-19 response and grouped them into 6 functional categories. Information on individual applications is provided in the supporting information file S1 Table. A summary is provided in Table 1.

Data inputs, context of use, and proposed benefits by application category

Applications within the same functional category tended to share similar data inputs, predicted variables, users, settings, and proposed benefits, as discussed below.

Lung evaluation

Lung evaluation applications assessed X-ray images (n = 7), CT images (n = 12), or both (n = 1) to evaluate the lungs of patients with suspected or confirmed COVID-19. These applications assessed one or more of the following: presence of pneumonia (n = 10), pneumothorax (n = 2), other lung abnormalities associated with COVID-19 (n = 12), or an overall COVID-19 risk score (n = 4).
All lung evaluation applications were used by clinicians in hospitals (n = 20), with some X-ray applications also used in mobile screening vans (n = 2), quarantine centers (n = 2), or border facilities and prisons (n = 1). The proposed benefits of these applications included informing patient treatment (n = 18), informing diagnosis of COVID-19 (n = 16), speeding up patient evaluation (n = 14), assisting triage (n = 13), conserving staffing resources (n = 8), reducing disease transmission (n = 7), or replacing unavailable RT-PCR-based screening (n = 5).

Symptom checkers

Symptom checkers analyzed patient-reported demographics, risk factors, and/or symptoms to provide a personalized COVID-19 risk assessment or care recommendations. Six symptom checkers interpreted user-entered free-text inputs as part of this assessment. Symptom checkers were accessed via online website (n = 10), smartphone app (n = 4), mobile texting (n = 1), or voice assistant (n = 1). Most (n = 8) provided recommendations for when patients should seek further care, including 2 which offered to directly connect high-risk patients to a clinician for telehealth consultation. One symptom checker assessed the likelihood of post-acute sequelae of SARS-CoV-2 infection (PASC), or "long COVID", in addition to active COVID-19 infection.

Patient deterioration

Patient deterioration applications monitored COVID-19 patients for potential clinical deterioration to inform care escalation decisions. All sought to detect changes in patient vital signs. Three applications used data from wearable monitors. One application also predicted risk of respiratory failure and hemodynamic instability. One application estimated respiratory rate, heart rate, and movement from signals received by an under-mattress sensor. Another application estimated breathing patterns, movement, and sleep stages from an in-room wireless transmitter/receiver.
Four applications used patient demographics and other electronic health record (EHR) data to inform their assessment. Clinicians used these applications in hospitals (n = 6), assisted living facilities (n = 1), or to remotely monitor patients at home (n = 3). The proposed benefits of these applications included informing patient treatment (n = 9), improving patient safety (n = 9), reducing disease transmission (n = 7), and conserving staffing resources (n = 3).

Infection likelihood

We found several different types of applications that assessed likelihood of COVID-19 infection. Two applications predicted likelihood of COVID-19 infection based on volatile organic compounds present in an individual's breath. One was used by professional screening personnel in public settings and the other at border facilities. Their proposed benefits included informing COVID-19 diagnosis, speeding up patient evaluation, and lowering costs. Two applications predicted RT-PCR results for individual test samples to optimize pooled testing. One application based its prediction on geographically aggregated data while the other used patient demographics and EHR data, including free-text clinician notes. Their proposed benefit was to conserve testing resources. Two applications were used in clinical settings to triage patients for testing, one based on geographically aggregated data and the other on patient demographics, vital signs, and blood test results. Their proposed benefits included reducing disease transmission, speeding up patient evaluation, conserving staffing resources, and improving patient safety. One application predicted the likelihood of COVID-19 infection by detecting a luminescent signal on a rapid antigen test strip. This application was designed for use by health care professionals and its proposed benefit was reducing errors in interpreting test results.
One application predicted the likelihood of COVID-19 infection using voice audio from calls to emergency services to assist in triage.

Disease severity

Applications that predicted a patient's risk of severe COVID-19 were used to inform treatment (n = 3) or to prioritize patients for evaluation (n = 2), testing (n = 2), or vaccination (n = 1). All of these applications based their predictions on patient demographics and medical records. One also used patient vital signs and blood test results. Four were used by clinicians in hospitals, outpatient clinics, or telehealth settings. One was used by professional call center personnel contacting patients by phone. Their proposed benefits included assisting triage (n = 3) and conserving testing resources (n = 2).

Other

Other applications were used to perform a variety of functions in the COVID-19 response, including assisting image acquisition (n = 4), detecting immune response (n = 2), predicting response to treatment (n = 2), and performing other functions (n = 4). Two image acquisition applications used cameras to detect anatomical landmarks and adjust patient position for CT scanning. One application evaluated image quality and predicted optimal ultrasound device manipulations to guide non-specialist clinicians conducting echocardiograms. One application generated Spanish-language audio instructions from previously translated text for patients undergoing chest X-rays. Two applications detected signs of recent or prior COVID-19 infection by detecting either antibodies or T-cells in patient blood samples, which could inform the assessment of PASC [27]. One application that predicted patient survival in response to treatment was used to prioritize COVID-19 patients to receive lung transplants. Another application was used to predict patient response to hydroxychloroquine. One application analyzed chest X-rays to evaluate endotracheal breathing tube position.
One application used echocardiography to estimate left ventricular ejection fraction for patients with potentially degraded cardiac function as a result of acute COVID-19 infection or PASC. One application identified barriers to hospital discharge using patient demographics and EHR records. One application provided a geographically aggregated measure of health disparities, which was used to prioritize patients to receive a scarce medication, remdesivir.

AI methods used in applications

Twenty-six applications used neural networks, mostly to interpret images for lung evaluation (n = 18), cardiac evaluation (n = 1), or evaluation of breathing tube position (n = 1), or to assist in image acquisition (n = 2). Neural networks were also used to interpret wireless signals (n = 1) or unstructured text (n = 2), or to generate voice audio from text (n = 1). Some neural network applications used image overlays to aid interpretation (n = 11), and one application generated a text explanation of its analysis using the GPT-3 natural language model [28]. Another six applications used advanced tree-based methods, specifically gradient-boosted trees (n = 5) and random forest models (n = 1), to analyze blood test results (n = 4), vital signs (n = 3), or patient demographics and electronic health records (n = 4). These predicted disease severity (n = 3), infection likelihood (n = 1), immune response (n = 1), or response to treatment (n = 1). Three of these applications used Shapley additive explanations (SHAP) values to improve interpretability. Seven applications used traditional supervised machine learning methods, including logistic regression (n = 4), unspecified regression (n = 2), Cox proportional hazards (n = 1), or linear discriminant analysis (n = 1). These applications based their predictions on patient demographics and EHR records (n = 5), blood test results (n = 2), vital signs (n = 2), or breath profiles (n = 1).
Two applications used unsupervised learning, including factor analysis and principal components analysis. We did not find information on the type of AI used for 25 applications, though these were described as using AI or ML. These included most symptom checkers (11 of 12), patient deterioration applications (6 of 9), and infection likelihood applications (5 of 8). Three symptom checkers and one lung evaluation application were described as continuously learning from interpretation of new data. Three lung evaluation applications used AI-based natural language processing to automate labeling of training or validation datasets.

Deployment date, location, and extent of use

We were able to determine the date (n = 64) and location (n = 62) of the documented deployment in the COVID-19 response for most applications (Fig 2). Most applications were deployed early in the pandemic, from January-March 2020 (n = 27) or April-June 2020 (n = 16). Fewer were first deployed from July-December 2020 (n = 10) or in 2021 (n = 11).

Fig 2. Date and country of first confirmed deployment in the COVID-19 response, by application category. Note: Dates have been jittered up to 12 days within the same month. https://doi.org/10.1371/journal.pdig.0000132.g002

Twenty-five applications' first confirmed deployment was in the U.S., 12 in China, 19 in other high-income countries (HICs), and 6 in other low- and middle-income countries (LMICs). All applications that first deployed in China did so from January-March 2020. Initial deployments in the U.S. and other countries were more spread out over time. Most lung evaluation (13 of 20) and patient deterioration applications (5 of 9) were deployed from January-March 2020. Most symptom checkers were deployed between January and June 2020 (7 of 12). Half of infection likelihood applications were deployed in 2021 (4 of 8).
Most applications in the 'Other' category were deployed between January and June 2020 (9 of 12).

We found information on the scale of use for the majority of applications (n = 45) and the country of use for most applications (n = 62) (Fig 3). Applications deployed in China had the highest usage, with half (6 of 12) used over 10,000 times and a quarter (3 of 12) used over 100,000 times. Lung evaluation applications tended to report higher usage than other application types. Patient deterioration applications tended to report lower usage or did not report usage at all.

Fig 3. Extent and location of use in the COVID-19 response, by application category. Note: Scale of use reflects numbers of patients or related proxy measures. 10 applications were used in multiple country categories and are represented by multiple bubbles. We were unable to find the exact country or scale of use for 3 online symptom checkers; these are represented by 'unknown' bubbles assigned to the country category where they were developed. https://doi.org/10.1371/journal.pdig.0000132.g003

Most applications were used in either the U.S. (n = 31) or China (n = 12), with applications also used in other HICs (n = 27) or LMICs (n = 7). Applications used in China were almost entirely for lung evaluation (9 of 12), as were those used in other LMICs (5 of 7). Applications used in the U.S. included a large number focused on patient deterioration (8 of 31) and very few lung evaluation (2 of 31) or disease severity applications (1 of 31). The great majority of applications in the 'Other' category were used only in the U.S. (10 of 12), including all applications used to predict response to treatment, detect immune response, or assess breathing tube position. Most disease severity applications were used in other HICs (4 of 5).
FDA review and application predecessors

As of December 2021, four applications had received FDA emergency use authorizations (EUAs), 11 applications had been cleared for use under the FDA 510(k) pathway for products that are "substantially equivalent" to already-approved devices [29], and one had been cleared as a novel device under the De Novo pathway. FDA-cleared or -authorized applications included most in the patient deterioration category (6 of 9), many in the 'Other' category (6 of 12), and a small number of infection likelihood (1 of 8) and lung evaluation (2 of 20) applications. The U.S. Centers for Medicare & Medicaid Services designated 1 application eligible for product-specific Medicare reimbursement in 2021 [30].

Seventeen applications were already in use prior to the pandemic and were first deployed in the COVID-19 response without modification. These included most of the patient deterioration applications (7 of 9), a few lung evaluation applications (4 of 20), and several applications in the 'Other' category (6 of 14). Four of these applications were later modified to specifically address COVID-19, including three lung evaluation applications and one application in the 'Other' category. An additional 17 applications, including 5 lung evaluation applications and 10 symptom checkers, were adapted to address COVID-19 from applications used prior to the pandemic. Of the remaining 32 new applications not directly related to a pre-pandemic product, at least three have additionally been used to address health conditions other than COVID-19, all in the U.S.
These three saw their deployment accelerated due to the pandemic: an application that assesses endotracheal breathing tube position, deployed under blanket FDA guidance expanding use of applications without review; an application that assists in image acquisition, deployed under an expedited 510(k) application; and a patient deterioration application rapidly deployed by a health system to address the pandemic.

---
[1] https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000132