(C) PLOS One. This story was originally published by PLOS One and is unaltered. Higher education responses to COVID-19 in the United States: Evidence for the impacts of university policy [1] Brennan Klein et al. Date: 2022-08 Abstract With a dataset of testing and case counts from over 1,400 institutions of higher education (IHEs) in the United States, we analyze the number of infections and deaths from SARS-CoV-2 in the counties surrounding these IHEs during the Fall 2020 semester (August to December, 2020). We find that counties with IHEs that remained primarily online experienced fewer cases and deaths during the Fall 2020 semester, whereas before and after the semester, these two groups had almost identical COVID-19 incidence. Additionally, we see fewer cases and deaths in counties with IHEs that reported conducting any on-campus testing compared to those that reported none. To perform these two comparisons, we used a matching procedure designed to create well-balanced groups of counties that are aligned as much as possible along age, race, income, population, and urban/rural categories—demographic variables that have been shown to be correlated with COVID-19 outcomes. We conclude with a case study of IHEs in Massachusetts—a state with especially high detail in our dataset—which further highlights the importance of IHE-affiliated testing for the broader community. The results in this work suggest that campus testing can itself be thought of as a mitigation policy and that allocating additional resources to IHEs to support efforts to regularly test students and staff would be beneficial to mitigating the spread of COVID-19 in a pre-vaccine environment.
Author summary The ongoing COVID-19 pandemic has upended personal, public, and institutional life and has forced many to make decisions with limited data on how to best protect themselves and their communities. In particular, institutes of higher education (IHEs) have had to make difficult choices regarding campus COVID-19 policy without extensive data to inform their decisions. To better understand the relationship between IHE policy and COVID-19 mitigation, we collected data on testing, cases, and campus policy from over 1,400 IHEs in the United States and analyzed the number of COVID-19 infections and deaths in the counties surrounding these IHEs. Our study found that counties with IHEs that remained primarily online experienced fewer cases and deaths during the Fall 2020 semester—controlling for age, race, income, population, and urban/rural designation. Among counties with IHEs that did return in-person, we see fewer deaths in counties with IHEs that reported conducting any on-campus testing compared to those that reported none. Our study suggests that campus testing can be seen as another useful mitigation policy and that allocating additional resources to IHEs to support efforts to regularly test students and staff would be beneficial to controlling the spread of COVID-19 in the general population. Citation: Klein B, Generous N, Chinazzi M, Bhadricha Z, Gunashekar R, Kori P, et al. (2022) Higher education responses to COVID-19 in the United States: Evidence for the impacts of university policy. PLOS Digit Health 1(6): e0000065. https://doi.org/10.1371/journal.pdig.0000065 Editor: Yuan Lai, Tsinghua University, CHINA Received: December 5, 2021; Accepted: May 18, 2022; Published: June 23, 2022 Copyright: © 2022 Klein et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 
Data Availability: The dataset and Python code to reproduce the analyses and construction of the database are available at https://github.com/jkbren/campus-covid. Funding: A.V. and M.C. acknowledge support from COVID Supplement CDC-HHS-6U01IP001137-01 and Cooperative Agreement no. NU38OT000297 from the Council of State and Territorial Epidemiologists (CSTE). A.V. acknowledges support from the Chleck Foundation. N.G. acknowledges LA-UR-21-25928. The findings and conclusions in this study are those of the authors and do not necessarily represent the official position of the funding agencies, the National Institutes of Health, or U.S. Department of Health and Human Services. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: S.V.S. holds unexercised options in Iliad Biotechnologies. This entity provided no financial support associated with this research, did not have a role in the design of this study, and did not have any role during its execution, analyses, interpretation of the data and/or decision to submit. Introduction Younger adults account for a large share of SARS-CoV-2 infections in the United States, but they are less likely to become hospitalized and/or die after becoming infected [1–5]. Mitigating transmission among this population could have a substantial impact on the trajectory of the COVID-19 pandemic [2]; younger adults typically have more daily contacts with others [6–8], are less likely to practice COVID-19 mitigation behaviors [9, 10], are more likely to have jobs in offices or settings with more contacts with colleagues [11], and travel at higher rates [12–14]. Additionally, in the United States, over 19.6 million people attend institutes of higher education (IHEs; i.e., colleges, universities, trade schools, etc.)
[15], where students often live in highly clustered housing (e.g. dorms), attend in-person classes and events, and gather for parties, sporting events, and other high-attendance events. Because of this, the COVID-19 pandemic presented a particular challenge for IHEs during the Fall 2020 semester [16–21]. On the one hand, bringing students back for on-campus and in-person education introduced the risk that an IHE would contribute to or exacerbate large regional outbreaks [22–30]; on the other hand, postponing students’ return to campus may bring economic or social hardship to the communities in which the IHEs are embedded [31–34], since IHEs are often large sources of employment for counties across the United States. As a result, IHEs instituted a variety of “reopening” strategies during the Fall 2020 semester [35–44]. Among IHEs that brought students and employees back to campus—either primarily in person or in a “hybrid” manner—we see different approaches to regularly conducting (and reporting) COVID-19 diagnostic testing for students, faculty, and staff throughout the semester. Most of these policies were designed to minimize spread within the campus population as well as between the IHE and the broader community. These policies include testing of asymptomatic students and staff, isolating infectious students, quarantining those who were potentially exposed through contact tracing, extensive cleaning, ventilation, mask requirements, daily self-reported health assessments, temperature checks, and more [45, 46]. As with much of the COVID-19 pandemic [47], these policies were often instituted in a heterogeneous manner, with varying levels of severity [37], which makes studying their effects both important and challenging. Studying the various differences between these policies is made even more difficult because of the lack of a centralized data source and standardized reporting style. 
On top of that, counties with IHEs represent a wide range of demographics (age, income, race, etc.) [48], which must be accounted for when comparing any policies, since these factors have known associations with an individual’s likelihood of hospitalization or death [49, 50]. Many IHEs developed and maintained “COVID dashboards” [51] that update the campus community about the number of COVID-19 cases reported/detected on campus and, if applicable, the number of diagnostic tests conducted through the IHE. Here, we introduce a dataset of testing and case counts from over 1,400 IHEs in the United States (Fig 1), and we use this dataset to isolate and quantify the impact that various IHE-level policies may have on the surrounding communities during the Fall 2020 semester (August to December, 2020). After a matched analysis of statistically similar counties, we show that counties with IHEs that reopened for primarily in-person education had a higher number of cases and deaths than counties with IHEs that did not. Among IHEs that did allow students back on campus, we see fewer cases and deaths on average if the county contains IHEs that conduct on-campus COVID-19 testing. We further examine this result by focusing on data from IHEs in Massachusetts, where we find that cities with IHEs that test more also have fewer average cases per capita. This pattern holds regardless of the number of cases detected among members of the campus community. These results point to a benefit of large-scale, asymptomatic testing of the campus community (students, faculty, staff, etc.), which can be especially important in regions without (or with fewer) local mitigation policies in place. Fig 1. Description of the Campus COVID Dataset. Map of the 1,448 institutes of higher education included in the Campus COVID Dataset.
The dataset includes semester-long time series for 971 institutes of higher education (see S1 Text for several examples), in addition to 477 that have cumulative data only (i.e. one sum for the total testing and/or case counts for the Fall 2020 semester). County and state boundary maps downloaded from the United States Census TIGER/Line Shapefiles [52]. https://doi.org/10.1371/journal.pdig.0000065.g001 Discussion The COVID-19 pandemic required governments and organizations to implement a variety of non-pharmaceutical interventions (NPIs), often without a thorough understanding of their effectiveness. Policy makers had to make difficult decisions about which policies to prioritize. While a body of literature has emerged since the beginning of the pandemic about measuring the effectiveness of NPIs [62–65], to date there have been no studies that attempt to measure the effectiveness of campus testing systematically nationwide. This study sheds light on this topic by directly measuring the impact of campus testing on county-level COVID-19 outcomes. We collected data from 1,448 colleges and universities across the United States, recording the number of tests and cases reported during the Fall 2020 semester; by combining this data with standardized information about each school’s reopening plan, we compared differences in counties’ COVID-19 cases and deaths, while controlling for a number of demographic variables. We used an entropy minimization approach to create two groups of counties that were as similar as possible with respect to demographic variables of interest (e.g., age, income, ethnicity) in order to minimize confounding. The resulting groups had a similar number of counties per group, were spatially heterogeneous, and did not ultimately include counties from regions that experienced early surges in March, 2020 (e.g., counties in New York City; see S1 Text), which could have confounding effects.
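The entropy-based similarity measure behind this matching (the Jensen-Shannon divergence; see Fig A in S1 Text) can be sketched in a few lines. The demographic shares below are hypothetical illustrative values, not the paper's actual county distributions:

```python
import math

def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two discrete distributions.
    Ranges from 0 (identical) to 1 (disjoint support)."""
    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability terms
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical age-bracket shares for the two matched county groups
# (illustrative numbers only; each distribution sums to 1)
in_person_counties = [0.20, 0.30, 0.25, 0.15, 0.10]
online_counties = [0.22, 0.28, 0.26, 0.14, 0.10]

jsd = jensen_shannon_divergence(in_person_counties, online_counties)
print(f"JSD between groups: {jsd:.5f}")
```

A well-balanced matching drives this quantity toward zero across all demographic variables simultaneously, which is the criterion the threshold selection in S1 Text minimizes on average.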
Our results show that COVID-19 outcomes were worse in counties with IHEs that reported no testing and in counties where IHEs returned to primarily in-person instruction during the Fall 2020 semester. These findings support the CDC recommendation to implement universal entry screening before the beginning of each semester and serial screening testing when capacity is sufficient [66], and are in line with smaller-scale, preliminary results from other studies [67, 68]. While this study does not look at optimal testing strategies, it offers evidence of the protective effects of campus testing in any form, and of reopening status, on county COVID-19 outcomes. The COVID-19 pandemic highlighted the importance of data standardization, not only for understanding the impact of the virus but also for informing response, resource allocation, and policy. While much attention has been given to this topic for data reported by healthcare and public health organizations, little attention has been given to COVID-19 case and testing data reported by IHEs. A significant portion of the effort undertaken by this study was spent compiling and standardizing the data across IHEs nationwide. Where IHEs did report campus testing data, ease of access varied widely, and different metrics for cases and testing were often reported. For example, some IHEs would report only active cases, cumulative cases, or the number of isolated individuals. Similarly, some reports made no distinction between the types of test administered and included no temporal information on when tests were given. In their campus testing guidance [66], the CDC should also include recommendations on data standards and reporting formats.
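As a concrete illustration of one such standardization step, dashboards that published only running totals had to be converted to daily new-case counts before smoothing or comparison. A minimal sketch with hypothetical numbers:

```python
# Hypothetical cumulative case counts, one value per day, as a campus
# dashboard might report them
cumulative = [0, 3, 3, 10, 18, 18, 25]

# Daily new cases: first-difference the cumulative series, clipping
# occasional negative dips (dashboard corrections) to zero
daily = [max(curr - prev, 0)
         for prev, curr in zip([0] + cumulative[:-1], cumulative)]

print(daily)       # [0, 3, 0, 7, 8, 0, 7]
print(sum(daily))  # 25, matching the final cumulative total
```

Every IHE-specific reporting quirk (active-case counts, weekly batches, untyped tests) required a similar small transformation before series could be aggregated, which is why standardized reporting formats would substantially lower the cost of this kind of analysis.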
While COVID-19 cases in the United States are lower than the peaks of January 2021 and January 2022, concerns remain about outbreaks caused by newly emerging variants, ongoing transmission in the rest of the world, vaccine hesitancy, and the possibility of waning effectiveness of the current vaccines [69, 70]. In regions like the Mountain West and South, at the time of writing, vaccination rates remain disproportionately low among younger adults and the general population when compared to nationwide averages [71]. States in these same regions are also disproportionately represented among the states with the lowest IHE testing in our dataset. Heterogeneity in vaccine uptake—and in policy response broadly—makes it challenging to disentangle the effectiveness of any one specific policy response. On the one hand, further data collection on policy compliance (e.g. through online or traditional survey methods, digital trace data collection, etc.) may help to elucidate specific effects of different policies. On the other hand, because most of the current study focused on a period before widespread vaccine availability (and before the widespread impact of more transmissible SARS-CoV-2 variants), the Fall 2020 semester may in fact have been an ideal time to pose the questions in this work. In sum, given the number of younger adults enrolled in IHEs, the increased mobility and international nature of this population, and the fact that this population is less likely to practice COVID-19 mitigation behaviors, campus testing represents another effective control policy that IHEs and counties should consider in order to keep COVID-19 incidence low. Data & methods Data collection and sources County-level case data are from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [72]. County-level population and demographic data are from the 2018 American Community Survey (ACS) [73].
Weekly data for testing and case counts in Massachusetts cities are from the Massachusetts Department of Public Health [59]. Data about IHEs—including the number of full-time students and staff, campus location, institution type, etc.—come from the Integrated Postsecondary Education Data System (IPEDS) via the National Center for Education Statistics [74]. Data about individual IHEs’ plans for returning to campus (i.e., online only, in-person, hybrid, etc.) come from the College Crisis Initiative at Davidson College [37]. This dataset classifies IHEs based on the following categories, which we use to create three broader categories (in parentheses): “Fully in person” (primarily in-person), “Fully online, at least some students allowed on campus” (primarily online), “Fully online, no students on campus” (primarily online), “Hybrid or Hyflex teaching” (hybrid), “Primarily online, some courses in person” (primarily online), “Primarily in person, some courses online” (primarily in-person), “Primarily online, with delayed transition to in-person instruction” (primarily online), “Professor’s choice” (hybrid), “Simultaneous teaching” (hybrid), “Some of a variety of methods, non-specific plan” (hybrid). We did not include “hybrid” IHEs in our analyses here, but they remain an interesting avenue for future research using the Campus COVID Dataset, which we strongly encourage. The campus COVID dataset The Campus COVID Dataset was collected through a combination of web scraping, manual data entry, and communication with administrators at IHEs. In sum, the process involved collecting thousands of URLs of the COVID-19 dashboards (or analogous websites) of over 4,000 IHEs, which we then used for manual data collection, inputting time series of case counts and testing volume between August 1 and December 16, 2020.
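The classification-to-category mapping listed above can be written as a simple lookup table. This is a sketch of the collapse described in the text; the repository's own code may structure it differently:

```python
# College Crisis Initiative label -> broader category used in this study
REOPENING_CATEGORY = {
    "Fully in person": "primarily in-person",
    "Fully online, at least some students allowed on campus": "primarily online",
    "Fully online, no students on campus": "primarily online",
    "Hybrid or Hyflex teaching": "hybrid",
    "Primarily online, some courses in person": "primarily online",
    "Primarily in person, some courses online": "primarily in-person",
    "Primarily online, with delayed transition to in-person instruction": "primarily online",
    "Professor's choice": "hybrid",
    "Simultaneous teaching": "hybrid",
    "Some of a variety of methods, non-specific plan": "hybrid",
}

def broad_category(c2i_label: str) -> str:
    """Collapse a College Crisis Initiative label into one of three broad categories."""
    return REOPENING_CATEGORY[c2i_label]

# "hybrid" IHEs were excluded from the matched analyses in this study
print(broad_category("Hybrid or Hyflex teaching"))  # hybrid
```

Collapsing ten fine-grained labels into three categories trades detail for statistical power: it yields two comparable groups ("primarily in-person" vs. "primarily online") large enough for the matched county analysis.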
The data for each IHE is stored in its own Google Sheet (indexed by a unique identifier, its ipeds_id), the URL of which is accessible through a separate Reference sheet. For full details on the data collection process, see S1 Text. Statistical controls for mitigation policies While the two groups of counties—the “primarily in-person” vs. “primarily online” counties—are broadly similar across demographic categories (Fig B in S1 Text), there could still be underlying differences between the two groups that influence their different COVID-19 outcomes. For example, this could happen if the two groups differed in the extent to which they enacted mitigation policies (i.e., if there were a common variable influencing whether a given county introduced mitigation policies as well as whether IHEs in the county remained primarily online vs. in-person during the Fall 2020 semester). There are a number of possible sources of this variability, ranging from differences in population density [75] to differences in messaging from political leaders [76]. In the model below, we include data about voting patterns in the 2020 presidential election in order to control for potential biases arising from differences in political behavior at the county level. To control for potential biases arising from differences in local mitigation policies, we assigned each county an “active mitigation policies” score based on policy tracking data from the Oxford COVID-19 Government Response Tracker [57]. These are daily time series data indicating whether or not a number of different policies were active on each day for a given state. Not only does this dataset list the presence or absence of a given policy, it also includes information about severity (e.g. restrictions on gatherings of 10 people vs. restrictions on gatherings of 100 people, or closing all non-essential workplaces vs. closing specific industries, etc.). From these indicator variables, Hale et al.
(2021) define a summary “stringency index” that characterizes the daily intensity of the mitigation policies that a given region is undergoing over time. We include this “stringency index” variable in a Generalized Linear Model (GLM) regression to quantify the extent to which this time series of policy measures—along with data about IHE testing and enrollment policy, demographic data about the county itself, and average temperature—predicts COVID-19-related deaths (Table 1). After controlling for the variables above, we continue to see a significant negative association between the amount of IHE testing conducted in a county and COVID-19-related deaths, with a 38-day lag. Model specification and further details about the construction and interpretation of the model can be found in S1 Text. Citation diversity statement Recent work has quantified bias in citation practices across various scientific fields; namely, women and other minority scientists are often cited at a rate that is not proportional to their contributions to the field [77–84]. In this work, we aim to be proactive about the research we reference in a way that corresponds to the diversity of scholarship in public health and computational social science. To evaluate gender bias in the references used here, we obtained the gender of the first/last authors of the papers cited here through either 1) the gender pronouns used to refer to them in articles or biographies or 2), if none were available, a database of common name-gender combinations across a variety of languages and ethnicities. By this measure (excluding citations to datasets/organizations, citations included in this section, and self-citations to the first/last authors of this manuscript), our references contain 12% woman(first)-woman(last), 21% woman-man, 22% man-woman, 38% man-man, 0% nonbinary, 4% man solo-author, 3% woman solo-author.
This method is limited in that an author’s pronouns may not be consistent across time or environment, and no database of common name-gender pairings is complete or fully accurate. Supporting information S1 Text. Supporting information. Table A: Current status of the Campus COVID Dataset. In total, the Campus COVID Dataset includes data about more than 1,400 IHEs. To collect these data, we searched among 2,719 IHEs; approximately 40% of these are IHEs with data that we could not find (because the IHE does not collect self-reported positive tests and/or does not conduct campus testing, etc.) or with data that we believe exists but was not being shared publicly by the IHE. There are 971 IHEs with time series of testing and/or case counts for the Fall 2020 semester. If an IHE reported only cumulative testing or case counts, we classify it as “cumulative only”. Table B: Example template for inputting data. Each IHE in the Campus COVID Dataset has a unique URL that leads to a dataframe with this structure. For each date that the IHE reports a number of new cases (“positive_tests” above) or new tests administered (“total_tests” above), we input that value in its corresponding row. For IHEs that report testing and case counts weekly, we insert the data at the first collection date, which makes for more accurate smoothing when performing 7-day averages. If the IHE only reports cumulative cases or tests for the Fall 2020 semester, we leave the “total_tests” and “positive_tests” columns blank and report the “cumulative_tests” and “cumulative_cases” in the “notes” column, which we extract later in the analyses. Table C: Description of variables in Table 1. Where appropriate, we use the “per 100k” designation—the variable’s value divided by county population, multiplied by 100,000. Here “log” refers to the natural log, which we apply to variables that follow heavy-tailed distributions (e.g. income and population density).
Fig A: JSD between distributions of demographic variables. As we vary the threshold for inclusion into the two groups—counties with IHEs that returned primarily in-person for Fall 2020 and counties with IHEs that remained primarily online—the Jensen-Shannon divergence also changes. We select the threshold value that minimizes the average Jensen-Shannon divergence. Fig B: Comparison of county-level demographics between groups. Here, we compare the two groups—counties with IHEs that returned primarily in-person for Fall 2020 and counties with IHEs that remained primarily online—based on distributions of (a) age, (b) race, (c) income, and (d) urban-rural designation. Error bars: 95% confidence intervals. Fig C: Map of counties included in matched analysis. With the exception of California, which includes many primarily online IHEs, there are very few regions where the counties are clustered based on campus reopening strategy. County and state boundary maps downloaded from the United States Census TIGER/Line Shapefiles [52]. Fig D: Distributions of the variables used in the regression in Table 1. Fig E: Example data: Northeastern University. Fig F: Example data: North Carolina State University. Fig G: Example data: University of California-Los Angeles. Fig H: Example data: Purdue University. Fig I: Example data: University of Miami. Fig J: Example data: Georgia Institute of Technology. Fig K: Example data: Duke University. Fig L: Example data: Ohio State University. https://doi.org/10.1371/journal.pdig.0000065.s001 (PDF) Acknowledgments The authors thank Kaitlin O’Leary, Representative Mindy Domb, Timothy LaRock, Daniel Larremore, Maciej Kos, Jane Adams, Rylie Martin, Addie McDonough, Anne Ridenhour, Benjy Renton, and Mike Reed for helpful discussions and additions to the dataset. N.G. acknowledges LA-UR-21-25928.
[1] Url: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000065 Published and (C) by PLOS One. Content license: Creative Commons Attribution 4.0 (CC BY).