(C) PLOS One This story was originally published by PLOS One and is unaltered. Effect of SARS-CoV-2 digital droplet RT-PCR assay sensitivity on COVID-19 wastewater based epidemiology [1] ['Sooyeol Kim', 'Dept Of Civil', 'Environmental Engineering', 'Stanford University', 'Stanford', 'Ca', 'United States Of America', 'Marlene K. Wolfe', 'Rollins School Of Public Health', 'Emory University'] Date: 2022-11 Abstract We developed and implemented a framework for examining how molecular assay sensitivity for a viral RNA genome target affects its utility for wastewater-based epidemiology. We applied this framework to digital droplet RT-PCR measurements of SARS-CoV-2 and Pepper Mild Mottle Virus genes in wastewater. Measurements were made using 10 replicate wells, which allowed for high assay sensitivity and therefore enabled detection of SARS-CoV-2 RNA even when COVID-19 incidence rates were relatively low (~10⁻⁵). We then used a computational downsampling approach to determine how using fewer replicate wells to measure the wastewater concentration reduced assay sensitivity and how the resultant reduction affected the ability to detect SARS-CoV-2 RNA at various COVID-19 incidence rates. When the percent of positive droplets was between 0.024% and 0.5% (as was the case for SARS-CoV-2 genes during the Delta surge), measurements obtained with 3 or more wells were similar to those obtained using 10. When the percent of positive droplets was less than 0.024% (as was the case prior to the Delta surge), 6 or more wells were needed to obtain results similar to those obtained using 10 wells. When the COVID-19 incidence rate is low (~10⁻⁵), as it was before the Delta surge, and SARS-CoV-2 gene concentrations are <10⁴ cp/g, using 6 wells will yield a detectable concentration 90% of the time. 
Overall, results support an adaptive approach where assay sensitivity is increased by running 6 or more wells during periods of low SARS-CoV-2 gene concentrations, and 3 or more wells during periods of high SARS-CoV-2 gene concentrations. Citation: Kim S, Wolfe MK, Criddle CS, Duong DH, Chan-Herur V, White BJ, et al. (2022) Effect of SARS-CoV-2 digital droplet RT-PCR assay sensitivity on COVID-19 wastewater based epidemiology. PLOS Water 1(11): e0000066. https://doi.org/10.1371/journal.pwat.0000066 Editor: Ricardo Santos, Universidade Lisboa, Instituto superior Técnico, PORTUGAL Received: August 31, 2022; Accepted: October 25, 2022; Published: November 16, 2022 Copyright: © 2022 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: All data are submitted to the Stanford Digital Repository and are available publicly. The link to the data is: https://exhibits.stanford.edu/data/catalog/km637ys9238. Funding: This study was supported by the CDC Foundation (to ABB), NSF RAPID (CBET-2023057 to ABB), and by the Epidemiology and Laboratory Capacity for Infectious Diseases Cooperative Agreement (no. 6NU50CK000539-03-02 to CDPH colleagues) from CDC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: D. Duong, V. Chan-Herur, and B. White are employees of Verily Life Sciences. Introduction Wastewater-based epidemiology for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is becoming an increasingly important tool in monitoring coronavirus disease 2019 (COVID-19) incidence rates in communities. 
Monitoring programs that use samples from publicly owned treatment works (POTWs) [1, 2] and from sewer conveyances [3] have actively been implemented to supplement clinical testing data. Wastewater surveillance has the advantage of providing insights into population health by overcoming limitations of clinical testing, such as test seeking behavior and test availability. Wastewater surveillance has also been used to gain insight into the epidemiology of other respiratory viruses such as influenza A virus [4] and respiratory syncytial virus [5] (RSV), as well as gastrointestinal pathogens such as hepatitis A virus [6] and Salmonella [7]. Therefore, wastewater surveillance is likely to become increasingly useful for assessing various aspects of community health beyond COVID-19. SARS-CoV-2 RNA concentrations in wastewater, whether measured in the solid or liquid phase or using quantitative or digital reverse-transcription polymerase chain reaction (RT-PCR), correlate to COVID-19 laboratory incidence rates in the population contributing to the wastewater [8–10]. The lower detection limit, or the sensitivity, of any method for detection of SARS-CoV-2 RNA, or any other disease target, will dictate the lowest levels of disease occurrence that can be detected using wastewater. Digital (RT-)PCR can be a sensitive method for detecting disease targets in wastewater. Digital (RT-)PCR methods divide the entire (RT-)PCR solution (master mix, primers, probes and template) into a large number of partitions (droplets or physical partitions in a plate) such that each partition likely contains only one copy of template nucleic acid. By increasing the number of partitions (assuming their volume remains constant) and the associated reaction volume, the analytical sensitivity of the measurement can be increased. 
For most digital (RT-)PCR platforms, the number of partitions can be increased by increasing the number of wells on a 96-well plate used to analyze samples; the results from partitions generated by all replicate wells are merged to compute the final measurement. In this context, a well is an aliquot of fluid from which partitions (droplets for droplet digital PCR) are generated. However, increasing the number of wells used for each sample to improve sensitivity increases project reagent costs. In a previous study, we compared the lowest measurable concentration of SARS-CoV-2 RNA reported by different laboratories using different pre-analytical methods and digital RT-PCR, and the detection limit decreased as the number of replicate wells used in the method increased [11]. Authors have reported using between one and ten merged wells without fully justifying how the number of merged wells was selected [2, 11, 12]. To survey studies that investigated the effect of replicate wells on digital (RT-)PCR sensitivity in environmental sample measurements, we conducted a literature review with the following keywords using Web of Science on July 19, 2022: (((ALL = (air OR soil OR water OR wastewater OR environment*))) AND ALL = ("digital PCR" OR dPCR OR ddPCR OR "digital droplet PCR" OR RTddPCR OR RTdPCR OR dRTPCR OR ddRTPCR OR "digital droplet RT-PCR" OR "digital RT-PCR" OR RT-ddPCR OR RTdd-PCR OR dd-RTPCR OR ddRT-PCR)) AND ALL = (sensitivity OR modeling OR simulation OR "detection limit"). This review resulted in 216 search results: 104 studies passed initial title and abstract screening where the inclusion criteria required a description of digital (RT-)PCR method development or investigation of digital (RT-)PCR sensitivity, and only one study [13] passed the full text screening. 
This single study [13] proposed statistical models that describe relationships between different digital PCR parameters, including the number of replicate wells used; however, the study was mostly theoretical and focused on clinical case studies. The remaining studies that passed initial screening mainly focused on comparing digital (RT-)PCR sensitivity to that of other methods like q(RT-)PCR; a few of the papers mentioned the need for replicate PCR samples [14] or the limitation imposed by the small reaction volume of digital PCR [15–19], with one study pointing out that replicate wells can be merged retrospectively for an increase in sensitivity [20]. To date, no study has empirically investigated how the number of replicate wells affects digital (RT-)PCR sensitivity in environmental samples, and how the resultant sensitivity affects applications of the technology for public health decision making. The present study aims to fill this knowledge gap. Given the ongoing importance of wastewater surveillance for COVID-19 disease monitoring, its potential use for other disease surveillance, and the increasing use of digital (RT-)PCR, it is important to better understand how analytical measurement sensitivity is controlled by increasing the number of (constant volume) partitions, and also how this change in sensitivity affects the use of the measurements for wastewater-based epidemiology applications. To achieve this aim, we measured SARS-CoV-2 RNA and Pepper Mild Mottle Virus (PMMoV) RNA in wastewater solids samples using digital droplet RT-PCR (ddRT-PCR) with 10 replicate wells. We then computationally downsampled the wells to investigate how the number of wells, and thus partitions, affects the lowest measurable concentrations of SARS-CoV-2 and PMMoV genes and associations between SARS-CoV-2 gene measurements and disease incidence rates. 
The framework developed herein for examining how molecular assay sensitivity for a viral RNA genome target affects its utility for wastewater-based epidemiology is generalizable to other infectious agents and other analytical approaches for measuring molecular targets. Materials and methods POTWs and data collection Data used in this study were obtained from an ongoing SARS-CoV-2 wastewater monitoring program in California, USA described by Wolfe et al. [2]. Samples were collected and processed daily between June 1, 2021 and August 31, 2021 from four POTWs: 80 samples from City of Davis Wastewater Treatment Plant (Dav) in Davis, 83 samples from South County Regional Wastewater Authority Wastewater Treatment Plant (Gil) in Gilroy, 75 samples from Oceanside Water Pollution Control Plant (Ocean) in San Francisco, and 89 samples from San Jose-Santa Clara Regional Wastewater Facility (SJ) in San Jose (listed in the order of size from smallest to largest). Further details on the POTWs and sampling procedures can be found in Wolfe et al. [2] and in Table A in S3 Text. The data reported herein have not been previously published. The solids samples were processed within 24 hours of collection exactly according to the methods described by Wolfe et al. [2], which are summarized in S1 Text. In brief, dewatered solids were suspended in a buffer, and then 10 replicate aliquots of the buffer containing a suspension of solids were subjected to RNA extraction and purification, followed by inhibitor removal using commercial kits. The RNA from each of the 10 replicates was assayed in a single 20 μL ddRT-PCR well (10 replicate wells total per sample) to determine SARS-CoV-2 N gene and PMMoV gene concentrations; we also measured recovery of spiked-in bovine coronavirus. 
The approach of using 10 replicate RNA extracts from 10 replicate solids aliquots per sample allows us to account for variability inherent in the wastewater solids matrix, as viral RNA may be dispersed heterogeneously throughout the matrix. While we recognize that some laboratories may not have the resources for such an approach, we believe biological replication is superior to technical replication for environmental samples owing to the inherent variability of complex environmental matrices. EMMI reporting guidelines [21], which promote transparency in methodologies and results of controls, were followed in our descriptions below. Data for each individual well were downloaded from QuantaSoft Analysis Pro software (BioRad, CA, version 1.0.596). Samples in which any well had fewer than 10 000 generated droplets were eliminated from our analysis. This eliminated a total of 41 SARS-CoV-2 N gene measurements (12 from Dav, 9 from Gil, 17 from Ocean, and 3 from SJ), resulting in 327 measurements for further analysis (80 from Dav, 83 from Gil, 75 from Ocean, 89 from SJ). COVID-19 epidemiology data Laboratory confirmed incident cases of COVID-19 as a function of episode date were obtained as described previously [2]; see S2 Text for details. Downsampling simulation In order to estimate the SARS-CoV-2 N gene and PMMoV RNA concentration we would have obtained if we had run a smaller number of wells (X = 1–9), we randomly selected X wells from the 10 wells to calculate the resultant concentration:
\[ C = \frac{-\ln\left(1 - N_{\mathrm{pos}}/N_{\mathrm{tot}}\right)}{0.00085\ \mu\mathrm{L}} \]
where N_pos and N_tot are the total numbers of positive droplets and of accepted droplets across the X merged wells, and 0.00085 μL is the volume of a single droplet [22]. If the total number of positive droplets was less than three, the concentration was denoted as not detected (ND). A thousand simulations were conducted for each possible number of merged wells (X = 1–9) for each sample. The resulting concentrations were converted to units of cp/g dry weight using dimensional analysis [2]. 
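The downsampling step can be sketched in Python as follows. This is a minimal illustration, not the authors' R code (which is available from the Stanford Digital Repository). It assumes the standard Poisson estimate for digital PCR and sampling of wells without replacement; the function names and the droplet counts in `example_wells` are hypothetical, and the conversion to cp/g dry weight is omitted because the required factors are described elsewhere [2].

```python
import math
import random

DROPLET_VOLUME_UL = 0.00085   # single-droplet volume quoted in the text
MIN_POSITIVE_DROPLETS = 3     # fewer positives than this is reported as ND


def merged_concentration(wells):
    """Concentration (copies per uL of reaction) from merged wells.

    `wells` is a list of (positive_droplets, total_droplets) pairs.
    Returns None when the merged wells are a non-detect (ND).
    Assumes the standard Poisson estimate for digital PCR.
    """
    positives = sum(p for p, _ in wells)
    total = sum(t for _, t in wells)
    if positives < MIN_POSITIVE_DROPLETS:
        return None
    if positives >= total:
        raise ValueError("all droplets positive: concentration is not estimable")
    return -math.log(1.0 - positives / total) / DROPLET_VOLUME_UL


def downsample(wells, x, n_sim=1000, seed=0):
    """Simulate measurements made with x wells drawn without replacement."""
    rng = random.Random(seed)
    return [merged_concentration(rng.sample(wells, x)) for _ in range(n_sim)]


# Hypothetical droplet counts for one sample measured in 10 wells.
example_wells = [(1, 18500), (0, 19200), (2, 17800), (1, 20100), (0, 18900),
                 (1, 19500), (0, 18000), (2, 19900), (1, 18700), (0, 19300)]
simulated = downsample(example_wells, x=3)
percent_nd = 100.0 * sum(c is None for c in simulated) / len(simulated)
print(f"X = 3 wells: {percent_nd:.1f}% of simulations were ND")
```

Returning `None` for a non-detect, rather than a substituted number, keeps ND distinct from any measured concentration in later summaries.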
From these thousand simulations, we calculated 1) the percent of the simulations that resulted in fewer than 3 positive droplets across merged wells and were designated as ND, 2) the median concentration, and 3) its interquartile range (25th and 75th percentiles). No substitution for ND was made, and it was noted whether the median or interquartile range included ND. Similar analyses were performed for ten randomly selected PMMoV measurements from each POTW; each measurement had at least 10 000 total droplets generated in each well. Simulations were conducted using R (version 4.0.4) implemented using RStudio (version 1.4.1106). Statistical analysis Statistics were computed using R in conjunction with RStudio (see above), using packages pracma and tidyverse for data analysis, and ggplot2 for data visualization. The Shapiro-Wilk test was used to determine whether simulation outputs were normally distributed. The dispersion of the simulation outputs is defined by the interquartile range (IQR). The relative dispersion of the simulation outputs is described as the ratio of the interquartile range to the median. The dispersion and relative dispersion of the simulated concentrations were compared to the standard deviation of the concentration, or the standard deviation normalized by the concentration, respectively, derived from the measurement obtained using 10 wells. The standard deviation of that measurement is the 68% confidence interval as defined by the total error from the instrument software, which includes Poisson error and variation among wells; the total error formula is proprietary and not available from the vendor. Nonparametric Kendall’s tau was used to assess the association between the N gene concentrations in wastewater and laboratory confirmed COVID-19 incidence rates. Kendall’s tau was calculated for both the entire time series and the low incidence month of June. 
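As an illustration of the rank-based association measure, the sketch below computes Kendall's tau in plain Python. It assumes the tau-b variant, which corrects for ties; ties arise whenever several non-detects share one substituted value. The text does not state which variant was used, and the function name and example values are hypothetical.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue            # tied in both variables
            elif dx == 0:
                ties_x_only += 1    # tied in x only
            elif dy == 0:
                ties_y_only += 1    # tied in y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical series: the two lowest concentrations are non-detects that
# share one substituted value, which creates a tie in x.
concentration = [500.0, 500.0, 2.0e3, 9.0e3, 4.0e4]
incidence = [1.0e-5, 1.5e-5, 2.0e-5, 6.0e-5, 2.0e-4]
print(f"tau-b = {kendall_tau_b(concentration, incidence):.3f}")
```

The pairwise loop is quadratic in the number of samples, which is adequate for time series of a few hundred points.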
Half of the theoretical lower measurement limit for each number of wells was substituted for measurements considered NDs. Half of the theoretical lower measurement limit was chosen to represent an average of all the concentrations below the theoretical lower measurement limit. The theoretical lower measurement limit was calculated for each number of merged wells (X = 1–10) by: 1) calculating the concentration resulting from three positive droplets total across merged wells out of 20 000 total accepted droplets (theoretical number of droplets generated) per merged well, and 2) converting the concentration to units of cp/g dry weight using average solid content for samples from each POTW (Table B in S3 Text). Linear regression was used to derive an empirical relationship between log10-transformed COVID-19 laboratory-confirmed incidence rates and log10-transformed measured SARS-CoV-2 N gene concentrations using data obtained by merging ten wells; relationships were quantified for each POTW separately, and for the POTWs in aggregate. For the linear regression, NDs were substituted with half the theoretical lower measurement limit calculated for X = 10. Using the empirical relationship between incidence rate and SARS-CoV-2 RNA concentration for the associated POTW, the lowest detectable COVID-19 incidence rate was estimated based on the calculated theoretical lower measurement limits. A logistic regression was used to model the fraction of samples that were assigned a concentration (versus assigned ND) for X = 1–9 as a function of the true concentration of the sample, defined as the concentration obtained using 10 wells. The concentration corresponding to a detection frequency of 0.5 (C_0.5) was calculated using the regression equation. For this analysis, half of the theoretical lower measurement limit was substituted for the 6 NDs for X = 10. 
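The arithmetic behind the theoretical lower measurement limit and C_0.5 can be sketched as follows. This is an illustrative sketch only: it assumes the standard Poisson estimate for digital PCR, reports the limit in copies per μL of reaction rather than cp/g dry weight (the conversion depends on POTW-specific solids content), and assumes a logistic model of the form logit(p) = intercept + slope × predictor. Whether the predictor is log10-transformed is not stated in the text, so it is left as an option; all function names are hypothetical.

```python
import math

DROPLET_VOLUME_UL = 0.00085   # single-droplet volume quoted in the text
DROPLETS_PER_WELL = 20_000    # theoretical accepted droplets per well
MIN_POSITIVE_DROPLETS = 3     # detection threshold across merged wells


def lower_limit_cp_per_ul(n_wells):
    """Concentration implied by exactly three positives in n_wells wells."""
    total = DROPLETS_PER_WELL * n_wells
    return -math.log(1.0 - MIN_POSITIVE_DROPLETS / total) / DROPLET_VOLUME_UL


def nd_substitute(n_wells):
    """Half the theoretical lower limit, used in place of a non-detect."""
    return lower_limit_cp_per_ul(n_wells) / 2.0


def c_half(intercept, slope, log10_predictor=False):
    """Predictor value at which a fitted logistic curve equals 0.5.

    For logit(p) = intercept + slope * x, p = 0.5 where x = -intercept / slope.
    If the model was fitted on log10(concentration), back-transform.
    """
    x = -intercept / slope
    return 10.0 ** x if log10_predictor else x


for wells in (1, 3, 6, 10):
    print(wells, round(lower_limit_cp_per_ul(wells), 4))
```

Because three positive droplets are spread over ten times as many droplets when ten wells are merged, the limit for X = 10 is close to one tenth of the limit for X = 1.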
All code for simulations and statistics is available through the Stanford Digital Repository (https://purl.stanford.edu/km637ys9238). The Institutional Review Board of Stanford University determined that this project does not meet the definition of human subject research as defined in federal regulations 45 CFR 46.102 or 21 CFR 50.3 and indicated that no formal IRB review is required. Discussion With the growing interest in application of wastewater-based epidemiology to various infectious diseases, it is crucial to understand how sensitivity of the assay used to measure infectious disease targets impacts the ability to use wastewater measurements to represent community disease burden. The COVID-19 pandemic provides a unique opportunity to investigate this relationship as active disease surveillance during the first year and a half of the pandemic provides relatively robust disease incidence data [24]. In this study, we developed a framework for investigating how the number of merged replicate wells in digital RT-PCR affects the lowest measurable concentration and number of non-detects, and in turn influences the use of the measurements to detect low incidence rates in the community, or to infer trends in disease occurrence. Understanding this relationship is important in optimizing surveillance efforts for COVID-19 and other infectious diseases. We measured concentrations of SARS-CoV-2 and PMMoV genes daily in wastewater settled solids at four POTWs in California using ddRT-PCR during a period of time that included both low (~10⁻⁵) and high (>10⁻⁴) COVID-19 incidence rates. We used 10 merged wells for the measurements, and then determined how the measurement would have been affected by using fewer than 10 wells through a down-sampling scheme. 
Our findings indicate that when a large fraction of droplets are positive (> 5% positive), as was observed for PMMoV, a virus found in high quantity in human stool and wastewater [25], concentrations measured using just one well are similar to those obtained using ten when considering the variability associated with the measurements. On the other hand, when a smaller fraction of droplets are positive (< 0.5%), as was the case for the SARS-CoV-2 gene measurements and expected for other human viral gene targets, using fewer wells can result in measurements that may vary from those obtained using 10 wells and produce more measurements characterized as non-detects. For the SARS-CoV-2 gene measurements, variability in measurements increased as the number of wells decreased. Generally, we found that when the fraction of positive droplets was greater than 0.024% (corresponding to a conservative approximate concentration of 10⁴ cp/g dry weight), the variability in the measurement resulting from using 3 or more wells was similar to or smaller than the measurement total error obtained using 10 wells. In contrast, when the fraction of positive droplets was less than 0.024%, the variability in the measurements resulting from using 6 or more wells was similar to or smaller than the measurement total error obtained using 10 wells. These results could guide adaptive analysis plans that use fewer wells to reduce costs when concentrations of SARS-CoV-2 are relatively high. The probability of obtaining a non-detect increased as the SARS-CoV-2 gene concentration decreased and the number of wells used in ddRT-PCR decreased. Using logistic regression, we identified the concentration at which the detection frequency was 0.5 (C_0.5), and this value varies inversely with the number of wells; that is, C_0.5 is higher when fewer wells are used for ddRT-PCR. 
This means that when low concentrations of SARS-CoV-2 genes are expected, using too few wells can result in a large number of non-detects. For example, during the low incidence period of June, if only 1 well had been used instead of 10, 92 of the 113 measurements across four POTWs would have been below C_0.5. We found that at least 6 wells were needed for 90% of the measurements to be above C_0.5 for June 2021. Consistent with other studies, the wastewater concentration showed a positive and significant correlation with 7-day smoothed COVID-19 incidence rates [1–3, 8–12]. When there was variation in COVID-19 incidence rates within the time frame being investigated (here before and during the Delta variant surge), the number of wells being used for the analysis did not affect the magnitude or statistical significance of the correlation. There was a positive and significant correlation even when using only one well because there was enough variation in both variables, although the majority of June measurements were characterized as non-detects. This illustrates that finding a significant correlation between disease incidence and SARS-CoV-2 gene concentrations does not necessarily indicate good measurement sensitivity. It should be noted that while we take the laboratory confirmed COVID-19 incidence rates to be reflective of the level of COVID-19 that the community is experiencing, they are likely an underestimate of incidence rates in the sewershed as the reported incidence rates are dependent on test-seeking behavior and test availability [26]. The results described here on the effect of the number of wells used for ddRT-PCR on sensitivity of PMMoV and SARS-CoV-2 measurements are extendable to other platforms and other gene targets. Increasing the number of wells is analogous to increasing the volume of the PCR reaction (for any PCR method) and increasing the number of (constant volume) partitions for digital PCR applications. 
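The link between the number of constant-volume partitions and the chance of detection can be illustrated with an idealized calculation. The sketch below is not part of the study's analysis: it assumes template molecules are distributed at random among droplets (Poisson loading), 20 000 accepted droplets per well, and the three-positive-droplet threshold, and it ignores the extraction-to-extraction variability that the replicate design captures. The function name and the example concentration are hypothetical.

```python
import math


def detection_probability(conc_cp_per_ul, n_wells,
                          droplets_per_well=20_000,
                          droplet_volume_ul=0.00085,
                          min_positive=3):
    """Probability that merged wells yield at least `min_positive` positives.

    Each droplet is positive with probability 1 - exp(-C * v), so the number
    of positive droplets is binomial over all merged droplets.
    """
    n = n_wells * droplets_per_well
    p = 1.0 - math.exp(-conc_cp_per_ul * droplet_volume_ul)
    p_below = sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
                  for k in range(min_positive))
    return 1.0 - p_below


# Hypothetical reaction concentration of 0.05 copies per uL.
for wells in (1, 3, 6, 10):
    print(wells, round(detection_probability(0.05, wells), 3))
```

Under these assumptions the detection probability rises steeply as wells are merged, mirroring the observation that C_0.5 is higher when fewer wells are used.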
Although uncommon, some researchers have also used a similar approach to increase sensitivity of qPCR by adding the resulting concentrations of replicates [27]. Similarly, the recommendations for increased sensitivity herein apply to other gene targets. For any high copy number target, like PMMoV, increased sensitivity is generally not needed, so efforts to improve sensitivity through replication are unnecessary. Examples of other high copy targets in wastewater matrices include the 16S rRNA and crAssphage genes. For lower copy number targets, or rare targets, increased sensitivity is likely needed, particularly if the results will be used for disease surveillance. Examples include other viral targets like norovirus and rotavirus RNA or bacterial targets like those for Salmonella or Campylobacter. There are a few limitations of this analysis. First, in our analysis, we assumed that the measurement obtained using 10 wells is the “true concentration” and compared all results simulated with fewer than 10 wells to the true concentration and its error from the ddRT-PCR instrument. Second, the results presented herein regarding assay sensitivity, and in particular the C_0.5 values in Table 1, are specific to the methods applied in this study. The relationship between the number of wells used and both the number of non-detects and the lowest measurable concentration will be affected by the pre-analytical and analytical processes used. Additionally, we were able to do a large number of replicates, each with its own extraction to embrace the variability one might expect in environmental samples, which not all labs may be capable of due to cost constraints. However, we would not expect the general trend of reduced sensitivity with fewer merged PCR wells to change if the replication scheme was different. 
Although the specific values in Table 1 are only extendable to other studies using our methods (available on protocols.io [28–30]), the framework for examining the required sensitivity for wastewater surveillance is extendable to all studies. That is, careful attention to how sensitivity affects the lowest measurable concentration and the number of non-detects, as well as the relationships between these values and laboratory confirmed COVID-19 incidence rates, is needed to fully understand how decisions on assay implementation are made. Conclusions We developed and implemented a framework for examining how molecular assay sensitivity for a viral RNA genome target affects its utility for wastewater-based epidemiology. The framework involves understanding how assay sensitivity affects lowest measurable concentrations in units of copies per environmental matrix mass and the detection probability of a target that is present, and how these changes during periods of differing disease occurrence can affect resultant statistical associations between the viral target and measures of disease incidence. We applied this framework to digital droplet RT-PCR (ddRT-PCR) measurements of a SARS-CoV-2 gene made using 10 replicate wells, and determined how using fewer wells affected assay sensitivity and its performance for wastewater-based epidemiology applications. From a reagent cost savings perspective, we recommend an adaptive analytical approach where assay sensitivity is increased by running more replicate wells (6 or more) during periods of low SARS-CoV-2 gene concentrations (using our methods, <10⁴ cp/g) and COVID-19 incidence rate (< 3.5/100 000) and fewer replicate wells (3 or more) during periods of higher SARS-CoV-2 RNA concentrations and COVID-19 incidence. 
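The adaptive recommendation above can be written as a simple decision rule. The sketch below is only an illustration: the thresholds are the ones stated in the text and are specific to the methods used in this study, while the function name, the use of `None` for a non-detect, and the choice of which recent measurement to pass in are hypothetical.

```python
LOW_CONCENTRATION_CP_PER_G = 1.0e4   # threshold stated in the text (cp/g)
WELLS_WHEN_LOW = 6                   # minimum wells when concentrations are low
WELLS_WHEN_HIGH = 3                  # minimum wells when concentrations are high


def recommended_min_wells(recent_cp_per_g):
    """Minimum number of merged wells for the next measurement.

    `recent_cp_per_g` is a recent SARS-CoV-2 gene concentration in cp/g dry
    weight, or None if the recent measurement was a non-detect.
    """
    if recent_cp_per_g is None or recent_cp_per_g < LOW_CONCENTRATION_CP_PER_G:
        return WELLS_WHEN_LOW
    return WELLS_WHEN_HIGH


print(recommended_min_wells(None))    # non-detect: use the higher well count
print(recommended_min_wells(2.5e4))   # above the threshold: fewer wells suffice
```

Treating a non-detect as a low concentration errs on the side of sensitivity, which matches the intent of running more wells when incidence is low.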
While the precise recommendations here are only generalizable if one is using the same pre-analytical and analytical protocols, the framework and the conclusion that adaptive approaches can reduce costs and increase sensitivity during periods of low disease incidence can be applied to other methods and other wastewater-based epidemiology targets. Acknowledgments This study was supported by the CDC Foundation, NSF RAPID (CBET-2023057), and by the Epidemiology and Laboratory Capacity for Infectious Diseases Cooperative Agreement (no. 6NU50CK000539-03-02) from CDC. We thank the California Department of Public Health COVID-19 Wastewater Surveillance, Epidemiology and Data teams for their help with COVID-19 incidence data. We thank Dr. Linlin Li, Michael Balliet, Dr. Pamela Stoddard and Dr. George Han at the County of Santa Clara Public Health Department for provision of case data. Numerous people contributed to sample collection including Payak Sarkar (SJ), Noel Enoki (SJ), Amy Wong (SJ), Alexandre Miot (Ocean), Lily Chan (Ocean), the Oceanside plant operations personnel, Saeid Vaziry (Gil), Chris Vasquez (Gil), and Jeromy Miller (Dav). [END] --- [1] Url: https://journals.plos.org/water/article?id=10.1371/journal.pwat.0000066 Published and (C) by PLOS One Content appears here under this condition or license: Creative Commons - Attribution BY 4.0. via Magical.Fish Gopher News Feeds: gopher://magical.fish/1/feeds/news/plosone/