(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org. Licensed under Creative Commons Attribution (CC BY) license. url:https://journals.plos.org/plosone/s/licenses-and-copyright ------------ A nowcasting framework for correcting for reporting delays in malaria surveillance ['Tigist F. Menkir', 'Center For Communicable Disease Dynamics', 'Department Of Epidemiology', 'Harvard T.H. Chan School Of Public Health', 'Harvard University', 'Boston', 'Massachusetts', 'United States Of America', 'Horace Cox', 'Vector Control Services'] Date: 2022-01 Abstract Time lags in reporting to national surveillance systems represent a major barrier for the control of infectious diseases, preventing timely decision making and resource allocation. This issue is particularly acute for infectious diseases like malaria, which often impact rural and remote communities the hardest. In Guyana, a country located in South America, poor connectivity among remote malaria-endemic regions hampers surveillance efforts, making reporting delays a key challenge for elimination. Here, we analyze 13 years of malaria surveillance data, identifying key correlates of time lags between clinical cases occurring and being added to the central data system. We develop nowcasting methods that use historical patterns of reporting delays to estimate occurred-but-not-reported monthly malaria cases. To assess their performance, we implemented them retrospectively, using only information that would have been available at the time of estimation, and found that they substantially enhanced the estimates of malaria cases. Specifically, we found that the best performing models achieved up to two-fold improvements in accuracy (or error reduction) over known cases in selected regions. Our approach provides a simple, generalizable tool to improve malaria surveillance in endemic countries and is currently being implemented to help guide existing resource allocation and elimination efforts. Author summary Infectious disease monitoring systems worldwide are often threatened by severe reporting delays, hindering resource planning efforts, particularly for endemic diseases like malaria and in remote localities. In Guyana, a country located in South America, regions facing infrastructural barriers and consisting of highly mobile or socially disadvantaged populations oftentimes face the greatest delays in reporting local health facility data to the country’s central database. To better characterize and address this issue, we analyzed time and spatial trends in malaria reporting delays across administrative regions and implement methods to transform incomplete case data to more accurate current case counts. We first identify factors which significantly spatially overlap with areas marked by increased reporting delays. We then introduce methods which build on prior case information only available at the time of the target month of interest to anticipate the total number of cases to occur that month, a fraction of which are not known until the subsequent month(s). We compared our regional monthly estimates with eventually reported cases and found that they were up to two times more accurate than case counts known at the month of reporting. Our models provide a widely accessible and adaptable tool for improving on malaria surveillance efforts. Citation: Menkir TF, Cox H, Poirier C, Saul M, Jones-Weekes S, Clementson C, et al. (2021) A nowcasting framework for correcting for reporting delays in malaria surveillance. PLoS Comput Biol 17(11): e1009570. https://doi.org/10.1371/journal.pcbi.1009570 Editor: Claudio José Struchiner, Fundação Getúlio Vargas: Fundacao Getulio Vargas, BRAZIL Received: March 29, 2021; Accepted: October 18, 2021; Published: November 16, 2021 Copyright: © 2021 Menkir et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: All input data and code is available at https://github.com/goshgondar2018/guyana_nowcasting. Funding: TFM was supported by the National Institute of General Medical Sciences of the National Institutes of Health [U54GM088558]. PMD was supported by the Fellowship Foundation Ramon Areces. MS was partially supported by the National Institute of General Medical Sciences of the National Institutes of Health [R01 GM130668]. MS and CP acknowledge that they have received institutional research funds from the Johnson and Johnson Foundation. COB was supported by a NIGMS Maximizing Investigator’s Research Award (MIRA) [R35GM124715-02]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist. Author Sharon Jones-Weekes was unable to confirm their authorship contributions. On their behalf, the corresponding author has reported their contributions to the best of their knowledge. Introduction Malaria continues to be one of the most important causes of morbidity and mortality globally—with 405,000 deaths estimated to have occurred in 2018—particularly in poor and rural populations [1]. National malaria surveillance programs depend on timely reporting rates to identify signals of increasing malaria activity and potential outbreaks to ensure that control resources are optimally and most efficiently deployed [2]. However, major barriers to timely surveillance in endemic countries—particularly those in which malaria transmission is concentrated in the most remote, hardest to reach populations—include limited connectivity, modes of digitisation and other context-specific factors that undermine the ability to update surveillance records without significant delays [3]. While infrastructure development is critical to improving the timeliness of case reporting, methodological approaches to adjust for known reporting lags are urgently needed until data systems improve. One such endemic context is Guyana, located in the northern coast of South America and bordering Venezuela, Suriname, and Brazil; one of 21 malaria endemic countries in the Americas [4]. Guyana has a relatively small, sparse population of ~800,000 people [5], with highly seasonal malaria endemicity concentrated in rainforest regions with low accessibility, often only reachable by small plane or boat. While confirmed cases have steeply declined from 2015 to 2017 in South America overall, Guyana has experienced a near 40% escalation in confirmed cases during that period, with 13,936 cases confirmed in 2017 [4]. More than half of infections are caused by Plasmodium vivax; however, the second most common (44%) infecting species, P. falciparum, is tied to more severe outcomes and has been implicated in recent reports of resistance to Artemisinin drugs [6–8]. The recent growth in case counts overall is partially driven by increased importation from surrounding areas, such as Venezuela, linked with migration from social and political crises and to promising employment opportunities in mining [4,9]. In response to the elevated burden of malaria in the nation, the government has boosted investment in malaria prevention, under the direction of the Vector Control Services (VCS) [4,10]. Like many national malaria control programs, one of the priority areas of VCS is surveillance, where regional health offices and the central office in the capital of Georgetown engage in a complex surveillance cascade. Cases are largely identified through a passive surveillance system, i.e. patients presenting with symptoms are diagnosed at local health facilities, and rarely through alternative community testing or outreach programs; a mere 8% of all cases are detected through these auxiliary programs. Health facilities in endemic regions are then required to send their daily case registers for all malaria patients to their corresponding regional health office for processing, before being submitted to the central office in the capital of Georgetown, while health facilities in non-endemic regions can simply send reports to Georgetown [11]. However, we find that many health facilities in endemic areas opt to send their reports directly to Georgetown or through alternative routes, due to issues in road access and other challenges, often resulting in delayed or missed reports (personal communication). Furthermore, while the data is intended to be documented in the system in near real-time, with summary reports sent from regions to the central office via USBs, physical copies of individual case registers from facilities are sometimes delivered to the central office, leading to additional bottlenecks in reporting (personal communication). Together, these challenges contribute to the number of cases recorded in the central database at the end of the month potentially failing to reflect the true burden of disease at a given time, with cases typically reported more than a month after they occur (See Fig 1B). These issues are not only common for malaria control and surveillance programs, but for many other epidemiological reporting systems, hindering the ability of public health programs to implement appropriate and optimal measures for control [3,12–16]. Modeling efforts to translate available reporting data to more meaningful and accurate measures of real-time incidence could help address some of these challenges. PPT PowerPoint slide PNG larger image TIFF original image Download: Fig 1. A. Map of malaria endemic regions and region 4; B. Boxplot reporting the proportion of annually aggregated cases from 2006–2019 reported by the end of the month for each region, with colors corresponding to the regions depicted in the map. Map created using the R package ssplot. https://doi.org/10.1371/journal.pcbi.1009570.g001 A number of analytical approaches aimed at improving estimates through “nowcasting” have been developed to account for reporting delays (frequently referred to as “backfill”). The most widely used are Bayesian estimation and regularized regression, which take advantage of the fact that the timing of delays may be relatively predictable. The Bayesian approach, broadly considered the paragon approach, encapsulates both delay trends given prior known cases and epidemiological trends to estimate disease occurrence and mortality, with several recent extensions implemented to better uncover the true magnitude of COVID-19 cases and deaths [14,15,17–23]. Extensions of these methods, proposed by Bastos et al., Rotejanaprasert et al., and Kline et al. additionally accommodate spatial trends in transmission and delay dynamics [14,17,19]. Relatedly, Reich et al. use a generalized additive model to estimate Dengue Hemorrhagic Fever cases in Thailand, modeling incidence at each time point in a given province as a function of its preceding reported cases and those of provinces with similar activity [16]. Penalized regression methods, such as the ARGO, Net, and ARGONet models proposed by Santillana, Lu, and others have also been used to incorporate a diversity of case reporting and alternative data sources, to learn from previous time trends in delays and transmission, and the spatial dependencies in these trends, to predict rates of influenza-like illness [24]. However, this class of models is only well-suited for contexts in which these varied digital sources are readily available and easily integrated into routine model training, testing, and validation efforts [24]. To address challenges in reporting delays for malaria surveillance in Guyana, we developed nowcasting approaches to estimate monthly cases of malaria in each endemic region, using data from the central VCS database from 2006 to 2019 to develop and test our model. We show that proximity to high-risk vulnerable populations, namely mining sites and Amerindian populations, is associated with reporting delays and that delays are relatively consistent over time. Importantly, our predictions provide extensive improvements in surveillance capacity for remote areas and are presently being used by VCS to help characterize prevailing case counts across regions in the face of added challenges due to COVID-19, to help facilitate the allocation of malaria prevention and control interventions and overall planning initiatives centered on elimination. Historically, malaria has affected groups that primarily live and work in Regions 1, 7, 8, and 9, otherwise known as hinterland regions because they are sparsely populated, forested areas that are difficult to navigate. These characteristics contribute to the ongoing challenge of time lags in reporting of malaria cases which threaten the delivery of an appropriate public health response. The nowcasting tool generates estimates of current malaria burden at the sub-national level based on real-time reporting and a contextual understanding of the malaria epi-trends over the past years. Thus, we illustrate how nowcasting methods offer a general, tractable approach for improving decision-making for malaria control programs in countries that have significant reporting delays. Methods Ethics statement The Ministry of Public Health, Guyana granted access to non-identifiable surveillance data obtained from the Malaria Program /Vector Control Services under IRB exemption. Further, IRB exemption was obtained from Harvard School of Public Health (US), Human Research Protection Program, Protocol number IRB18-1638. We had access to daily case register data from 2006 to 2019 documented at health facilities in each of Guyana’s ten regions and compiled at the central office in Georgetown (region 4), which consists of demographic information, results from tests concerning the identity of the infecting parasite species, and the dates when the patient’s smear was taken, examined, and inputted in the central office. Delays are defined as the days elapsing from when a patient is registered as a case at a health facility and when the patient is documented at the central VCS database. We excluded the 5.7% of observations with implausible recorded date ranges, i.e. individuals whose smear was examined in a health facility following their documentation in the central office in Georgetown. A majority of the 28% of cases indicated as rechecks were noted to have been infected by P. vivax, or mixed infections with P. vivax (64%), as expected, given that P. vivax is the species largely implicated in relapses [25]. While we acknowledge that the inclusion of patients with recurrent infections may hinder a mechanistic analysis of potential transmission patterns in each region, the primary goal of this analysis is to learn from prior clinical reporting trends to arrive at more informed estimates of current case loads, which include all patients, in the face of reporting delays. We assessed the potential influence of rainfall on reporting delays by analyzing precipitation data sourced from East Anglia University’s Climate Research Unit’s gridded dataset repository. Monthly total precipitation data was extracted for each region by year [26]. Locations of second-level administrative regions, or neighborhood democratic councils (NDCs) (n = 116) were identified through a shapefile of NDCs from GADM version 1 (accessed through DIVA-GIS) [27]. We defined NDCs, rather than regions, as our spatial unit of analysis to ensure adequate statistical power, and aggregated median delays across years by locality for each NDC. Connectivity was captured through measures of accessibility (defined as travel times to the “nearest urban centre”) for each region in 2015 estimated in Weiss et al [28]. A shapefile of mapped mines was drawn from the 2005 U.S. Geological Survey Mineral resources data system [29]. Finally, a shapefile of Amerindian settlements was obtained from the LandMark Global Platform of Indigenous and Community Lands through the GuyNode Spatial Data portal [30]. Descriptive analysis We first assessed the stationarity of each region’s monthly delay distributions from 2006 to 2019 through an Augmented Dickey-Fuller test. We further examined potential synchronicities between regions’ monthly delay distributions, using the mean cross-correlation coefficient between the regional time series for monthly delays from 2006 to 2019. For each region, we additionally estimated (a) the Pearson correlation between median reporting delays and regional connectivity in 2015, and (b) the Pearson correlation and cross-correlations between monthly total precipitation levels and median delays from 2006–2019. We focused our analysis on case register data from region 4, where many malaria patients come for treatment but there is no local transmission, and from malaria-endemic regions 1, 7, 8, and 9 (see Fig 1A for map). We used a global Moran’s I index [31], to evaluate if delays across NDCs from 2017–2019 (the most recent three years of data) across all ten regions were spatially autocorrelated. This measure estimates whether, and to what extent, observations in the same neighborhood—defined using a range of possible criteria—share “features” which would indicate spatial autocorrelation [31,32]. We further used the Getis-ord statistic, which compares the sum of characteristics in each neighborhood to their overall mean [32,33], to identify local clusters of NDCs with higher reporting delays. To pinpoint the potential correlates of the observed spatial patterns in delays, we computed the local bivariate Moran’s I statistic for NDC-level delays and density of Amerindian settlements and for NDC-level delays and density of mines. We chose not to compute the local bivariate Moran’s I for NDC-level delays and connectivity, which we expect to exhibit a particularly strong spatial relationship with delays, given that we were restricted to data on connectivity at the scale of regions. For all spatial analyses, we omitted the 2.9% of patients whose localities could not be geocoded. Spatial weights were defined using the queen’s contiguity criterion of order = 1. All significance maps for local autocorrelation tests that we provide report FDR-adjusted p-values. Nowcasting methods We used the following two approaches to estimate revised (i.e. eventually reported) monthly malaria case counts in regions 1,4,7,8, and 9. In all of the proposed approaches, our real-time malaria case count estimations were produced via a data assimilation process, in that only information that would have been available at the time of estimation was used to train our models. The first ‘out-of-sample’ case count estimate for all regions was produced for January of 2007 using historical information available at the time (training set time period) that consisted of data from the previous 12 months, i.e. within 2006. Subsequent estimates were produced by dynamically training the models below, such that as more information became available the training set consisted of a new set of (12-month long) observations. This approach was chosen as a better way to capture the potential time-evolving nature of reporting delays. For all models, we centered the covariates to ensure standardization of inputs. Synchronous data imputation model (DIM) For each region i, we used an elastic net penalized regression, i.e. an ordinary least squares regression (OLS) subject to a combined L1- and L2- penalty [34] assuming a gaussian error distribution to estimate the revised case count for a given month t. The model was fit to data on the number of cases occurring in months t to t-12 that were known by the end of month t and takes the following form (1), where denotes the number of cases occurring in each month t* known by the end of month t. Note that the number of known cases for a given month t increases with the month of separation, as more information accumulates over time. For example, as we would expect to have more information about cases occurring in the prior month than cases occurring in the current month, would exceed . (1) where the h i ′s denote the estimated coefficients for each input variable Data inputs and method of training and testing the simple data imputation model A matrix of case counts from each of the twelve months prior to and including time t (the most up to date case count) that were known by the end of time t was used to dynamically train and test the data imputation model in order to predict the number of revised cases for t. Our models were assessed in their ability to accurately predict the revised case counts, only available at time t+m, where m was typically more than a month later, and were compared to the case counts available at time t. We computed a 3-fold cross validation within each twelve month window in order to identify the tuning parameters that resulted in the lowest mean square error and used these parameters for the subsequent (unseen) monthly prediction. In other words, training of our models was conducted using strictly data that would have been available at the time of prediction. Given the size of the training set was 12 observations, we chose a small number of folds, n = 3 for our cross-validation implementation. We experimented with different number of folds (3, 5, or 10) for cross-validation to explore whether this choice would have an impact on the quality of our predictions and found that these additional choices led to qualitatively similar results, so we report the results of the most cost-efficient approach in the manuscript. Synchronous network models (NM) For each region i, an elastic net regression estimating the revised case count for a given month t was calibrated to data on the number of cases occurring in months t to t-12 that were known by the end of month t, the number of cases occurring in months t-1 to t-12 that were known by the end of the month t in all other regions j≠i, and the total precipitation level in region i at month t, a hypothesized cofactor of reporting activity. We also implemented an elastic net regression which additionally takes as inputs the predicted number of cases in month t for all other regions j from the previous data imputation models. Both network models consider the location where malaria activity is estimated as a node in a network potentially influenced by malaria activity happening in other (potentially neighboring) locations (nodes). The first network model (NM1) takes the following form (2), where θ(t) denotes the total precipitation level for region i at time t. (2) The second network model (NM2) takes the following form (3), where y j (t) denotes the predicted case counts for region j at time t, estimated from the data imputation model for region j (1) (3) Please refer to S1 Diagram for a schematic visualizing the distinct and shared inputs of the two network models. Data inputs and method of training and testing the two network models A matrix of case counts from each of the twelve months prior to and including t that were known by the end of t for region i, case counts from each of the twelve months prior to t that were known by the end of t for all other regions j, and total precipitation levels for region i in month t, was used to dynamically train and test NM1 in order to predict the number of revised cases for t for region i. A matrix of case counts from each of the twelve months prior to and including t that were known by the end of t for region i, case counts from each of the twelve months prior to t that were known by the end of t for all other regions j, DIM-predicted case counts in t for all other regions j, and total precipitation levels for region i in t was used to dynamically train and test NM2 in order to predict the number of revised cases for t for region i. We replicate the same procedure as in the data imputation model to test and train the two network models. We generated ensemble model estimates, consistent with the first ARGONet procedure described in Poirier et al35 by assigning the point estimate for a given month to that of the DIM, NM, and NM2, depending on which model yielded the lowest rRMSE for the prior three months [24,35]. We assigned the mean across models as the point estimate for the first three months of observations. To provide decision-makers with a prospective uncertainty quantification (error bars) of our estimates, we constructed 95% confidence intervals by considering the error (RMSE) of model predictions for the prior 24 months, starting with January 2009 (two years after the first out-of-sample prediction). Subsequently, following an approach outlined in Poirier et al [35] citing work by Yang et al [36] to capture changing uncertainty at different points in time, we generated the lower and upper limits of the confidence intervals by subtracting or adding the RMSE associated to a moving window of the last 24 errors, respectively, to the upcoming point estimate [35,36]. This confidence interval construction is well-justified as we found that the RMSEs broadly reflected the standard deviation (STD) of the distribution of residuals (predictions—eventually reported or “true” case counts) for each time point (S7 Fig) [35,36], and thus, assuming that our residuals are Gaussian distributed, 95% of residuals should be captured within the mean (the point estimate) plus and minus the STD of residuals. Lastly, we compared our model to a modeling approach detailed by Bastos et al [17]. We chose to compare to the Bastos et al. model given that it 1) presents a Bayesian “chain ladder” alternative to our models that also considers spatial trends and 2) is implemented for a vector-borne disease in a distinct yet neighboring country context in Latin America (Brazil). For simplicity, we used all priors and hyperparameters as defined in their open access code provided in their manuscript [17]. See S1 Diagram for a diagram outlining key steps in this data processing to model implementation process. All data pre-processing, a-spatial and nowcasting analyses were conducted in R version 4.0.3 [37]. All spatial analyses were conducted in ArcMap version 10.6.1 and GeoDa [38,39]. Discussion Our findings highlight the extent to which reporting lags pose a challenge to malaria surveillance in Guyana, with appreciable heterogeneities in the overall magnitude and time trends of delays across areas, and the potential social factors linked to these observed delay patterns. Such trends are likely to be recapitulated in similar settings with fragmented surveillance systems and infrastructural challenges. While improvements to reporting systems, including digitization, infrastructure, and enhanced data management, are critical for malaria control in the medium- and long-term, many national programs will not have the resources to make significant changes in the near future. The nowcasting approach we describe here therefore represents a simple analytical method leveraging data that is often already being collected in order to significantly improve estimation of current malaria cases, at almost no increased investment. Our approach provides an easily accessible tool, which necessitates no additional data requirements and minimal technical training, and can be readily implemented in resource-constrained surveillance contexts. We found convincing evidence for spatial clustering in delay patterns at the level of Neighborhood Democratic Councils, while synchronicities between reporting delays at the level of regions was more moderate. The minimal overlap of regional delay patterns may partially explain the modest improvements when incorporating time trends and predicted case counts in other regions to estimate region-specific case counts as done by the network models. Finally, we observed significant spatial associations between the presence of Amerindian settlements and mines and reporting delays, primarily occurring in regions 1, 7, 8 and 9, with the two overlapping in region 1. This finding suggests that issues in timeliness of reporting are most acutely distributed among socially remote and resource-limited communities. In contrast, associations between precipitation and delays (from 2006–2019) were marginal or non-intuitive (i.e. increased rainfall associated with lower delays). We note that this may be partially attributed to the fact that increased rains affecting driving conditions may undermine reporting timeliness at both a finer spatial scale than regions, or even neighborhood democratic councils, (i.e. with particular roads and towns threatened due to locally-specific infrastructural challenges) and at a finer temporal scale than months. Thus, these patterns are obscured in region- and monthly-level associations. Furthermore, the source of interpolated precipitation data may not fully represent real-world behavior, as “regions with sparse support [are affected by] patchy coverage”, as is the case in several areas within Guyana [26]. Finally, the association between connectivity and delays in 2015 was minimal and non-significant. This finding can be explained by the significant variation in the level of connectivity—both in terms of road access and other forms of travel—observed within regions, which is again diluted in region-level associations. The data imputation models reported relatively low error rates, suggesting that more simple models for each region that simply learn from their own autoregressive trends may be practical and straightforward to implement. While there was considerable variation in DIM model performance across regions, it is promising that the model achieved improvements on underestimated cases in regions 1 and 8, the regions with the lowest fraction of cases reported within a month, and thus have the greatest need for improved real-time surveillance. Furthermore, while the network models for all regions vastly outperformed the corresponding data imputation models over the entirety of the study period, ensemble models which dynamically processes predictions from all models to pinpoint the ideal model may be an easier alternative to report. However, issues related to selecting the error processing period to consider and effectively communicating the mechanics of the ensemble to a broader audience, as previously discussed, should be paid heed. Possible extensions to the network models to markedly improve their predictive power include accounting for other covariates that may better explain trends in reporting rates, such as access to waterways, another important source of travel. Additionally, the NMs may be best implemented using data on the previous distribution of known case counts by NDC (rather than by region), or at a more local spatial unit for which stronger relationships in reporting delays across locations are observed. Finally, to address the presence of negative case predictions, we explored log transforming the target observations, but this methodological approach did not systematically improve the performance of our models. There are a number of additional constraints of our modeling approach that should be addressed in future applications of our work and extensions to other settings. First, our models are not well-suited for uncovering case counts for newly emerging diseases, as they require sufficient historical data to feasibly learn from reporting trends. Additionally, our models do not allow flexible and non-linear relationships between the number of revised cases in a given region in month t and each the relevant predictors, such as potential interactions between the known case counts in month t-2 and month t-3, or between known case counts in month t-4 and precipitation (or other climate-related and other relevant covariates) in month t. In future work, non-parametric methods like regression trees can instead be implemented to avoid these potentially erroneous model assumptions, fitting ensemble methods like random forest regression trees to data on prior known case counts in each region and total precipitation levels to identify those which contribute most significantly to accurate estimates. Such methods may be particularly useful when including multiple additional variables, such as climatological factors or other informative inputs beyond case reporting data. However, it is worth noting that ensemble tree-based approaches may come with the added cost of reduced accessibility, in contrast to more model-based approaches like elastic net regression. Furthermore, such methods require a number of decisions on hyperparameters, such as the number of trees used, tree depth, fraction of input variables considered, among others, in contrast to the sole choice of the lambda and alpha parameters for our elastic net models. It is important to underscore that our approach can only be applied when there is sufficient data for model training and testing, as all models for region 9 failed to run due to sparse data (total number of cases and thus, cases known at each prior time point), limiting its application to areas with a sufficient number of malaria cases. However, despite these challenges, we believe that data imputation and network model approaches, which learn from time and spatial patterns in reporting, can greatly improve the current state of surveillance efforts, particularly for areas where data quality is poorest and remoteness lead to significant reporting delays. We note that due to the moving training window approach, our model is intrinsically able to capture existing reporting trends each year, i.e. scale of underreporting, over time. Our estimates are currently being integrated in the National Malaria Programme in Guyana, which has prioritized surveillance as a key intervention to help in the achievement of malaria elimination. Their use has afforded the opportunity for policymakers to use close-to-accurate estimates to drive planning, resource allocation and implementation of malaria elimination activities. The potential for impact is far-reaching and serves as a suitable tool for use in other countries within the Guyana Shield as well as the Region of the Americas. Acknowledgments This work is dedicated to Sharon Jones-Weekes, who has devoted more than twenty years of service and research to the Guyana Vector Control Services surveillance program. We thank Ayesha Mahmud and Nishant Kishore for their helpful suggestions and support and Marcia Castro for providing guidance during early explorations of some of our spatial analyses. We are also indebted to the entire VCS team in the central office and Bartica District Hospital for all their work in facilitating this data collection effort and for detailing and contextualizing the process of reporting. [END] [1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009570 (C) Plos One. "Accelerating the publication of peer-reviewed science." Licensed under Creative Commons Attribution (CC BY 4.0) URL: https://creativecommons.org/licenses/by/4.0/ via Magical.Fish Gopher News Feeds: gopher://magical.fish/1/feeds/news/plosone/