(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org. Licensed under Creative Commons Attribution (CC BY) license. url:https://journals.plos.org/plosone/s/licenses-and-copyright

------------

Ruling out pulmonary embolism across different healthcare settings: A systematic review and individual patient data meta-analysis

['Geert-Jan Geersing', 'Julius Center For Health Sciences', 'Primary Care', 'University Medical Center Utrecht', 'Utrecht University', 'Utrecht', 'The Netherlands', 'Toshihiko Takada', 'Department Of General Medicine', 'Shirakawa Satellite For Teaching And Research']

Date: 2022-02

In this large, comprehensive international study including over 35,000 patients with suspected pulmonary embolism (PE) in various healthcare settings, we validated the performance of diagnostic strategies for suspected PE. We observed that the performance of these strategies varied considerably across healthcare settings, likely because of differences in case mix and, thus, PE prevalence. Our findings provide strong evidence on the optimum diagnostic strategy for suspected PE in each care setting, balancing the trade-off between missing PE cases and reducing unnecessary referrals or follow-up.

Clinical implications

Our interpretation of the findings is as follows. The PERC algorithm is safe in self-referral emergency care: when combined with a low clinical impression of PE being present, it allows clinicians to forgo additional testing for PE (notably including D-dimer) in about 1 in every 5 patients, which confirms previous findings [27,28]. In the other settings, where this algorithm appears not to be safe, a diagnostic strategy followed by D-dimer testing is preferred. In primary healthcare, strategies with PTP-adjusted D-dimer showed equal safety and higher efficiency than those with a fixed or age-adjusted D-dimer cutoff, making them an attractive diagnostic strategy overall.
However, in referred secondary care, strategies with PTP-adjusted D-dimer also had better efficiency but showed a considerably higher failure rate (ranging between 2.10% and 3.06%) than those with age-adjusted D-dimer (ranging from 0.65% to 0.81%). Finally, in hospitalized or nursing home care, the observed failure rate was higher than in the other settings, ranging between 1.81% and 5.13%. Moreover, as the wide 95% CIs and PIs show, the precision of our inferences was insufficient to draw firm conclusions in this setting.

When deciding which diagnostic strategy to use, it should be acknowledged that no diagnostic strategy in patients suspected of PE will be completely safe, i.e., yield a “failure rate” of 0%. In fact, even CTPA, which is used as the “reference standard” for PE in modern clinical medicine, is not perfectly safe: the cumulative VTE incidence at 3 months after a normal CTPA (i.e., the “failure rate” of CTPA) was reported to be 1.20% (95% CI 0.48 to 2.60) [29]. Accordingly, it could be argued that any diagnostic strategy with a failure rate around 1% to 2% is as safe as referring all patients for CTPA, and this safety threshold is generally regarded as the adequate standard set by the ISTH. Nevertheless, this safety threshold depends on case mix, as exemplified by a higher cumulative VTE incidence at 3 months following a normal CTPA (6.3%) in patients with a high PTP, i.e., patients with risk factors such as cancer, previous VTE, and immobilization. Thus, the acceptable threshold for the failure rate could be higher in healthcare settings that include more high-risk patients (i.e., high PE prevalence) than in those that include more low-risk patients (i.e., low PE prevalence). Such a prevalence-adjusted threshold for the failure rate has indeed been proposed by the ISTH [9].
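
To make the two quantities in this trade-off concrete, the following is a minimal sketch and not the analysis code of this study. It assumes the usual definitions: the failure rate is the proportion of missed PE cases among patients in whom the strategy considered PE excluded, and the efficiency is the proportion of all patients in whom PE was considered excluded without imaging. The function names, the Wilson score interval, the rule of comparing the upper confidence limit with a threshold, and all counts are illustrative assumptions; a single-sample interval like this is a simplification and does not reproduce pooled multi-study estimates.

```python
import math


def wilson_interval(events, total, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def strategy_performance(n_total, n_excluded, n_missed):
    """Failure rate and efficiency (as fractions) with Wilson intervals.

    n_total    : all patients suspected of PE
    n_excluded : patients in whom the strategy considered PE excluded
    n_missed   : PE cases found among those excluded patients
    """
    if not 0 <= n_missed <= n_excluded <= n_total:
        raise ValueError("expected 0 <= n_missed <= n_excluded <= n_total")
    if n_excluded == 0:
        raise ValueError("failure rate is undefined when no patient is excluded")
    return {
        "failure_rate": n_missed / n_excluded,
        "failure_rate_ci": wilson_interval(n_missed, n_excluded),
        "efficiency": n_excluded / n_total,
        "efficiency_ci": wilson_interval(n_excluded, n_total),
    }


def is_acceptable(perf, threshold):
    """Assumed conservative rule: the upper confidence limit of the
    failure rate must not exceed the chosen threshold (a fraction)."""
    return perf["failure_rate_ci"][1] <= threshold


# Hypothetical counts, not taken from the study.
perf = strategy_performance(n_total=1000, n_excluded=400, n_missed=4)
print(f"failure rate: {perf['failure_rate']:.2%}")
print(f"efficiency:   {perf['efficiency']:.2%}")
print("acceptable at a 3% threshold:", is_acceptable(perf, 0.03))
```

The threshold passed to `is_acceptable` is left to the reader. The sketch only shows how a chosen threshold would be compared with an observed failure rate and its uncertainty.
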
If this prevalence-adjusted threshold were applied to each healthcare setting in this IPD-MA for illustrative purposes, the acceptable failure rate would range between 0.71% and 1.86% in self-referral emergency care, between 0.72% and 1.87% in primary healthcare, between 0.78% and 1.93% in referred secondary care, and between 0.80% and 1.95% in hospitalized or nursing home care. In that case, the optimum strategy (i.e., the most efficient strategy with an acceptable failure rate) may be the PERC algorithm in emergency care, a PTP-adjusted D-dimer strategy in primary healthcare, and an age-adjusted strategy in referred secondary care, while no strategy showed an acceptable failure rate in hospitalized or nursing home care. Nevertheless, as these prevalence-adjusted thresholds are proposed only for planning diagnostic studies rather than for use in clinical practice [9], physicians need to set the acceptable failure rate for their own setting and standards and then choose the optimum diagnostic strategy, which will likely be dictated by clinical context. We believe that our findings can aid that clinical decision-making, balancing the trade-off between safety and efficiency, tailored to the specific setting and case mix in which physicians work and encounter patients suspected of PE. Furthermore, when combined with other factors relevant to the clinical setting in which they are applied (e.g., patient perceptions and demands, availability of imaging studies, and the benefit/cost associated with different recommendations), our findings could be a useful basis for developing a clinical guideline for the diagnosis of PE.

This large-scale international study included over 35,000 patients suspected of PE from a variety of healthcare settings. In addition, we used state-of-the-art statistical methods to quantify the diagnostic performance of currently available diagnostic strategies. For full appreciation, however, some aspects of this study need specific attention.
First, the availability of the items used in each diagnostic strategy differed across the included studies. As such, in the primary analyses, the diagnostic performance of each strategy was assessed in a different set of studies. We therefore added sensitivity analyses that compared the diagnostic strategies directly; these yielded very similar results, supporting the robustness of the primary analyses.

Second, although we defined the categorization of healthcare settings through in-depth discussion among expert panel members, it may still be somewhat arbitrary. We therefore analyzed the relationship between failure rate or efficiency and PE prevalence. We found that both failure rate and efficiency became poorer as PE prevalence increased (i.e., the failure rate rose and the efficiency fell), which supported the robustness of our main finding that the performance of each diagnostic strategy became poorer in healthcare settings with higher PE prevalence.

Third, the YEARS algorithm and the Wells rule with PTP-adjusted D-dimer (PeGED) were less safe in this IPD-MA than in their original studies [15,17]. In most of the included studies, the reference standard for PE was a combination of imaging tests and clinical follow-up, with the decision to refer for imaging guided by the diagnostic strategy under evaluation. However, diagnostic strategies that adapt the D-dimer cutoff to PTP, such as YEARS and PeGED, are more efficient than the other strategies. Accordingly, when these strategies are applied retrospectively in other studies, more patients will have had imaging, rather than clinical follow-up, as the reference standard than in the derivation studies. This likely led to small, possibly insignificant clots being counted as missed PE cases among patients in whom PE could be considered excluded based on a negative PTP-adjusted D-dimer strategy.
This hypothesis is supported by data showing that PE detected by the original Wells rule with a fixed D-dimer cutoff included more subsegmental PE than PE detected by the PTP-adjusted YEARS algorithm [30]. Unfortunately, detailed information about the localisation and extent of diagnosed PE was not available in this IPD dataset.

Fourth, as shown in Table D in S1 Text, different types of D-dimer assay were used in the included studies, which could be a source of between-study heterogeneity. In addition, the performance of diagnostic strategies in each healthcare setting could be affected by variation in D-dimer testing (e.g., the skill of laboratory technicians or the timing of the blood test in relation to patient presentation), which we could not explore in this IPD dataset.

Finally, the studies included in our IPD-MA were conducted between 2000 and 2019. Over those 20 years, the performance of D-dimer testing and imaging studies has evolved. Hence, although we consider the trends in failure rate and efficiency of the diagnostic strategies in our findings to be valid and representative, their applicability to today’s patients should be interpreted with some caution.

[1] Url: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1003905