(C) PLOS One This story was originally published by PLOS One and is unaltered.

Automated assessment reveals that the extinction risk of reptiles is widely underestimated across space and phylogeny [1]

['Gabriel Henrique De Oliveira Caetano', 'Jacob Blaustein Center For Scientific Cooperation', 'The Jacob Blaustein Institutes For Desert Research', 'Ben-Gurion University Of The Negev', 'Midreshet Ben-Gurion', 'Mitrani Department Of Desert Ecology', 'David G. Chapple', 'School Of Biological Sciences', 'Monash University', 'Clayton']

Date: 2022-07

The Red List of Threatened Species, published by the International Union for Conservation of Nature (IUCN), is a crucial tool for conservation decision-making. However, despite substantial effort, numerous species remain unassessed or have insufficient data available to be assigned a Red List extinction risk category. Moreover, the Red Listing process is subject to various sources of uncertainty and bias. The development of robust automated assessment methods could serve as an efficient and highly useful tool to accelerate the assessment process and offer provisional assessments. Here, we aimed to (1) present a machine learning–based automated extinction risk assessment method that can be used on less known species; (2) offer provisional assessments for all reptiles, the only major tetrapod group without a comprehensive Red List assessment; and (3) evaluate potential effects of human decision biases on the outcome of assessments. We use the method presented here to assess 4,369 reptile species that are currently unassessed or classified as Data Deficient by the IUCN. The models used in our predictions were 90% accurate in classifying species as threatened/nonthreatened, and 84% accurate in predicting specific extinction risk categories.
Unassessed and Data Deficient reptiles were considerably more likely to be threatened than assessed species, adding to mounting evidence that these species warrant more conservation attention. The overall proportion of threatened species greatly increased when we included our provisional assessments. Assessor identities strongly affected prediction outcomes, suggesting that assessor effects need to be carefully considered in extinction risk assessments. Regions and taxa we identified as likely to be more threatened should be given increased attention in new assessments and conservation planning. Lastly, the method we present here can be easily implemented to help bridge the assessment gap for other less known taxa. Funding: This work has been funded by the Israel Science Foundation grant Num. 406/19 to SM & UR ( https://www.isf.org.il/ ). This work has been funded by the German-Israeli Foundation for Scientific Research and Development Num. I-2519-119.4/2019 to UR ( https://www.gif.org.il/ ). It has also been partially funded by Australian Research Council grant num. FT200100108 to DGC ( https://www.arc.gov.au/ ). We also thank the Australian Friends of Tel Aviv University–Monash University (‘AFTAM’) Academic Collaborative Awards Program for funding this research to SM & DGC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Here, we use robust machine learning to automatically predict IUCN extinction risk categories to all reptile species globally, to (1) present a new automated assessment framework and (2) provisionally fill the reptile assessment gap. Our methods rely only on readily available data (mostly geographic ranges, phylogenetic structure, and body mass) and estimate potential effects of assessor or reviewer identities. 
We use these methods to assign provisional extinction risk categories to 4,369 reptile species, of which 3,286 are currently unassessed and 1,083 are currently classified as DD. We further explore global trends in extinction risk across all reptiles and highlight the effects of our new provisional categories on overall patterns in this class. Lastly, we highlight potential sources of biases and incongruences in the assessment process. Reptiles remain the only tetrapod group without comprehensive IUCN assessment. As of July 2021, approximately 28% of 11,570 reptile species remain unassessed and approximately 14% of those assessed have been classified as DD [ 1 ]. Moreover, many of the reptile assessments are more than 10 years old, rendering them outdated as per IUCN guidelines [ 1 ]. This assessment gap is not random. Smaller species, with narrow distributions, located in the tropics, are less likely to have been assessed [ 9 ]. Bland and Böhm [ 28 ], and Miles [ 19 ], automatically assessed some reptile species. Their models predicted that approximately 20% of NE and DD species are threatened, a similar proportion to those assessed as such (excluding DD). However, in both studies, models were trained and validated using a small set of species with a wealth of morphological, ecological, and life history data (which are rare for DD species). Such exercises might provide important information on the mechanisms underlying extinction risk. However, these data-hungry methods are greatly limited in their utility because such data are unavailable for the vast majority of DD and NE species (e.g., DD and newly described reptiles, most invertebrate taxa). Ultimately, we need methods that will enable precise automated extinction risk assessments of species, which acknowledge different biases and data gaps. A challenge that remains unaddressed in automated assessment is human decision bias.
Biases are introduced by ambiguities in the interpretation of IUCN guidelines by assessors and reviewers, heterogeneity in assessor expertise levels, and personal agendas [ 26 ]. The IUCN tries to decrease reliance on subjective expert opinions [ 2 ], even employing automated assistance for generating and verifying assessments [ 12 ]. However, expert input (and guidance from the IUCN personnel who lead each workshop) remains an important part of the assessment process. Automated methods that ignore such biases in their training data risk reproducing or even amplifying them in their predictions [ 27 ]. The Red List assigns evaluated species to categories based on their distribution, population trends, and specific threats [ 7 ]. The categories Least Concern (LC) and Near Threatened (NT) are deemed not threatened, while Vulnerable (VU), Endangered (EN), and Critically Endangered (CR) species are deemed threatened. Other species are assessed as Extinct in the Wild (EW), Extinct (EX), or Data Deficient (DD). The DD category is assigned to species for which information is insufficient to assign them any of the above categories. Still, most of global biodiversity remains Not Evaluated (NE) by the Red List. This is predominantly due to the laborious nature of Red List assessments, which are based on voluntary expert participation, usually through multiparticipant in-person meetings [ 7 ]. Importantly, NE and DD species are generally not prioritized for conservation decision-making, although Red List guidelines specifically state that they “should not be treated as if they were not threatened” [ 7 ] and DD species have been shown to be comparable to CR ones with respect to their levels of overlap with human impact [ 8 ]. These assessment gaps [ 9 , 10 ] led to the use of several automated methods to provisionally assess species [ 11 , 12 ].
These methods employ algorithms including phylogenetic regression models [ 13 – 15 ], structural equation models [ 16 ], random forests [ 17 , 18 ], deep learning [ 19 , 20 ], Bayesian networks [ 21 , 22 ], and even linguistic analysis of Wikipedia pages [ 23 ]. Most previous attempts (e.g., [ 13 , 17 , 18 ]) employed a binary classification of threatened (categories CR, EN, and VU) versus nonthreatened (NT and LC). Few studies attempted to predict specific categories (e.g., [ 19 , 20 , 24 ]), which are more useful to decision makers as they enable prioritizing among threatened species. A more comprehensive review of these methods [ 25 ] also calls for attention to obstacles for their implementation in the assessment process. This review argues that a major obstacle for their implementation is the lack of communication between conservation researchers developing such methods and IUCN personnel [ 25 ]. The International Union for Conservation of Nature’s (IUCN) Red List of Threatened Species [ 1 ] is the most comprehensive assessment of the extinction risk of species worldwide [ 2 ]. Since its inception in 1964, the Red List has been instrumental in “generating scientific knowledge, raising awareness among stakeholders, designating priority conservation sites, allocating funding and resources, influencing development of legislation and policy, and guiding targeted conservation action” [ 3 ]. For example, the 2004 completion of IUCN’s Global Amphibian Assessment reported their dire global state [ 4 ] and led to the creation of organizations dedicated to amphibian conservation and to increased funding for research and conservation policy focused on amphibians [ 3 ]. Additionally, the IUCN’s Red List forms a basis for the designation of priority areas for conservation, such as Key Biodiversity Areas [ 5 ]. 
For example, the Alliance for Zero Extinction [ 6 ] works directly with decision-makers to establish protected areas for threatened species represented by a single population, using Red List data. Analysis includes only species that have IUCN assessments (6,520 species). (a) Proportion of reptile species assigned to each extinction risk category for the actual IUCN assessments (Observed); proportion expected if the most optimistic group of assessors assessed every species (Optimistic); proportion expected if the most pessimistic group assessed every species (Pessimistic). (b) Proportion of threatened species in each biogeographical realm for Observed, Optimistic, and Pessimistic assessments. Significant differences in a Pearson’s χ 2 test are indicated by asterisks, colored according to which proportions are being compared ( S11 Table ). The data underlying this figure can be found in S2 Data . AA, Australasian; AT, Afrotropical; CR, Critically Endangered; EN, Endangered; IM, Indomalayan; LC, Least Concern; MA, Madagascan; NA, Nearctic; NT, Near Threatened; NT, Neotropical; OC, Oceanian; PA, Palearctic; VU, Vulnerable. We permuted the identity of assessors and reviewers until we identified the group of assessors and reviewers that would assign each species to the least threatened category possible, while maintaining the other predictors’ values (optimistic scenario) and to the most threatened category possible (pessimistic scenario). Proportions of species predicted as threatened increased from optimistic to observed to pessimistic scenarios for all categories ( Fig 4A , S11 Table ) and across most biogeographical realms. In the Nearctic and Madagascar, the observed and pessimistic scenarios were similar, and in Oceania no differences were detected ( Fig 4B , S12 Table ). 
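The optimistic and pessimistic scenarios described above can be illustrated with a minimal sketch. It assumes one reading of the procedure: a single assessor group is substituted for every species while all other predictors stay fixed, and the groups with the lowest and highest mean predicted category are kept. The `toy_predict` function, the group names, and the range values are hypothetical stand-ins, not the trained model or data from this study.

```python
from statistics import mean

# Ordinal ranks for the specific Red List categories (higher = more threatened).
RANK = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def toy_predict(species, assessor_group):
    """Hypothetical stand-in for a trained classifier (not the paper's model)."""
    # Smaller ranges raise the category; each group adds its own offset.
    area = species["range_km2"]
    base = 3 if area < 1_000 else 1 if area < 20_000 else 0
    offset = {"group_a": -1, "group_b": 0, "group_c": 1}[assessor_group]
    rank = min(max(base + offset, 0), 4)
    return next(cat for cat, r in RANK.items() if r == rank)


def scenario_groups(species_list, groups, predict):
    """Return the (most optimistic, most pessimistic) assessor groups."""
    # Mean predicted rank when one group "assesses" every species,
    # with every other predictor left untouched.
    score = {
        g: mean(RANK[predict(sp, g)] for sp in species_list)
        for g in groups
    }
    return min(score, key=score.get), max(score, key=score.get)


# Made-up species records used only to exercise the sketch.
species = [
    {"name": "sp1", "range_km2": 500},
    {"name": "sp2", "range_km2": 15_000},
    {"name": "sp3", "range_km2": 250_000},
]
groups = ["group_a", "group_b", "group_c"]
optimistic, pessimistic = scenario_groups(species, groups, toy_predict)
optimistic_cats = [toy_predict(sp, optimistic) for sp in species]
pessimistic_cats = [toy_predict(sp, pessimistic) for sp in species]
```

Comparing `optimistic_cats` and `pessimistic_cats` with the observed categories gives the kind of three-way contrast summarized in Fig 4.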
Species that changed category between the observed assessments and the optimistic scenario moved overwhelmingly to a single category (LC), while in the pessimistic scenario, species showed a more diverse distribution of new categories ( S3 Fig ).

The spatial data are grouped by WWF terrestrial ecoregions. The shift between red and blue is proportional to the (symmetric log scale) increase/decrease in extinction risk per ecoregion when using our assessments. Bar plots indicate proportion of species in threatened categories for each biogeographical realm, before and after the inclusion of automated assessments. The data underlying this figure can be found in S2 Data . IUCN, International Union for Conservation of Nature; WWF, World Wide Fund for Nature.

Colors in internal nodes represent the difference in percentages for all descendant tips. Trees by Tonini and colleagues [ 31 ] (Squamata) and Colston and colleagues [ 32 ] (Archelosauria). The shift between red and blue is proportional to the (symmetric log scale) increase/decrease in extinction risk per branch when using our assessments. Branch widths are proportional to log species richness in each clade. Proportions of threatened species for each family, before and after inclusion of automated assessments, are detailed in S9 Table . The data underlying this figure can be found in S2 Data . DD, Data Deficient; NE, Not Evaluated.

The proportion of threatened species increased overall for Squamata and Crocodylia, but decreased for Testudines ( Fig 2 , S9 Table ), especially in the turtle families Chelidae, Chelydridae, and Kinosternidae. Among anguimorph lizards (except Varanidae), the proportion of threatened species decreased following our predictions. The 3 largest lizard clades (Iguania, Scincomorpha, and Gekkota), as well as Lacertoidea except Lacertidae, showed increased threat, as did the largest snake clades (Colubridae, Dipsadinae, Elapidae) and Serpentes as a whole ( Fig 2 , S9 Table ).
Including predictions for DD and NE species, the proportions of threatened species increased in ecoregions across most of South and North America, Australia, and Madagascar ( Fig 3 , S10 Table ). (A) Grouping categories into threatened and nonthreatened and (B) specific extinction risk categories: CR, Critically Endangered; EN, Endangered; LC, Least Concern; NT, Near Threatened; VU, Vulnerable. Number of species in each category is indicated above each bar. Significant differences in a Pearson’s χ 2 test are indicated by asterisks, colored according to which proportions are being compared ( S7 Table ). The data underlying this figure can be found in S2 Data . DD and NE species were significantly more likely to be assigned threatened categories than assessed species (DD: 29%, NE: 26%, assessed non-DD: 21% threatened; Fig 1A , S7 Table ). DD species were more likely than assessed species to be predicted as VU, EN, or CR, and less likely to be predicted as NT or LC. NE species were more likely than assessed species to be VU, and EN, and less likely to be predicted as NT or LC ( Fig 1B , S7 and S8 Tables). We compared our method to similar past endeavors. Our simplest model (“Environment and body mass”; Table 1 ) obtained higher accuracy (88%) than methods based on Random Forest (85%) and Neural Networks (79%), using the same predictors ( S5 Table ). The extreme class imbalance in the dataset greatly hindered both methods, especially Neural Networks ( S5 Table ), despite the use of supersampling to account for uneven class distributions. In fact, Neural Networks are known to be sensitive to such imbalances [ 30 ], while XGBoost is considered more robust to them [ 29 ]. While previous methods have incorporated similar predictors to ours, and have separately incorporated features such as tolerating missing values, identifying specific IUCN categories, and accounting for spatial and phylogenetic autocorrelation, none did so in combination, as our method did ( S6 Table ). 
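The supersampling mentioned above is not specified in detail in this text. As an illustration only, the following sketch shows plain random oversampling, one common way to even out class sizes before training; the function name `oversample` and the toy rows are assumptions made for this example and do not reproduce the procedure used in the study.

```python
import random
from collections import Counter


def oversample(rows, label_of, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Draw with replacement to fill the gap up to the largest class.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy training set, heavily skewed toward LC (made-up rows, not paper data).
train = (
    [("LC", i) for i in range(50)]
    + [("EN", i) for i in range(6)]
    + [("CR", i) for i in range(3)]
)
balanced = oversample(train, label_of=lambda row: row[0])
counts = Counter(label for label, _ in balanced)
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows cannot leak into the data used for validation.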
Our method is also the first to account for assessor bias (as an exploratory tool, not for prediction; S6 Table ). Criterion B for IUCN extinction risk assessments, which is predominantly based on species range sizes [ 7 ], is the most widely used criterion for assigning a threatened status in reptile assessments (74% of species assessed under any criteria). The model trained only on species assessed as threatened based on criterion B, as well as NT and LC species, was more accurate for both binary (93%, AUC: 0.84, Table 1 ) and specific categorizations (87%, AUC: 0.80, Table 1 ). Further, excluding assessor/reviewer effects resulted in similar accuracy (binary classification: 92% accuracy, 0.80 AUC; specific classification: 86% accuracy, 0.78 AUC; Table 1 ). Despite their higher accuracy, these models tended to misclassify non-criterion B–threatened species, assigning them to lower extinction risk categories than observed ( S4 Table ). This is probably because species are only classified under non-B criteria if such criteria assign them to a similar, or higher, extinction risk category. Thus, we proceeded with models trained on all species for the remaining analyses. Our model correctly classified 93.8% of previously assessed species (6,112 of 6,520 species). The 6.2% of species that were misclassified (408 of 6,520 species) were nearly twice as likely to be assigned to nonthreatened categories as to shift in the opposite direction, and generally shifted to less threatened specific categories ( S2 Fig ). This was consistent in most biogeographical realms, except in the Nearctic and Neotropical realms, in which the numbers were similar for the binary classification ( S2 Fig ). Across different classification tasks and extent of occurrence classes, the feature classes with the highest average importance rank in the complete model were: (1) spatial autocorrelation; (2) assessor effects; (3) phylogenetic autocorrelation; (4) climate; and (5) human encroachment.
In the model excluding assessor/reviewer effects, the ranking was: (1) spatial autocorrelation; (2) phylogenetic autocorrelation; (3) climate; (4) human encroachment; and (5) insularity (for full details on feature importance across models, see S1 Fig and S2 Table ; for a list of variables in each category, see S1 Data ). The hyperparameter configuration for the model chosen for predictions is summarized in S3 Table . The features selected for each combination of range size (calculated as extent of occurrence) class and classification task are provided in S1 Data . The contribution of each feature class to predictive performance for each combination of range size class and classification task is presented in S1 Fig . The model we used to predict extinction risk for DD and NE species, which included spatial and phylogenetic autocorrelation and excluded assessor/reviewer effects, achieved 90% validated accuracy for the binary threatened/nonthreatened classification and 84% accuracy for predicting specific categories (area under the curve, AUC: 0.83, Tables 1 and 2 ). The complete model, including spatial and phylogenetic autocorrelation, and assessor/reviewer effects, achieved similar results, as did the model excluding spatial and phylogenetic autocorrelation but retaining assessor/reviewer effects ( Table 1 ). The model excluding both autocorrelations and assessor/reviewer effects, and the models including either spatial or phylogenetic autocorrelation, were less accurate ( Table 1 ). However, the model obtained the highest accuracies when excluding threatened species classified under criteria other than B from the training dataset ( Table 1 ; details below). We predicted extinction risk categories for DD and NE species using the model that excluded assessor/reviewer effects but retained spatial and phylogenetic data, since we cannot know the identity of assessors who will evaluate currently unassessed species.
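The binary and specific-category accuracies reported above are linked: collapsing categories into threatened (VU, EN, CR) and nonthreatened (LC, NT) erases any error that stays on the same side of that threshold, so binary accuracy can never be lower than specific accuracy on the same predictions. The sketch below shows the relationship on made-up labels; the ten species and their categories are illustrative assumptions, not data from this study.

```python
THREATENED = {"VU", "EN", "CR"}  # LC and NT count as nonthreatened


def accuracy(observed, predicted):
    """Share of species whose predicted label equals the observed one."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def to_binary(categories):
    """Collapse specific categories into threatened (True) or not (False)."""
    return [c in THREATENED for c in categories]


# Made-up labels for ten species (illustrative only, not paper data).
observed = ["LC", "LC", "NT", "VU", "EN", "CR", "LC", "NT", "VU", "EN"]
predicted = ["LC", "NT", "LC", "EN", "EN", "CR", "LC", "NT", "NT", "EN"]

specific_acc = accuracy(observed, predicted)
binary_acc = accuracy(to_binary(observed), to_binary(predicted))
```

In this toy case, confusions between LC and NT or between VU and EN lower the specific accuracy but leave the binary accuracy untouched; only the VU species predicted as NT counts as a binary error.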
For analyses regarding potential assessor/reviewer effects, we used the complete model. Detailed accuracy metrics are presented in Table 2 . The lowest accuracy across models was in separating the NT and LC categories ( Table 2 ). We implemented a novel automated assessment method, using the XGBoost algorithm [ 29 ], and provided provisional assessments for 4,369 reptile species that were previously NE or assessed as DD ( S1 Data ). Of these 4,369 species, we assessed 1,161 (27%) as threatened (244 as CR, 467 as EN, and 450 as VU), and 3,208 as non-threatened (3,021 as LC and 187 as NT). This is compared to 21% threatened species in the assessed/training dataset (1,375 of 6,520, χ 2 : 26.947, p-value: <0.001).

Discussion

Our model assigned IUCN extinction risk categories to the 40% of the world’s reptiles that currently lack published assessments or are classified as DD. Our novel modeling approach enabled classifying specific extinction risk categories with high accuracy using only readily available data (ranges and body sizes). Our methods also achieved better accuracy than previously explored methods (S5 Table). We predicted that the prevalence of threatened reptile species is significantly higher than currently depicted by IUCN assessments. This pattern is widespread across space and phylogeny. Our results show that, while high prediction accuracy can be achieved without explicitly accounting for assessor/reviewer identities, the identity of assessor/reviewers greatly affects predictions.

General model results

The classification accuracy of more extreme categories (CR, EN, and LC) was higher than categories straddling the threatened/nonthreatened threshold (VU and NT; S1 Table). This likely reflects ambiguities inherent to the assessment of borderline cases, while extreme cases are easier to identify.
This is compounded in the category it proved hardest to predict (NT), as there are no distinct quantitative thresholds for NT as there are for threatened categories (although guidance is given by the IUCN on how NT should be assessed [7]). Such thresholds are a primary factor for assigning criterion B extinction risk designations (and for our modeling). Misclassifications of assessed species tended toward less threatened categories (S2 Fig) indicating that our predictions of unassessed species may actually be more optimistic than the true state of extinction risk for reptiles. Machine learning methods, such as XGBoost, are geared primarily toward prediction not inference [33]. Any ecological interpretation of feature importance should thus be taken with caution. The greater importance of spatial and phylogenetic eigenvectors in our classification tasks (S1 Fig, S2 Table) is most likely due to the greater number of features included in these categories. Nevertheless, this shows that extinction risk has highly predictable spatial and phylogenetic patterns, i.e., that some regions and some taxa are more prone to extinction than others. This can be used to approximate the conservation status of less studied taxa, for which no other information is available. The climatic and human encroachment variables obtained high importance scores. A previous meta-analysis found widespread negative effects of human land modification on reptile abundance but no effect of climate [34]. This discrepancy could be due to climate acting as proxy for other highly spatially autocorrelated factors. Insularity was also important in many of the classification tasks in agreement with previous studies that identified it as a major contributor to extinction vulnerability in reptiles [35]. Range size, another major correlate of extinction risk, did not rank high in our models, likely due to it already being used as an a priori criterion to separate species before training models. 
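The caution above, that the high importance of spatial and phylogenetic eigenvectors may reflect how many features those classes contain, can be shown with a small worked example. The scores and feature names below are invented for illustration and are not taken from S1 Fig or S2 Table; the point is only that ranking feature classes by summed importance and by mean importance per feature can disagree.

```python
from collections import defaultdict

# Hypothetical per-feature importance scores (made up, not from S2 Table).
importance = {
    ("spatial", "eig_1"): 0.10, ("spatial", "eig_2"): 0.10,
    ("spatial", "eig_3"): 0.10, ("spatial", "eig_4"): 0.10,
    ("climate", "temp"): 0.25,
    ("human", "footprint"): 0.20, ("human", "cropland"): 0.15,
}

totals, counts = defaultdict(float), defaultdict(int)
for (feature_class, _name), score in importance.items():
    totals[feature_class] += score
    counts[feature_class] += 1

# Rank classes by summed importance, then by mean importance per feature.
by_sum = sorted(totals, key=totals.get, reverse=True)
by_mean = sorted(totals, key=lambda c: totals[c] / counts[c], reverse=True)
```

Here the four-feature spatial class ranks first by sum and last by mean, which is why class-level rankings are easier to interpret when the number of features in each class is reported alongside them.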
Future studies should expand on the mechanisms underlying the spatial and phylogenetic patterns in extinction risk identified in this study. Nine species classified as CR by IUCN were considered LC by our model. Some of these have fragmented ranges (Spondylurus lineolatus, Liolaemus azarai, and Emoia slevini), which might have caused our model to underestimate their extinction risk. Our models used extent of occurrence as a proxy of range size, which can greatly differ from area of occupancy in species with fragmented ranges. Thus, species evaluated under area of occupancy criteria might be harder to capture in our model. Small and fragmented ranges can also be more unstable, which might result in discrepancies between the datasets used to train the model. GARD range data represents historical ranges, including parts of the range from which populations may have been extirpated. This might cause some of the discrepancies observed. For example, the GARD database includes range fragments of S. lineolatus that are classified as possibly extinct in the IUCN database. Other species classified as less threatened by the model suffer from threats such as invasive species (Liolaemus paulinae and Cyrtodactylus jarakensis), quarrying (Homonota taragui and Cyrtodactylus guakanthanensis), tourism (Calamaria ingeri), and fires (Bellatorias obiri), which are not accounted for in our modeling. Although some of the human encroachment features included might act as proxies for such threats, some local stressors will escape this approximation. Four species (Tropidophis xanthogaster, Cubatyphlops perimychus, Celestus marcanoi, and Chioninia spinalis) were classified as LC by IUCN, but as CR by our model. All are small ranged species located in protected areas. Protected area effects, and local population dynamics may not have been captured by our model in rare cases, leading to occasional overestimation of threat. 
Alternatively, actual assessments may have been inconsistent with most of the Red List. These are poorly known species, their IUCN assessments read: “while threats have been identified, these are presently localized” (T. xanthogaster); “the limited information available indicates that it is able to adapt at least to certain forms of disturbance” (C. perimychus); “there is no information about its population… Further research into its distribution, abundance, and population trends should be carried out to have more knowledge about how the threats are impacting the species” (C. marcanoi). This lack of information opens room for the introduction of biases, such as overly optimistic assessors overlooking important threats. All 4 species classified as LC by IUCN and CR by our model have extremely restricted ranges and are endemic to islands with high proportion of threatened species. Thus, we suggest these species may be more threatened than currently depicted in the Red List and would benefit from reassessment. Similar attention should be given to all species that moved to a more threatened category in our assessment (S1 Data). We recommend a strong precautionary approach in translating such disparities into conservation action. Other than differences in range sizes between GARD and IUCN datasets, misclassifications of species as less threatened than assessed by the IUCN may be due to species meeting Red List criteria other than B, as their exclusion led to higher model accuracy. These criteria are mostly based on data on population sizes and trends, which are unavailable for most reptile species. Population dynamics are difficult to approximate using remotely sensed predictors [36] such as the ones used in most automated assessment methods. Excluding species classified as threatened under non-B criteria from model training caused their extinction risk to be severely underestimated (S4 Table). 
This highlights that the inclusion of population size and trend data in the model can only increase the level of predicted extinction risk compared to the result expected under criterion B only, mimicking the IUCN assessment process. Nevertheless, most of our modeled classifications (for assessed species) are the same as the IUCN ones (94%, 6,112 of 6,520). The modeled assessments we obtained can be used to identify priorities for assessment of NE species, with species estimated to be at higher risk requiring more urgent assessment. Likewise, previously assessed species, which our method identified as being at higher extinction risk than their current IUCN category indicates, should be priority candidates for reassessment [25], especially in the case of species previously categorized as DD, as their current assessment does not allow their prioritization in conservation efforts. A major obstacle for the implementation of correlative automated assessment methods, such as the one we present, is the lack of explicit parameters to justify the assessment under existing criteria [25]. To overcome this obstacle, we propose the IUCN consider the creation of a parallel listing for automated assessments, to be displayed alongside IUCN assessments with clear indication of the provisional, modeled, status of the assessment. We recognize that the creation of this new feature is not a simple endeavor but suggest it could be highly beneficial for the IUCN Red List. As automated methods become more easily available and precise, they offer an opportunity that should not be ignored for advancing the conservation of neglected (or newly described [37]) taxa and regions. Moreover, our provisional assessments and method can be used in regional red lists, which have more flexible guidelines. We applied our methods to all DD and NE reptiles globally. In practice, our method can also be applied to regional- and country-level assessments. 
This is the scale at which national red lists, which support many country-level conservation decisions, are made [38]. Nevertheless, in some regions, challenges, such as lack of resources or standardized methods for regional assessments, are especially salient [39]. Provisional assessments provided by automated methods such as ours can also be used to inform conservation policy and action on DD and NE species, which are currently often given little weight, if any. We recommend that the use of these provisional categories in conservation be aligned with expert input, especially for species in borderline categories (VU and NT), for which the automated assessment was less reliable.

Predictions for data deficient and not evaluated species

Our results suggest DD species are more likely to be threatened than categorized species, adding to growing evidence in that regard [8,14,17,40–42], but unlike previous automated assessments for reptiles [19,28]. However, it is important to note that previous assessments have drawn on different datasets, both with respect to predictors used and level of extinction risk, as range maps and extinction risk categories have since been updated. We further found that NE reptiles (similar to DD species) are more likely to be threatened than categorized species, supporting the urgency of previous calls for a comprehensive reptile assessment [9]. Our method relies on extent of occurrence maps, which were used as a hierarchical classifier in modeling. Non-DD-assessed species have an extent of occurrence that is 16% larger, on average, than DD and NE species (F-value: 6.93, p-value: 0.009). For NE species this may be caused by them being recently described (i.e., later than a workshop on the fauna of the area they inhabit was conducted) and thus having small extent of occurrence. Taxonomic revision resulting in species splits will also give rise to NE species with small extents of occurrence.
With such alarmingly high levels of predicted threat, we recommend that decision-makers take a cautious stance and assign DD and NE species similar priority as threatened species, unless evidence to the contrary is available (e.g., having been assigned a nonthreatened category by an automated assessment). DD species may have incomplete distribution records or suffer from taxonomic uncertainties (although only 69 of the 1,083 DD species examined here were classified as such due to taxonomic uncertainty), which might cause their ranges to be underestimated. On the other hand, many truly rare and small-ranged species lack information to be assigned an extinction risk category. It is useful to provide DD species with provisional assessments because they often cannot be included in conservation prioritization [42]. Thus, it is safer to assume that DD species indeed have the ranges from which they are presently known, rather than risking leaving very threatened species in an unprioritizable category [8].

Phylogenetic and spatial patterns

Our results revealed an overall decrease in the proportion of threatened turtle species after the addition of our predictions for DD and NE species (Fig 2). This could be due to the more complete assessment of turtles than of squamates. Data on population sizes and trends are much more readily available for testudines than for squamates [43]. Only 19% of squamates were classified as threatened based (at least in part) on criteria other than B, compared to 83% of turtles. The proportion of threatened species tended to increase in some squamate groups, especially in small, fossorial, rare, and endemic taxa (Fig 2, S9 Table), which is consistent with previously reported patterns of data deficiency [9], or possibly caused by underestimation of their ranges. Our method is thus better suited for data-poor clades than for extremely data-rich ones.
The latter have already been assessed or are easy to assess, but the former comprise most of global biodiversity. Thus, our method could be especially useful for other data-poor and underassessed groups, such as most invertebrate clades. Our results suggest that the world’s unknown and rich biodiversity is at even greater risk than previously perceived. This finding adds to accumulating evidence that geographical and phylogenetic patterns of extinction risk and knowledge gaps are mostly congruent [10]. We further found that the proportion of threatened species increases in most ecoregions in the Americas, Australia, and Madagascar but decreases in most of Africa and Eurasia. This could be driven by a taxonomic effect, as many of the families predicted to increase in proportion of threatened species are especially diverse in the Americas, Australia, and Madagascar (e.g., Dactyloidae, Diplodactylidae, Dipsadidae, Elapidae, Phrynosomatidae, and Scincidae; Fig 2). Assessments of regions and taxa we identified as likely to be more threatened should be given increased attention in new assessments and conservation planning. [END] --- [1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001544 Published and (C) by PLOS One Content appears here under this condition or license: Creative Commons - Attribution BY 4.0. via Magical.Fish Gopher News Feeds: gopher://magical.fish/1/feeds/news/plosone/