(C) PLOS One This story was originally published by PLOS One and is unaltered.

What constrains food webs? A maximum entropy framework for predicting their structure with minimal biases [1]

['Francis Banville', 'Département De Sciences Biologiques', 'Université De Montréal', 'Montreal', 'Quebec', 'Département De Biologie', 'Université De Sherbrooke', 'Sherbrooke', 'Quebec Centre For Biodiversity Science', 'Dominique Gravel']

Date: 2023-10

Regarding the degree distribution of maximum entropy, one aspect that informs us of its ecological realism is the number of isolated species it predicts. As [11] pointed out, a food web should contain at least S − 1 interactions, since a lower number would yield isolated species, i.e. species without any predator or prey. Because non-basal species must eat to survive, isolated species could indicate that other species are missing; otherwise, isolated species should be removed from the network. In S3 Fig, we show that the degree distribution of maximum entropy, given S and L, gives a very low probability that a species will be isolated in its food web (i.e. having k = 0) when L > S − 1. However, under our purely information-theoretic model, the probability that a species is isolated is quite high when the total number of interactions is below S − 1. Moreover, the expected proportion of isolated species rapidly declines by orders of magnitude with increasing numbers of species and interactions. This supports the ecological realism of the degree distribution of maximum entropy derived above. Nevertheless, ecologists wanting to model a system without allowing isolated species could simply change the lower limit of k to 1 in Eq (16) and solve the resulting equation numerically.

(a) Probability density of KL divergence between in and out-degree sequences of empirical and predicted joint degree sequences. (b) Difference between the KL divergence of empirical and predicted joint degree sequences as a function of connectance. The predicted joint degree sequences were obtained after sampling one realization of the joint degree distribution of maximum entropy for each network while keeping the total number of interactions constant.

Another way to evaluate the empirical support of the sampled joint degree sequences is to compare their shape with that of empirical food webs. We described the shape of a joint degree sequence by measuring the distance between its in and out-degree sequences (i.e. the distance between its marginal distributions). To do so, we calculated the Kullback–Leibler (KL) divergence [48] between the in and out-degree sequences of each predicted and empirical distribution. The KL divergence is a measure of relative entropy describing the difference between two distributions. Low values indicate high similarity between the in and out-degree sequences and suggest that the joint degree sequence has a high level of symmetry. We compared the shape of the empirical and predicted joint degree sequences in the left panel of Fig 3. As expected, our model predicts more similar in-degree and out-degree sequences than empirical data (shown by lower KL divergence values). However, the difference between the KL divergence of predicted and empirical joint degree sequences decreases with connectance (right panel of Fig 3). This might be because food webs with a low connectance are harder to predict than food webs with a high connectance. Indeed, in low connectance systems, what makes two species interact may be more important for prediction than in high connectance systems, in which what prevents species from interacting may be more meaningful. This implies that more ecological information may be needed in food webs with a low connectance because more ecological processes determine interactions compared to non-interactions.
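The shape measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `kl_in_out`, the use of histograms of degree values as the two marginal distributions, and the small constant `eps` added to avoid division by zero are all choices made here for the example.

```python
import numpy as np

def kl_in_out(k_in, k_out, eps=1e-9):
    """KL divergence between the in- and out-degree distributions of one network.

    k_in, k_out: integer degree sequences (one value per species).
    eps: hypothetical smoothing constant so that empty bins do not give log(0).
    """
    k_in = np.asarray(k_in, dtype=int)
    k_out = np.asarray(k_out, dtype=int)
    n_bins = max(k_in.max(), k_out.max()) + 1
    # Histogram of degree values (how many species have degree 0, 1, 2, ...)
    p = np.bincount(k_in, minlength=n_bins) + eps
    q = np.bincount(k_out, minlength=n_bins) + eps
    # Normalize the counts into probability distributions
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```

A value near 0 means that the in and out-degree distributions are nearly identical (a symmetric joint degree sequence); larger values indicate stronger asymmetry.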
Other ecological constraints might therefore be needed to account for the asymmetry of the joint degree distribution, especially for networks with a lower connectance. Nevertheless, our MaxEnt model seems to capture quite well the shape of the joint degree sequence for networks having a high connectance.

Examining the difference between predicted and empirical values for each species gives a slightly different perspective (right panel of Fig 2). To make that comparison, we must first associate each of our predictions with a specific species in a network. Indeed, our predicted joint degree sequences have the same number of species (elements) as their empirical counterparts, but they are species agnostic. In other words, instead of predicting a pair of values for each species directly (i.e. the number of prey and predators of a given species i), we predicted the entire joint degree sequence without taking into account species’ identity (i.e. the distribution of the number of prey and predators for the entire set of species, without knowing which values belong to which species). The challenge is thus to adequately associate predictions with empirical data. In Fig 2, we present these differences when species are ordered by their total degree in their respective networks (i.e. by the sum of their in and out-degrees). This means that the species with the highest total degree in its network will be associated with the highest prediction, and so forth. Doing so, we see that species predicted to have a higher number of predators than what is observed generally have a lower number of prey than what is observed (and conversely). This is also shown in S1 Fig, which represents the relationship between prediction errors in the absolute (non-relative) values of k_out and k_in across networks of varying levels of species richness.
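The rank-based association of predictions with species can be illustrated with a short sketch. The function name `match_by_total_degree` and the two-column layout (k_out, k_in) are assumptions made for this example, and absolute rather than relative degrees are used for simplicity.

```python
import numpy as np

def match_by_total_degree(empirical, predicted):
    """Pair predicted and empirical species by rank of total degree.

    empirical, predicted: arrays of shape (S, 2), columns (k_out, k_in).
    Returns predicted minus empirical values, highest total degree first.
    """
    empirical = np.asarray(empirical, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Order each sequence by decreasing total degree (stable sort keeps ties in input order)
    emp_sorted = empirical[np.argsort(-empirical.sum(axis=1), kind="stable")]
    pred_sorted = predicted[np.argsort(-predicted.sum(axis=1), kind="stable")]
    return pred_sorted - emp_sorted

# Toy three-species example (values are illustrative, not data from the study)
errors = match_by_total_degree([[1, 0], [3, 2], [0, 1]],
                               [[2, 3], [0, 1], [1, 1]])
```

In this toy example the highest-ranked species has one prey too few and one predator too many, so its total degree error is zero even though both components are off.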
These opposite errors arise because the difference in total degree (k_out + k_in) between predictions and empirical data is minimized when species are ranked by their total degree (i.e. the average deviation of the sum of relative k_out and k_in is close to 0 across all species). This result thus shows that the difference between predicted and empirical total degrees is low for most species when ordered by their total degrees. There are no apparent biases towards in or out degrees. In S2 Fig, we show how these differences change when species are instead ordered by their out-degrees (left panel) and in-degrees (right panel), i.e. when minimizing the error in the estimation of the out and in-degrees, respectively.

The relative number of predators (k_in) is plotted against the relative number of prey (k_out) for each species in all (a) empirical and (b) predicted joint degree sequences. The predicted joint degree sequences were obtained after sampling one realization of the joint degree distribution of maximum entropy for each network while keeping the total number of interactions constant. (c) Difference between predicted and empirical values when species are ordered according to their total degree. Due to significant data overlap, all relationships are represented as 2D histograms. The color bar indicates the number of species that fall within each bin.

We first discuss the predictive capacity of our analytical models. The relationship between the relative numbers of prey k_out and predators k_in in empirical networks and obtained from the joint degree distributions of maximum entropy is depicted in the left and central panels of Fig 2, respectively. We observe that our analytical model predicts higher values of generality and vulnerability compared to empirical food webs (i.e. relative values of k_out and k_in both closer to 1) for many species.
In other words, our model predicts that species that have many predators also have more prey than what is observed empirically (and conversely). This is not surprising, given that our model did not include biological factors preventing generalist predators from having many prey. Nevertheless, with the exception of these generalist species, MaxEnt adequately predicts that most species have low generality and vulnerability values.

Heuristic maximum entropy models

In this section, we explore the predictions of our heuristic models. Overall, we found that the models based on the joint degree sequence (i.e. the type II null and heuristic MaxEnt models) reproduced the structure of empirical food webs much better than the ones based on connectance (i.e. the type I null and heuristic MaxEnt models, Table 1). This suggests that the predictive capacity of connectance might be more limited than what was previously suggested [10]. On the other hand, the neutral model of relative abundances was surprisingly good at predicting the maximum trophic level and the network diameter (Table 2). However, with the exception of the network diameter, the type II heuristic MaxEnt model was better at predicting network structure than the neutral model for most measures considered. This might be because, although neutral processes are important, they act in concert with niche processes in determining species interactions [49–52]. The joint degree sequence captures information on both neutral and niche processes because the number of prey and predators a species has is determined by its relative abundance and biological traits. These results thus show that having information on the number of prey and predators for each species substantially improves the prediction of food-web structure, both compared to models solely based on connectance and to the ones solely based on species relative abundances.
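To make the contrast between connectance-based and degree-based null models concrete, the sketch below uses formulations that are common in the network ecology literature. They are assumed here for illustration and are not taken from this section: a type I model assigns every potential interaction the same probability L / S², and a type II model averages the relative out-degree of the consumer and the relative in-degree of the resource. The function names and the row-equals-consumer convention are also assumptions.

```python
import numpy as np

def type1_probabilities(S, L):
    """Connectance-based null model: every pair has probability L / S^2."""
    return np.full((S, S), L / S**2)

def type2_probabilities(k_out, k_in):
    """Degree-based null model (assumed form).

    P[i, j] = (k_out[i] / S + k_in[j] / S) / 2, where row i is the consumer,
    k_out[i] its number of prey and k_in[j] the number of predators of j.
    """
    k_out = np.asarray(k_out, dtype=float)
    k_in = np.asarray(k_in, dtype=float)
    S = len(k_out)
    return 0.5 * (k_out[:, None] / S + k_in[None, :] / S)

def draw_network(P, seed=None):
    """Draw one binary network with independent Bernoulli trials per pair."""
    rng = np.random.default_rng(seed)
    return (rng.random(P.shape) < P).astype(int)

P1 = type1_probabilities(4, 4)
P2 = type2_probabilities([2, 0], [0, 2])
A = draw_network(P2, seed=0)
```

Under both formulations the probabilities sum to the number of interactions L, so the two models share the same expected connectance and differ only in how interactions are distributed among species.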
Next, the predictions of the type II heuristic MaxEnt model can be compared to its null model counterpart. On average, the type II heuristic MaxEnt model was better at predicting nestedness (0.62 ± 0.08) than its corresponding null model (0.73 ± 0.05; empirical networks: 0.63 ± 0.09) for networks in our complete dataset (Table 1). This might in part be due to the fact that nestedness was calculated using the spectral radius of the adjacency matrix, which directly leverages information on the network itself just like the heuristic MaxEnt model. The proportion of self-loops (cannibal species) was also better predicted by the type II heuristic MaxEnt model in comparison to the type II null model. However, the type II null model was better at predicting network diameter and average maximum similarity between species pairs, and predictions of the maximum trophic level and the proportion of omnivorous species were similar between both types of models. We believe that this is because increasing the complexity of a food web might increase its average and maximum food-chain lengths. In comparison, the null model is more stochastic and does not necessarily produce more complex food webs with longer food-chain lengths.

Moreover, we found that the entropy of empirical food webs was slightly lower than their maximum entropy when constrained by their joint degree sequence (S4 Fig). Empirical food webs had an SVD entropy of 0.89 ± 0.04, compared to an SVD entropy of 0.94 ± 0.03 for networks generated using the type II heuristic MaxEnt model. The relationship between the SVD entropy of empirical food webs and their maximum entropy is plotted in the last panel of Fig 4. The slight increase in entropy confirms that our method generated more complex networks.
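For readers unfamiliar with SVD entropy, the following sketch shows one common way to compute it. The definition assumed here (Shannon entropy of the normalized non-zero singular values of the adjacency matrix, divided by the logarithm of their number so that values fall between 0 and 1) is not stated in this section, and the function name and tolerance `tol` are hypothetical.

```python
import numpy as np

def svd_entropy(A, tol=1e-10):
    """Normalized entropy of the singular values of an adjacency matrix.

    A: square adjacency matrix of the food web.
    tol: hypothetical cutoff below which singular values are treated as zero.
    """
    sigma = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    sigma = sigma[sigma > tol]          # keep non-zero singular values only
    if len(sigma) < 2:
        return 0.0                      # a rank-0 or rank-1 matrix has no spread
    s = sigma / sigma.sum()             # relative singular values
    return float(-np.sum(s * np.log(s)) / np.log(len(s)))
```

Under this definition, a matrix whose singular values are all equal reaches the maximum of 1, whereas a matrix dominated by a single singular value approaches 0.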
Even though we found that many measures of empirical networks are close to the ones of their maximum entropy configuration, the relatively low predictability of entropy itself may be indicative of additional constraints shaping food-web structure, especially for networks with low SVD entropy. Incorporating more constraints into the model could increase its capacity to generate networks with an adequate level of complexity, as shown by the decrease in predictive errors of entropy of the type II heuristic MaxEnt model compared to the one based on connectance (Table 1). Additionally, we found no clear relationship between the increase in SVD entropy and the number of species, the number of interactions, and connectance (S5 Fig). This suggests that our model captured the complexity of small and large networks on a similar level and that its capacity to reproduce food-web structure was unrelated to the order and size of the network. In other words, the gap in entropy between empirical food webs and their maximum entropy configuration may be the result of additional constraints that were not taken into account in the model, regardless of the number of species and the number of interactions.

Fig 4. Relationship between the structure of empirical and maximum entropy food webs. Maximum entropy networks were obtained using the type II heuristic MaxEnt model based on the joint degree sequence. (a) Nestedness (estimated using the spectral radius of the adjacency matrix), (b) the maximum trophic level, (c) the network diameter, and (d) the SVD entropy were measured on these empirical and maximum entropy food webs. The identity line is plotted in each panel. https://doi.org/10.1371/journal.pcbi.1011458.g004

A direct comparison of the structure of maximum entropy food webs constrained by the joint degree sequence with empirical data also supports the results depicted in Table 1.
In Fig 4, we show how well empirical measures are predicted by the type II heuristic MaxEnt model. Following our previous results, we found that nestedness was very well predicted by our model. However, the model overestimated the maximum trophic level and network diameter, especially when the sampled food web had intermediate values of these measures. In S6 Fig, we show that the pairwise relationships between the four measures in Fig 4 and species richness in empirical food webs are similar (in magnitude and sign) to the ones found in food webs generated using the type II heuristic MaxEnt model. This indicates that the number of species in the network does not seem to impact the ability of the model to reproduce food-web structure.

Notwithstanding its difficulties in reproducing adequate measures of food-chain lengths, the type II heuristic MaxEnt model predicts the proportions of three-species motifs in empirical food webs surprisingly well. Motifs have been shown to be the backbone of complex ecological networks on which network structure is built and play a crucial role in community dynamics and assembly [53]. Differences in motif profiles between an observed food web and null model-generated ones can unveil important ecological mechanisms that contribute to network structure [46]. In Fig 5, we show that the motif profile of networks generated using the type II heuristic MaxEnt model accurately reproduced the one of empirical data. This model made significantly better predictions than the ones based on connectance and the type II null model based on the joint degree sequence. This is also shown in Fig 6, which reveals that the relationships between the proportions of single-link motifs in empirical food webs are similar to the ones in networks generated using the type II heuristic MaxEnt model. This is in contrast with the type I null and MaxEnt models based on connectance, which produced relationships opposite to those observed empirically.
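A motif profile of the kind compared above can be obtained by classifying every three-species subnetwork. The sketch below is a brute-force illustration, not the authors' code: the function names, the convention that `A[i, j] = 1` means species i eats species j, and the decision to ignore self-loops and skip triplets containing double links are assumptions made for this example.

```python
from itertools import combinations
import numpy as np

def motif_counts(A):
    """Count single-link three-species motifs S1 to S5 in a directed food web."""
    A = (np.asarray(A) != 0).astype(int)
    np.fill_diagonal(A, 0)                      # ignore cannibalism
    counts = {m: 0 for m in ("S1", "S2", "S3", "S4", "S5")}
    for trio in combinations(range(A.shape[0]), 3):
        sub = A[np.ix_(trio, trio)]
        if np.any(sub & sub.T):                 # double link: not a single-link motif
            continue
        n_links = sub.sum()
        out_deg = sub.sum(axis=1)               # number of prey within the trio
        in_deg = sub.sum(axis=0)                # number of predators within the trio
        if n_links == 2:
            if out_deg.max() == 2:
                counts["S4"] += 1               # one predator, two prey
            elif in_deg.max() == 2:
                counts["S5"] += 1               # two predators, one prey
            else:
                counts["S1"] += 1               # tri-trophic chain
        elif n_links == 3:
            if out_deg.max() == 2:
                counts["S2"] += 1               # omnivory
            else:
                counts["S3"] += 1               # feeding loop
    return counts

def motif_proportions(A):
    """Convert motif counts into proportions (all zeros if no motif is found)."""
    counts = motif_counts(A)
    total = sum(counts.values())
    return {m: (c / total if total else 0.0) for m, c in counts.items()}
```

Enumerating all triplets scales with the cube of the number of species, which is acceptable for networks of a few hundred species but would need a more efficient algorithm for larger ones.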
Our findings show that generating the most complex food web constrained by the joint degree sequence using maximum entropy does not alter the proportions of three-species motifs on the whole. This suggests that motif profiles may simply be a statistical attribute of food webs driven by the joint degree sequence. However, given the incapacity of our MaxEnt models to accurately predict food-chain lengths, the way motifs interconnect with each other may hold greater biological significance than the proportion of motifs itself.

Fig 5. Proportions of single-link three-species motifs in empirical and predicted food webs. S1: Tri-trophic chain (a top predator feeds on a meso-predator which feeds on a basal prey). S2: Omnivory (a top predator feeds on a meso-predator and a basal prey). S3: Tri-trophic feeding loop (a cyclic three-species predator-prey system). S4: Apparent competition (a predator feeds on two prey). S5: Exploitative competition (two predators feed on the same prey). Null 1: Type I null model based on connectance. MaxEnt 1: Type I heuristic MaxEnt model based on connectance. Null 2: Type II null model based on the joint degree sequence. MaxEnt 2: Type II heuristic MaxEnt model based on the joint degree sequence. Boxplots display the median proportion of each motif in food webs (middle horizontal lines), as well as the first (bottom horizontal lines) and third (top horizontal lines) quartiles. Vertical lines encompass all data points that fall within 1.5 times the interquartile range from both quartiles, and dots are data points that fall outside this range. Only the single-link motifs S1-S5 are shown given the scarcity of double-link motifs in most empirical and predicted networks. https://doi.org/10.1371/journal.pcbi.1011458.g005

Fig 6. Pairwise relationships between the proportions of single-link three-species motifs in empirical and predicted food webs. S1: Tri-trophic chain. S2: Omnivory. S4: Apparent competition. S5: Exploitative competition. Null 1: Type I null model based on connectance. MaxEnt 1: Type I heuristic MaxEnt model based on connectance. Null 2: Type II null model based on the joint degree sequence. MaxEnt 2: Type II heuristic MaxEnt model based on the joint degree sequence. Regression lines are plotted in each panel. Motif S3 is not shown because of its low proportion in most empirical and predicted networks. https://doi.org/10.1371/journal.pcbi.1011458.g006

One of the challenges in implementing and validating a maximum entropy model is to discover where its predictions break down. The results depicted in Table 1 and Fig 4 show that our type II heuristic MaxEnt model can capture many high-level properties of food webs, but does a poor job of capturing others. This suggests that, although the joint degree sequence is an important driver of food-web structure, other ecological constraints might be needed to account for some emerging food-web properties, especially the ones regarding food-chain lengths. Nevertheless, Figs 5 and 6 show that this model reproduces motif profiles, one of the most ecologically informative properties of food webs, surprisingly well. This suggests that the emerging structure of food webs is mainly driven by their joint degree sequence, although higher-level properties might need to be included in the model to ensure that food-chain lengths fall within realistic values.

[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011458

Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.