TITLE: Using R to locate spatial data points inside map polygons
DATE: 2017-09-27
AUTHOR: John L. Godlee
====================================================================

I was looking into a paper called "Determinants of woody cover in African savannas" by Sankaran et al. (2005). The paper looks at the large-scale environmental factors that affect percentage woody cover in African savanna landscapes. One figure in particular that got me interested was this one:

![Mean annual Precipitation vs. woody cover](https://johngodlee.xyz/img_full/sankaran/map_wood.png)

The figure shows that savannas with Mean Annual Precipitation (MAP) values less than ~650 mm have their upper woody cover potential limited by precipitation, but above that threshold an increase in MAP doesn't increase the maximum potential woody cover. It also shows that lots of sites have woody cover below their MAP-limited maximum, pointing to other limiting factors such as fire, herbivory and soil characteristics.

I haven't fully worked out the implications of this figure yet, but what stands out to me the most is that many plots in high rainfall areas with low woody cover are classed as 'arid fertile savanna' by White's vegetation classification. Secondly, moist infertile savanna seems to straddle the saturation point of the MAP-limited woody cover.

Looking into this graph more, I wanted to see whether there were any biogeographic patterns that could be drawn from the data. Were all the sites with particularly low actual woody cover from a particular vegetation type, for example? To do this I compared the data from Sankaran et al. (2005), which is publicly available as supplementary information, to White's seminal vegetation classification map of 1983, which I accessed as shapefiles from here.

I did this analysis in R, so all the code below is for R. You can find the code and data that I used by cloning this github repo.

First, I loaded the packages and the data:

    # Set working directory to the location of the source file ----
    setwd(dirname(rstudioapi::getActiveDocumentContext()$path))

    # Packages ----
    library(ggplot2)
    library(dplyr)
    library(rgeos)
    library(rgdal)

    # Import data ----
    ## sankaran data
    cover <- read.csv("data/sankaran_2005_data.csv")
    str(cover)

    ## white 1983 veg data
    white_veg <- readOGR(dsn = "data/whitesveg", layer = "Whites vegetation")

    ## Country outline
    countries <- readOGR(dsn = "data/africa", layer = "Africa")

Next, I can have a go at plotting White's map:

    # Plot White's veg map data ----
    ## Fortify country outline for ggplot
    countries@data
    countries_fort <- fortify(countries, region = "COUNTRY")

    ## Exploring whiteveg
    white_veg@data
    white_veg@polygons[[1]]
    white_veg@proj4string
    white_veg@bbox
    white_veg@plotOrder

    ## Fortify white shape file for ggplot2
    white_veg_fort <- fortify(white_veg, region = "DESCRIPTIO")
    names(white_veg_fort)
    length(unique(white_veg_fort$id))

    ## Create colour palette for ggplot2
    palette_veg_type_19 <- c("#FF4A46","#008941","#006FA6","#A30059","#FFDBE5",
                             "#7A4900","#0000A6","#63FFAC","#B79762","#004D43",
                             "#8FB0FF","#997D87","#5A0007","#809693","#FEFFE6",
                             "#1B4400","#4FC601","#3B5DFF","#4A3B53")

    ## ggplot
    ## fill = NA sits outside aes() so the country outlines stay unfilled
    ## coord_map() needs the mapproj package to be installed
    ggplot() +
      geom_polygon(aes(x = long, y = lat, group = group, fill = id),
                   data = white_veg_fort) +
      geom_polygon(aes(x = long, y = lat, group = group),
                   fill = NA, colour = "black",
                   data = countries_fort) +
      theme_classic() +
      scale_fill_manual(values = palette_veg_type_19) +
      labs(fill = "Biome") +
      xlab("Longitude") +
      ylab("Latitude") +
      coord_map()

Then I had to convert the data from Sankaran et al. into a SpatialPoints object so I can use it in future analyses.

The coordinate columns need to be in longitude, latitude order, and the points need the same projection as White's map:

    # Create a data frame with only the latitude and longitude data
    # Important for later on to have lon then lat as columns
    cover_coords <- cover %>%
      select(lon_dec_deg, lat_dec_deg)

    # Convert to SpatialPoints object, borrowing the projection of White's map
    cover_spoints <- SpatialPoints(cover_coords,
                                   proj4string = CRS(proj4string(white_veg)))

The bit of this project that took me some time to work out was how to compute whether a data point from Sankaran et al. fell into a polygon of a certain type in White's map. I ended up using over() from the sp package. Then I can add that data back into the original Sankaran dataset:

    # Add vegetation class column by referencing White map
    cover_veg_class <- cover %>%
      mutate(veg_class = over(cover_spoints, white_veg)$DESCRIPTIO)

Finally, I can use cover_veg_class to create a ggplot of MAP vs woody cover, with the points coloured according to which of White's vegetation classes the point is in:

    # Plot with vegetation classification from White (1983) ----
    veg_class_plot <- ggplot(cover_veg_class,
                             aes(x = map_mm, y = woody_cover_per, colour = veg_class)) +
      geom_point(size = 4) +
      guides(colour = guide_legend(title = "Vegetation Class")) +
      xlab("MAP (mm)") +
      ylab("Woody Cover (%)")

    # Print the plot
    veg_class_plot

![MAP vs. woody cover with coloured points by vegetation type](https://johngodlee.xyz/img_full/sankaran/map_wood_col.png)
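
To get a feel for what over() hands back, here is a minimal toy sketch that only assumes the sp package is installed. The two squares, the three points and the "Woodland" / "Grassland" labels are all made up for illustration; only the DESCRIPTIO column name is borrowed from White's shapefile.

```r
library(sp)

# Two made-up, non-overlapping unit squares standing in for vegetation polygons
# (rings are closed and listed clockwise, so sp treats them as filled, not holes)
sq_a <- Polygons(list(Polygon(cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0)))), ID = "a")
sq_b <- Polygons(list(Polygon(cbind(c(2, 2, 3, 3, 2), c(0, 1, 1, 0, 0)))), ID = "b")

# Attach a made-up attribute table; row names must match the polygon IDs
toy_polys <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(sq_a, sq_b)),
  data = data.frame(DESCRIPTIO = c("Woodland", "Grassland"),
                    row.names = c("a", "b"),
                    stringsAsFactors = FALSE))

# Three made-up points: one inside each square and one outside both
toy_pts <- SpatialPoints(cbind(lon = c(0.5, 2.5, 5), lat = c(0.5, 0.5, 5)))

# over() returns one row of polygon attributes per point
toy_class <- over(toy_pts, toy_polys)$DESCRIPTIO
toy_class
```

The third point falls outside both squares, so its class comes back as NA. The same thing can happen with real data if a site sits just off the edge of the map, so it may be worth running something like `sum(is.na(cover_veg_class$veg_class))` before plotting.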