TITLE: Writing ggplot2 grobs in a loop to maintain data values
DATE: 2019-10-15
AUTHOR: John L. Godlee
====================================================================


Sometimes I encounter the need to make multiple plots in a loop and 
then arrange them in a grid.arrange() panel. A recent example was 
when I had a bunch of ANCOVA linear models in a list, each with a 
different predictor variable, and I wanted to create ggplot() 
objects for each linear model in a for loop, showing the 
distribution of points for the predictor and response variable and 
slopes for a categorical grouping variable, then I wanted to put 
each plot into a single grid image and export it, like the image 
below:

  ![Plots of biomass across different vegetation 
clusters](https://johngodlee.xyz/img_full/ggplot_loop/biomass.png)

The code to generate this involves creating a list for the plot 
objects, then filling the list with each ggplot() called within a 
for loop, where each iteration of the loop is a different linear 
model with a different predictor variable, then calling the list of 
plots with do.call("grid.arrange", c(plot_list)) to build the grid 
image ready to be exported. To access the data needed for the x 
axis of the plot, which changes with each plot/linear model, I took 
the variable name from the linear model it is based on with:

    x_var <- rownames(summary(model_list[[i]])$coefficients)[2]

This means the ggplot() for the scatter plots can then be called 
like this:

    ggplot() + 
        geom_point(aes(x = df[,x_var], y = df[,bchave_log])

You would think this works fine, but when the ggplot() object is 
called again outside the loop, x_var is nowhere to be found, as it 
only exists inside the loop. So instead I had to build each 
ggplot() into a ggplot grob inside the loop where x_var still 
exists before plotting it outside the loop. Building a grob from a 
ggplot() object simply requires wrapping it in 
ggplot_gtable(ggplot_build(...)). Below is an example with inbuilt 
data to illustrate the point:

    # Example data
    df <- mtcars

    # Run a for loop to plot each column aginst column 1
    plot_list <- list()

    for(i in 1:length(df)){
      plot_list[[i]] <- ggplot() + geom_point(aes(df[,i], df[,1]))
    }

    do.call("grid.arrange", c(plot_list, ncol = 2, nrow = 6))

    ##' All the plots are the same because `df[,i]` takes the 
    ##' value from the final loop iteration.

    # Fix the ggplot objects in the loop so they maintain variable 
values
    for(i in 1:length(df)){
      plot_list[[i]] <- ggplot_gtable(ggplot_build(ggplot() + 
          geom_point(aes(df[,i], df[,1]))))
    }

    do.call("grid.arrange", c(plot_list, ncol = 2, nrow = 6))

First, ggplot_build() takes the plot object and produces an object 
that can be rendered as a standalone, with a list of dataframes for 
each plot layer (points, lines) and a panel object containing 
metadata for the plot like axis limits and themes.

Then, ggplot_gtable() builds a grob which can display the image and 
stores it as a gtable. This bit is necessary so that grid.arrange() 
can draw the plot.