Photo by Lukas Blazek on Unsplash
Last Time
This is part 2 in my n-part series on using R and Python functions inside Google Sheets formulas. If you haven’t read part 1, Using R and Python in Google Sheets Formulas, I suggest doing so to see all the magic available to you. In summary, you can use R and Python code to power custom Google Sheets cell formulas by setting up a simple API in Google Cloud Run and creating Google Apps Script functions to call your API. After the initial set up, it is quick and easy to use R and Python in any Google Sheet.
Here are some highlights.
Plotting (and more) in Google Sheets
As I said previously, I think Google Sheets are amazing. I stand by that statement with the caveat that graphing in Google Sheets is pretty bad. It took the longest time to even have a reasonable linear regression available, and the graph options are pretty limited. However, now you can use R and Python to create beautiful graphs on the fly inside Google Sheets by using techniques developed previously. In fact, many functions that produce image files can be used inside Google Sheets formulas with minimal additional set up.
Here are some examples.
- base R Plotting
- ggplot2 Plotting
- Python Plotting Using matplotlib
- k-means Cluster Plotting
- Word Clouds
How Does this Work?
The Google Sheets function IMAGE takes the URL of an image as an argument and displays it inside a cell. Based on techniques from part 1, the custom Google Apps Script functions rplot and pyplot make API calls to my /rplot and /pyplot endpoints set up on Google Cloud Run. These API calls run the rscript and pyscript arguments as code, generate a PNG of the output, upload the PNG to imgur.com via their API, and return the URL of the uploaded image.
The easiest way to get started with using rplot or pyplot inside Google Sheets is to clone my rpysheets GitHub repo and use the GoogleCloudRunner R package to deploy a Google Cloud Run service for the rsheets_plotting directory and the pysheets_plotting directory. Then copy and paste the Google Cloud Run service URLs into this Google Sheets template and play.
googleCloudRunner::cr_deploy_run('{LocalPath}/rsheets_plotting/')
googleCloudRunner::cr_deploy_run('{LocalPath}/pysheets_plotting/')
The Details
The arguments rscript and pyscript are the work horses of the /rplot and /pyplot endpoints, and each comes with its own endpoint as well. The rscript argument is read in as a string and run using R’s eval() function, while pyscript is read in and run using a mix of Python’s exec() and eval() functions. I am aware that it is dangerous to allow an untrusted source to pass in code to an eval() or exec() function. Depending on your use case, I would consider how you might modify the code I’ve written to safeguard these functions.
Each variable passed into rscript and pyscript is identified and cast as a dataframe with column names, array, vector, or single value. rscript and pyscript can modify these variables and access built in values as well, but must result in the production of an image. Since R can capture plots as PNG files, this is taken care of behind the scenes, so all that is required is that a plot is attempted. However, since Python does not have a consistent way of capturing plots to PNG that I am aware of, the pyscript code must explicitly save a figure as a PNG file. The first PNG found in the working directory is uploaded to imgur.com behind the scenes.
rplot Google Sheets Formula
=IMAGE(rplot("g <- ggplot(data = a, mapping = aes(x,y)) + geom_point() + geom_smooth(color = 'red') + ggtitle('Sample ggplot') + theme_minimal(); plot(g)", B2:C12))
pyplot Google Sheets Formula
=IMAGE(pyplot("import matplotlib.pyplot as plt; plot = a.plot(x = 0, y = 1, kind = 'scatter', title = 'Sample Python Scatterplot'); fig = plot.get_figure(); fig.savefig('output.png')", B2:C12))
Image Hosting
For both /rplot and /pyplot, I have utilized imgur.com image hosting. You can use a different image hosting site if you like, but to use what I have done above, you will need an imgur Client ID. To obtain one, create an imgur account, and add a client to your account. Once you have the client ID, you can copy and paste it into the above scripts, or you can instead create a .creds directory with the file imgur.json** **inside defined in the following way.
{"clientID":"{YourImgurClientID}"}
I set up my client with Authorization type “Anonymous usage without user authorization” and simply upload publicly available images for demonstration purposes. You may choose to instead upload to a private album. While every update of a graph on a Google Sheet generates a unique imgur URL for the uploaded image, which is wasteful, each client can upload up to 1,250 images per day which has been enough for me. Should you use this in production, you will want to modify images when they change on update versus uploading a different image each time. Keep in mind that if you want to make money using imgur.com, you have special requirements and it is not free.
Resources
-
Here is the Google Sheet you can simply copy and use with custom Google Sheets functions pre-defined for pyscript, pyplot, rscript, and rplot. All that you need to do is paste in your R and Python Google Cloud Run URLs once they are set up.
-
Here is my GitHub repo with the R and Python code referenced above.
Enjoy!