Using R and Python in Google Sheets Formulas

Posted by Sam Terfa on Wednesday, May 20, 2020

Photo by <a href="https://unsplash.com/@kmuza?utm_source=medium&amp;utm_medium=referral">Carlos Muza</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a>

Photo by Carlos Muza on Unsplash

Google Sheets are Amazing!

Google’s version of spreadsheets called Google Sheets has a hidden gem which significantly increases the power of your spreadsheets called Google Apps Script. Google Apps Script uses Javascript to call other Google services, create interactive pop-up dialogs, and even make API calls. With a bit of setup, you can create spreadsheet formulas for others to use that execute R, Python, or practically any programming language code! While Shiny and Dash are great ways to share your work, following the instructions in this article, you will be able to create R and Python dashboards directly inside Google Sheets. As a bonus, these same methods allow you to insert R and Python functionality into other Google services such as Docs, Forms and Slides!

Here are a Few Examples

  1. Matrix Multiplication

Image by author

2. Plotting (base R)

Image by author

3. Plotting (ggplot2)

Image by author

4. Deploying a Model

Image by author

5. Any R Script!

Image by author

What is this Magic???

Under the hood, each custom function is a Google Apps script that makes an API call to an endpoint I have set up in Google Cloud Run. If you can set up API endpoints that run R, Python, or any other language scripts, you can import those powers into a Google spreadsheet. This is phenomenal!

Try it Out

The following 6 steps are arranged from easiest to hardest and will simply get you up and running with your own custom spreadsheet function powered by R or Python APIs. There is a lot more to Google Apps script, creating your own APIs, and Google Cloud Run which can be covered in another post.

  1. Create a Spreadsheet in Google Drive

  2. Tools menu-> Script Editor

Image by author

  1. Copy and paste the following code into the editor. Adapt to your needs.

     /*
     * Run an R, Python, or other language script on spreadsheet data.
     *
     * [@param](http://twitter.com/param) var1 The value or cell contents to pass to your API.
     * [@param](http://twitter.com/param) var2 The value or cell contents to pass to your API.
     * [@return](http://twitter.com/return) The results of the API call you're about to make.
     * [@customfunction](http://twitter.com/customfunction)
     */
    function myCustomFunction(var1 = null, var2 = null){
         
      // Assuming your API endpoint is like {baseURL}/{endpoint}.
      var baseURL = '(Copy and paste API base url w/o trailing /)';
      var endpoint = 'myCustomFunction';
         
      // Encode the variable values as JSON (or XML, or something else). 
      // See Google Apps Script UrlFetchApp documentation.
      var data = {
        'var1': var1,
        'var2': var2,
      }
    
    // Set up the API call. Use POST requests to pass variables.
    // You can pass variables as query params of GET requests instead.
      var options = {
        'method' : 'post',
        'contentType': 'application/json',
        'payload' : JSON.stringify(data)
      };
         
      // Make the API call. NOTE: Trailing slashes are important!
      var response = UrlFetchApp.fetch(baseURL + '/' + endpoint + '/', options);
         
      // Parse the response.
      data = JSON.parse(response.getContentText());
        
      // I return "Error: {the error}" on script errors. 
      // Not necessary, but it shows useful error messages in cells.
      if(String(data).substring(0,6) == "Error:"){
        throw(String(data));
      }
        
      return(data);
    }
    
  2. Create a plain text file with no extension named “Dockerfile”. Put it in its own folder on your hard drive. Don’t get nervous; you do not need Docker on your system to follow these steps (but you should absolutely be using Docker in your daily work!)

Copy and paste one of the following into your Dockerfile.

  • R Example 1: Here’s a Dockerfile for an R API using the amazing Plumbr Package. This does not come with Tidyverse, but it creates slim images.

      FROM trestletech/plumber
    
      COPY [".", "./"]
    
      ENTRYPOINT ["R", "-e", "pr <- plumber::plumb(commandArgs()[4]); pr$run(host='0.0.0.0', port=as.numeric(Sys.getenv('PORT')), swagger = T)"]
    
      CMD ["Plumber.R"]
    
  • R Example 2: Here’s a Dockerfile for an R API which DOES come with Tidyverse and Tensorflow (based on the rocker/ml image). This creates bloated images and comes with RStudio. It isn’t meant for production.

      FROM samterfa/rmlsheets
    
      COPY [".", "./"]
    
      ENTRYPOINT ["Rscript", "-e", "pr <- plumber::plumb(commandArgs()[9]); pr$run(host='0.0.0.0', port=as.numeric(Sys.getenv('PORT')), swagger = T)"]
    
      CMD ["Plumber.R"]
    
  • Python Example: Here’s a Dockerfile using the aptly named Fast API .

      FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
    
      RUN pip install pandas
    
      COPY [".", "./"]
    
  1. Create the scripts that will power your custom function. Add the following file to the same folder as your Dockerfile. Note that cell references from Google Sheets are passed in as single values for individual cells, or as nested lists for multiple cells.
  • R API File: Create a file named Plumber.R. You can copy and paste the following as a sample.

      # Swagger docs at ...s/__swagger__/ (needs trailing slash!)
      if(Sys.getenv('PORT') == '') Sys.setenv(PORT = 8000)
    
      #' [@apiTitle](http://twitter.com/apiTitle) R Google Sheets Formulas
      #' [@apiDescription](http://twitter.com/apiDescription) These endpoints allow the user to create custom functions in Google spreadsheets which call R functions.
    
      #* Return the product of 2 matrices
      #* [@param](http://twitter.com/param) var1 An array of values representing the first matrix.
      #* [@param](http://twitter.com/param) var2 An array of values representing the second matrix.
      #* [@post](http://twitter.com/post) /myCustomFunction/
      function(var1, var2){
          
        err <- tryCatch({
            
          return(data.matrix(var1) %*% data.matrix(var2))
            
        }, error = function(e) e)
          
        return(paste0('Error: ', err$message))
      }
    
      #* Confirmation Message
      #* [@get](http://twitter.com/get) /
      function(msg=""){
        "My API Deployed!"
      }
    
  • Python API File: Create a file named Main.py. You can copy and paste the following as a sample.

      from fastapi import FastAPI
      from pydantic import BaseModel
    
      class Matrices(BaseModel):
          var1: list
          var2: list
            
      app = FastAPI()
    
      [@app](http://twitter.com/app).post("/myCustomFunction/")
      def myCustomFunction(matrices: Matrices):
          
        import sys
        import numpy as np
    
        try:
            
          var1 = np.matrix(matrices.var1)
          var2 = np.matrix(matrices.var2)
            
          results = np.matmul(var1, var2)
            
          return np.array(results).tolist()
          
        except:
    
          e_type, e_value, e_traceback = sys.exc_info()
            
          return 'Error: ' + str(e_type) + ' ' + str(e_value)
    
      @app.get("/")
      def myAPIdeployed():
    
          return "My API Deployed!"
    
    NOTE: To run an arbitrary R or Python script inside a Google Sheet, the script needs to determine whether passed arrays include column names or not. The code is a bit unwieldy but I’ve included an R attempt at the end of this article.
  1. Deploy your API (to Google Cloud Run) If you already know how to deploy an R or Python API, you’ve probably already stopped reading this article and are off to the races! For the rest of us, I will guide you through using Google Cloud Run to host your API.

Mark Edmondson has created a phenomenal R package called googleCloudRunner. This package handles auth, creates builds, and deploys containers to Google Cloud Run. It’s truly remarkable. If you don’t have R installed, you can follow these project setup instructions, and these deployment instructions keeping in mind that we are trying to deploy our Dockerfile from above. From here on, I will assume you have R running and googleCloudRunner installed.

  • Go to https://console.cloud.google.com/

  • Agree to Terms and Services if you’ve never created a project before.

  • Click Select a Project and then NEW PROJECT.

  • Name & create the project. I like Rscripts, Pyscripts, or something like that.

  • Make sure the project name is in the upper left dropdown.

  • Click on the menu to the left of the dropdown, click Billing, and then Add billing account. I promise, it’s very cheap!

Image by author

  • For the rest of the set up, follow Mark Edmondson’s instructions here.

  • Identify the path to the folder containing your Dockerfile and API file. Let’s call it “/users/me/my/api/path”.

  • Using R, run the following to deploy your API! The name of the service created inside Google Cloud Run will be the name of the last folder in the folder path, so make sure you like the name and that its name is lowercase.

      googleCloudRunner::cr_deploy_run(local = "/users/me/my/api/path")
    
      # Wait, that's it? Just that?
    
  • A browser will open showing you the progress of your build. If there is an error, you can use the log messages to determine what went wrong. Common issues are missing service account permissions and code problems. If you are comfortable with Docker, you can test locally to make sure there are no code problems. If successful, you should see a message (if you copy and pasted my code).

Image by author

  • Once it’s launched, you need to copy and paste your API URL from above into your Google Apps Script.

Image by author

  • Try it out! Create two numerical arrays in a Google Sheet which could multiply as matrices, start typing “=myCustomFunction(”, highlight the cells, and be impressed with yourself.

Image by author

Resources

  1. Free API Swagger webpagesThese are automagically generated for you because you used the R package Plumber or the Python-based FastAPI.
  • R version is at {YourBaseURL}/swagger/

Image by author

  • Python version is at {YourBaseURL}/docs

Image by author

  1. samterfa/rpysheets GitHub Repo This repo contains all the files listed above plus some bonus material. The bonus material includes the generalized rscript function, as well as the MNIST digit prediction function. If you want to use prediction with large models, you do face a 30 second time limit by Google Sheets. You may also need to beef up your Google Cloud Run service.