No Code

Getting Started - Connect Superset To Google Sheets

Tai Dupree

Utilizing an external database such as Druid or Postgres requires that you have access to such a database or that you spin one up. Although docker makes this easier than ever, it can still require some technical knowledge and time that could be spent evaluating Superset. In this post, we'll cover how to install the gsheetsdb driver that can be used to query data in Google Sheets.

If you haven't already done so, check out our previous posts in our Getting Started series: Getting Started - Installing Apache Superset and Getting Started - Installing Additional Drivers.

Now that you have Superset up and running on your machine and know how to install some database drivers, let's query some data from Google Sheets.

Credit

Most of the work making this blog post possible was contributed by Beto de Almeida, one of Superset's prolific contributors. Beto wrote the underlying DBAPI driver, the SQLAlachemy dialect as well as Superset's compatibility layer to make all of this possible. This blog post is simply showing how to make that integration work in your environment.

Installing the gsheetsdb driver in your local environment

As covered in Getting Started - Installing Additional Drivers we'll add a driver to the ./docker/requirements-local.txt file and rebuild the Superset docker container.

# Add the `gsheetsdb` driver
echo "gsheetsdb" >> ./docker/requirements-local.txt

# Rebuild your local image with the new driver baked in
docker-compose build --force-rm

# Fire things back up
docker-compose up

Adding a Google Sheet as a database in Apache Superset

Now that we have the required driver installed, we can add the database in the Superset UI. Navigate to add a new database connection (Source > Databases > +). Fill out the form by giving the database a name and entering the connection string gsheets://.

NOTE: Testing the connection will throw an error as we are not connecting to an actual database that can respond to our request.

Hit Save. You are now ready to start querying Google Sheets.

Running a query in SQL Lab

With the Google Sheets connection created you can now query any publicly available spreadsheet. Head over to SQL Lab (SQL Lab > SQL Editor). The creator of the gsheets driver, Beto De Almeida, has made a few example sheets available for testing which we will use.

  1. Select Google Sheets (or the name you gave it in the previous step) from the dropdown list of available databases on the left.
  2. Enter the following query in the query editor

    SELECT * FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/"

    and hit run. You should see results come back.

You have now successfully queried from Google Sheets! You can now make a copy of this sheet, make edits, find other publicly available sheets, or you can create your own (just make sure the sharing settings allow anyone with the link to view).

Creating a Chart

In order to create a Chart that uses Google Sheets as its datasource we will have to add the url as a datasource. Navigate to the Add Table form (Sources > Tables > +) and fill out the form by selecting the Google Sheets database and the url as the Table Name.

Hit Save. You are now ready to start creating charts against this datasource.

Head over to the chart creation page (New > Chart) and select the Google Sheets datasource and Bar Chart as the Visualization Type the hit Create New Chart, which should take you to the explore page. An error will appear on the right saying Control labeled "Series" cannot be empty, which is telling us that we must select a series. Find the Series control (which should be marked red) and select "Country", then hit Run Query and you should see a bar chart with the counts of each time a country appears in the dataset.

You can change the metrics and series on the left to find other interesting stories in your data. For example, try removing the COUNT(*) metric and adding SUM(cnt).

Congratulations you have now created a Chart from Google Sheets data! Using Google Sheets is an excellent way to get to visualizing something quickly, especially if you already have data stored in Google Drive or a csv file.

NOTE: Running queries against Google Sheets is significantly slower than against a database such as PostgreSQL or MySQL so you should only use this approach for smaller datasets.

If you'd like a more production-ready setup for Superset, we are buildling that hosted cloud solution at Preset. The solution is currently in alpha, and you can sign up to try it out here. It's a great opportunity for you to help shape this offering, influence our ambitious Superset roadmap, and access Preset’s expertise as an extension of your team.

If you have any other questions, hit us up at contact@preset.io or join our community Slack channel

Happy dashboarding!

Preset will be available soon. Be the first to know.