Superset Announces Elasticsearch Support!

At Preset, we love open source.

The Superset open-source community is active with daily contributions flowing in from passionate contributors around the world. New features, fixes, code improvements or just quick asks for help - this is open source at it's best. Together, we generate a positive feedback loop that pushes innovation, quality and an inclusive community.

Before we begin, thanks to everyone from the Superset community who requested support for Elasticsearch. Your feedback made it clear that this is an important integration, and we’re excited to share what we’ve built.

Elasticsearch support for SQL Queries

You may already have heard, but Elastic recently released support for SQL queries. SQL being as widespread as it is, this enables all sorts of new integrations.

This got us thinking - what could be the simplest and best solution to add Elasticsearch support on Superset?

Connecting Python and Elasticsearch

Superset is written in Python and uses SQLAlchemy to connect to SQL-speaking databases. Until now, there was no option to connect to Elasticsearch from SQLAlchemy.

We chose the following process for building this:

Implement a python database driver for Elasticsearch that follows the DBAPI specification
Create a SQLAlchemy dialect for Elasticsearch. For context, SQLAlchemy a popular SQL toolkit and Python Object Relationship Mapper (ORM)
Open-source the code for all this
Publish 1 and 2 as a Pypi (Python Package Index) package

With this driver and dialect in the open, Elasticsearch is now accessible to a wide range of existing python applications (including Superset!). Below, we have provided a quick guide to installation with code snippets.

Let's take a look on how to use it.

Installation

pip install elasticsearch-dbapi

If you have an available elasticsearch cluster you can use it, if not you can clone the project and start one (Kibana included):

git clone git@github.com:preset-io/elasticsearch-dbapi.git

cd elasticsearch-dbapi

docker-compose up -d

Now, wait a few seconds and then:

Access Kibana (http://localhost:5601) to import the provided sample Flights data.

Finally, write a simple python script to query our cluster:

from es.elastic.api import connect

conn = connect(host='localhost')

curs = conn.cursor().execute(
    "select * from kibana_sample_data_flights LIMIT 10"
)

print([row for row in curs])

How easy was that?

As a follow up, if you're curious on how to write a Python Database API Specification (dbapi) check out PEP-249.

Integrating Python, Elasticsearch and Apache Superset

Now that you have set up your Python and Elasticsearch, it's time to integrate with Superset. Superset implements an abstract database (DB) specification. This generic DB integration can be specifically implemented to support any kind of DB that has a python driver and is compliant with PEP-249. This specification creates an abstract layer between Superset and any DB engine.

Using this set up, let's see what flight carrier has canceled more flights.

First, let's look at all of the data from the Flights dataset charts and dashboard.

Screenshot of What You can Build with Elasticsearch and Superset Using Flights Dataset

Navigate to SQL Lab.

Protip: there is no concept of schema on Elasticsearch so we expose a "default" schema

It's also important to remember that Indexes are treated like Relational Database Management System (RDBMS) tables.

SQL querying which flight carriers have the most cancelled flights

"JetBeats" is the dreaded winner. Happy Flying and thanks for reading our how to guide and example.

Here are some additional resources for you:

The pull request adding Elasticsearch support to Superset
Learn more and star the Apache Superset Open Source Project
Create your first Dashboard in Apache Superset
The elasticsearch-dbapi Python package, and the related repository on Github
Contribute to Apache Superset

Preset Cloud

Managed Private Cloud

Preset Embedded Dashboards

Preset API

Business Intelligence (BI)

Internal Tooling

Customer-facing Apps

Blog

Documentation

Events

Podcast

What is Superset?

Customers

Superset Announces Elasticsearch Support!

Elasticsearch support for SQL Queries

Connecting Python and Elasticsearch

Installation

Integrating Python, Elasticsearch and Apache Superset

Let’s get visual. Try Preset today.
Start for free

Superset Announces Elasticsearch Support!

Elasticsearch support for SQL Queries

Connecting Python and Elasticsearch

Installation

Integrating Python, Elasticsearch and Apache Superset

Let’s get visual. Try Preset today.Start for free

Subscribe to our blog updates

Let’s get visual. Try Preset today.
Start for free