
Running Preset Certified Superset on OpenShift

Preset Team

Enterprise teams running Red Hat OpenShift have made a deliberate bet on a container platform that meets their security, compliance, and operational requirements. Now those same teams can bring modern, open-source business intelligence into that environment with Preset Certified Superset (PCS) — a hardened, QA-approved distribution of Apache Superset backed by the team that builds and maintains it.

Why OpenShift for BI?

Organizations that standardize on OpenShift typically operate in regulated industries — financial services, healthcare, government, manufacturing — where infrastructure choices are driven by security policy, not just developer preference. These teams need BI tooling that fits within their existing platform rather than requiring a separate stack or a SaaS exception.

Apache Superset is already container-native, but running it well at enterprise scale requires more than a helm install. It requires validated images, tested upgrade paths, proper operator integration, and support from people who understand both the platform and the product.

That is exactly what PCS provides.

Reference Architecture: Superset on OpenShift

A production-grade Superset deployment on OpenShift builds on the ecosystem of certified operators and platform services that OpenShift teams already know and trust.

Superset Application Tier

PCS ships as a set of validated container images — web server, worker, and beat scheduler — designed to run as standard OpenShift Deployments. These images are built from the same artifacts Preset uses internally, with security patches, dependency updates, and QA validation applied on a bi-weekly cadence.

Key considerations for OpenShift deployments:

  • Security Context Constraints (SCCs): PCS images are designed to run as non-root with restricted SCCs, aligning with OpenShift's default security model
  • Image provenance: Signed, scannable images suitable for environments that enforce image policies through Red Hat Quay or similar registries
  • Resource management: Tested resource requests and limits for predictable scheduling on shared clusters
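As a rough illustration of the considerations above, the sketch below builds a Deployment manifest for the web tier as a plain Python dict and prints it as JSON. The image reference, object names, and resource figures are placeholders chosen for the example, not the values shipped or tested with PCS; port 8088 is Superset's usual default.

```python
import json


def superset_web_deployment(image: str, replicas: int = 2) -> dict:
    """Build a Deployment manifest for the Superset web tier as a plain dict."""
    labels = {"app": "superset", "component": "web"}
    container = {
        "name": "superset-web",
        "image": image,
        "ports": [{"containerPort": 8088}],
        # Settings intended to be compatible with a restricted SCC: no root,
        # no privilege escalation, no extra Linux capabilities. No fixed
        # runAsUser is set, so OpenShift can assign a UID from the
        # namespace's allowed range.
        "securityContext": {
            "runAsNonRoot": True,
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
            "seccompProfile": {"type": "RuntimeDefault"},
        },
        # Placeholder sizing for illustration only.
        "resources": {
            "requests": {"cpu": "500m", "memory": "1Gi"},
            "limits": {"cpu": "2", "memory": "4Gi"},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "superset-web", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference; substitute the one supplied with PCS.
    manifest = superset_web_deployment("registry.example.com/pcs/superset:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with the usual cluster tooling. The worker and beat scheduler would follow the same shape with a different command.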

PostgreSQL as the Metadata Store

Superset requires a relational database for its metadata — dashboards, charts, users, permissions. On OpenShift, the natural choice is a PostgreSQL instance managed by one of the certified operators:

  • Crunchy PGO (Crunchy Data PostgreSQL Operator): Provides automated failover, backups (pgBackRest), connection pooling (PgBouncer), and monitoring — all managed through Kubernetes-native CRDs
  • EDB Postgres for Kubernetes: Another OperatorHub-certified option with enterprise support and Oracle compatibility features

Both operators handle the operational overhead that traditionally falls on the BI team: automated backups, high availability, rolling upgrades, and TLS encryption in transit.
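Whichever operator is used, Superset reaches the metadata store through its SQLALCHEMY_DATABASE_URI setting. The following is a minimal sketch of assembling that URI from environment variables, for example ones populated from the Secret an operator generates. The variable names (PG_USER, PG_PASSWORD, PG_HOST, PG_PORT, PG_DATABASE) are assumptions made for this example; map them to whatever keys your operator's Secret actually exposes.

```python
import os
from urllib.parse import quote_plus


def metadata_db_uri(env=os.environ) -> str:
    """Assemble a PostgreSQL SQLAlchemy URI from environment-style settings."""
    # Credentials are URL-quoted so characters such as "@" or "/" in a
    # generated password do not corrupt the URI.
    user = quote_plus(env["PG_USER"])
    password = quote_plus(env["PG_PASSWORD"])
    host = env["PG_HOST"]
    port = env.get("PG_PORT", "5432")
    database = env.get("PG_DATABASE", "superset")
    # sslmode=require asks the client to insist on an encrypted connection.
    return (
        f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"
        "?sslmode=require"
    )


# In superset_config.py this would typically be used as:
#   SQLALCHEMY_DATABASE_URI = metadata_db_uri()
```

Keeping the credentials in environment variables means the same image can run in every namespace, with only the mounted Secret changing.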

Redis for Caching and Async Queries

Superset uses Redis to cache query results and as the message broker and results backend for its Celery-based async query pipeline. On OpenShift, Redis can be deployed via:

  • Redis Enterprise Operator: For teams that need persistence, active-active geo-replication, or fine-grained access control
  • Community Redis deployments: Managed through standard OpenShift Deployments or StatefulSets for simpler use cases

A properly configured Redis layer is what makes Superset feel fast at scale — caching dashboard results, storing filter state, and preventing redundant queries against your analytics databases.
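A sketch of what this looks like in superset_config.py follows. The setting names (CACHE_CONFIG, DATA_CACHE_CONFIG, FILTER_STATE_CACHE_CONFIG, CELERY_CONFIG) are Superset's own. The Redis host, database numbers, key prefixes, and timeouts are illustrative assumptions to tune for your workload.

```python
import os

# Hypothetical environment variables pointing at the in-cluster Redis Service.
REDIS_HOST = os.environ.get("REDIS_HOST", "redis")
REDIS_PORT = os.environ.get("REDIS_PORT", "6379")


def redis_url(db: int) -> str:
    """Return a Redis URL for the given logical database number."""
    return f"redis://{REDIS_HOST}:{REDIS_PORT}/{db}"


def redis_cache(prefix: str, timeout_seconds: int, db: int = 1) -> dict:
    """Flask-Caching settings for one Redis-backed Superset cache."""
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_KEY_PREFIX": prefix,
        "CACHE_DEFAULT_TIMEOUT": timeout_seconds,
        "CACHE_REDIS_URL": redis_url(db),
    }


# Metadata cache, chart/dashboard query results, and dashboard filter state.
CACHE_CONFIG = redis_cache("superset_meta_", 300)
DATA_CACHE_CONFIG = redis_cache("superset_data_", 3600)
FILTER_STATE_CACHE_CONFIG = redis_cache("superset_filter_", 86400)


class CeleryConfig:
    # Redis serves as both the Celery broker and the task result backend.
    broker_url = redis_url(0)
    result_backend = redis_url(0)
    imports = ("superset.sql_lab",)


CELERY_CONFIG = CeleryConfig
```

Distinct key prefixes keep the caches separable even when they share one Redis database, which makes selective invalidation easier.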

Connecting to Your Analytics Databases

Superset's strength is its broad database connectivity. For teams in the OpenShift and IBM ecosystem, several integrations are particularly relevant:

IBM Db2

Many organizations running OpenShift have significant investments in Db2, whether as a transactional system or a data warehouse. Superset connects to Db2 through ibm_db_sa, IBM's official, actively maintained SQLAlchemy dialect for the ibm_db driver, so analysts can build dashboards and run SQL Lab queries directly against Db2 without moving data into yet another system.

Superset includes a dedicated Db2 engine spec with support for time grain expressions, schema switching, SSH tunneling, and query cancellation. A few practical notes for OpenShift deployments:

  • Container base image: The ibm_db driver bundles IBM's CLI/ODBC driver, which requires glibc. Use Red Hat UBI (Universal Base Image) or Debian-based images — not Alpine
  • Architecture: The driver ships pre-built wheels for Linux x86_64. Ensure your OpenShift worker nodes run x86_64 for Superset pods that need Db2 connectivity
  • Db2 on z/OS or IBM i: Requires a Db2 Connect license, either activated server-side or via a license file in the container's clidriver/license/ directory
  • SSL/TLS: Supported through connection string parameters for encrypted connections to Db2 instances
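The notes above can be turned into a small preflight check plus a connection-URI helper, sketched below. The helper names and the sample host are hypothetical, and 50000 is used only as Db2's customary default port. TLS keywords are deliberately left out because their exact spelling in the connection string should be taken from the ibm_db_sa documentation rather than from this sketch.

```python
import platform
from urllib.parse import quote_plus


def db2_driver_preflight(machine=None, libc=None) -> list[str]:
    """Return a list of problems that would likely stop the Db2 driver loading."""
    machine = machine if machine is not None else platform.machine()
    libc = libc if libc is not None else platform.libc_ver()[0]
    problems = []
    if machine not in ("x86_64", "AMD64"):
        problems.append(f"architecture {machine!r} is not x86_64")
    if libc != "glibc":
        # musl-based images such as Alpine do not report glibc here.
        problems.append("C library is not glibc; use a UBI or Debian base image")
    return problems


def db2_uri(user: str, password: str, host: str, database: str, port: int = 50000) -> str:
    """Build a SQLAlchemy URI for Db2 using the ibm_db_sa dialect."""
    return (
        f"db2+ibm_db://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    for problem in db2_driver_preflight():
        print("warning:", problem)
    # Hypothetical host and credentials, for illustration only.
    print(db2_uri("analyst", "s3cret!", "db2.example.internal", "SALES"))
```

Running the preflight at container start gives a clearer failure message than an import error from the driver's native library.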

This is especially valuable for teams that have been using Cognos or other IBM BI tools and are looking for a modern, open-source alternative that works with their existing data infrastructure.

Other Common Connections

Beyond Db2, Superset supports a wide range of SQL-speaking databases. In OpenShift environments, we commonly see connections to:

  • PostgreSQL / Amazon Redshift — Often running alongside OpenShift in hybrid cloud setups
  • Apache Hive / Trino / Presto — For data lake query patterns
  • Snowflake / BigQuery / Databricks — Cloud analytics platforms accessed from on-prem OpenShift clusters
  • ClickHouse — For real-time analytics workloads, increasingly deployed on Kubernetes

Networking and Security

OpenShift provides a rich set of networking primitives that map well to Superset's architecture:

  • OpenShift Routes with TLS termination for the Superset web UI
  • NetworkPolicies to isolate Superset pods, restricting traffic between the web tier, workers, Redis, and PostgreSQL
  • Service mesh integration (OpenShift Service Mesh / Istio) for mTLS between services and fine-grained traffic policies
  • OAuth / OIDC integration through OpenShift's built-in OAuth server, or direct integration with enterprise identity systems such as Azure AD, Okta, or an LDAP directory

Superset's authentication layer, built on Flask-AppBuilder, supports LDAP and OAuth 2.0 (including OpenID Connect providers), making it straightforward to integrate with the same identity infrastructure that governs access to the OpenShift cluster itself.
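To make the identity integration concrete, here is a hedged sketch of an OAuth/OIDC block for superset_config.py. AUTH_TYPE, AUTH_USER_REGISTRATION, AUTH_USER_REGISTRATION_ROLE, and OAUTH_PROVIDERS are Flask-AppBuilder settings. The environment variable names, the issuer URL, and the choice of "okta" as provider are assumptions for the example.

```python
import os

# In superset_config.py this constant is normally imported with
#   from flask_appbuilder.security.manager import AUTH_OAUTH
# It is spelled out here so the sketch runs without Superset installed.
AUTH_OAUTH = 4


def oidc_provider(name: str, issuer_url: str, client_id: str, client_secret: str) -> dict:
    """Describe one OAuth provider entry that uses OIDC discovery."""
    return {
        "name": name,
        "icon": "fa-key",
        "token_key": "access_token",
        "remote_app": {
            "client_id": client_id,
            "client_secret": client_secret,
            # Standard OIDC discovery document under the issuer URL.
            "server_metadata_url": issuer_url.rstrip("/")
            + "/.well-known/openid-configuration",
            "client_kwargs": {"scope": "openid email profile"},
        },
    }


AUTH_TYPE = AUTH_OAUTH
# Create a Superset account on first login, with a low-privilege default role.
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Gamma"

OAUTH_PROVIDERS = [
    oidc_provider(
        name="okta",
        # Hypothetical variables, typically injected from a Secret.
        issuer_url=os.environ.get("OIDC_ISSUER", "https://idp.example.com"),
        client_id=os.environ.get("OIDC_CLIENT_ID", "superset"),
        client_secret=os.environ.get("OIDC_CLIENT_SECRET", "change-me"),
    )
]
```

Depending on the provider, mapping the returned claims to Superset user fields may also need a custom security manager, which this sketch does not cover.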

Observability

Superset emits metrics via StatsD. Routed through a StatsD-to-Prometheus exporter, these metrics can feed OpenShift's built-in monitoring stack (Prometheus + Grafana). This gives operations teams visibility into:

  • Query execution times and throughput
  • Cache hit rates
  • Celery worker queue depth and task latency
  • Application errors and health checks
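For readers unfamiliar with StatsD, the sketch below shows the plain-text wire format that a StatsD receiver, such as an exporter sitting in front of Prometheus, would see. It is a standalone illustration, not Superset's own logger; in Superset, emission is enabled through the STATS_LOGGER setting. The metric name, host, and port here are made up for the example.

```python
import socket


def statsd_line(name: str, value, kind: str, prefix: str = "superset") -> bytes:
    """Format one StatsD datagram: <prefix>.<name>:<value>|<type>."""
    # "c" is a counter, "ms" a timer in milliseconds, "g" a gauge.
    if kind not in ("c", "ms", "g"):
        raise ValueError(f"unsupported StatsD metric type: {kind!r}")
    return f"{prefix}.{name}:{value}|{kind}".encode("ascii")


def send_metric(line: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send a datagram over UDP; StatsD is fire-and-forget, so no reply is read."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line, (host, port))


if __name__ == "__main__":
    # Hypothetical metric name, used only to show the format.
    line = statsd_line("example.query_time", 42, "ms")
    print(line.decode("ascii"))
    send_metric(line)
```

Because the protocol is UDP-based, a missing or restarted receiver does not block the application; metrics sent during that window are simply lost.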

For teams using the OpenShift logging stack (Loki or EFK), Superset's structured logs can be collected and correlated with platform-level events.

Migrating from Legacy BI

For teams currently running Cognos, MicroStrategy, or other legacy BI platforms, especially those reevaluating their tooling in response to IBM's evolving product strategy, PCS on OpenShift provides a practical migration path:

  1. Deploy alongside existing tools: Superset runs in its own namespace, with no impact on existing workloads
  2. Connect to existing databases: Point Superset at the same Db2, PostgreSQL, or data warehouse instances your current BI tool uses
  3. Migrate dashboards incrementally: Start with new dashboard requests in Superset while maintaining existing reports
  4. Leverage Preset's migration expertise: The PCS engagement includes hands-on support for teams transitioning from legacy platforms

What You Get with PCS on OpenShift

| Capability | Details |
| --- | --- |
| Validated images | Bi-weekly releases, security-patched, non-root compatible |
| Deployment artifacts | Helm charts, Kustomize overlays, and reference configurations tested on OpenShift |
| Expert support | Direct access to Superset engineers for architecture review, troubleshooting, and upgrades |
| Feature development | Custom connectors, plugins, or governance features built to your requirements |
| Upgrade path | Tested migration scripts and guidance for major version upgrades |
| Training | Onboarding for platform teams, data engineers, and business users |

Preset Certified Superset brings enterprise-grade Apache Superset to your OpenShift environment: hardened, supported, and backed by the team that builds it.
