
Preset joins the Open Semantic Interchange initiative

Beto Dealmeida

If you've worked with data tools for any length of time, you've probably noticed the fragmentation problem. A metric defined in your transformation layer doesn't automatically translate to your BI tool. Your BI tool's understanding of that metric doesn't carry over to your AI assistants. Every time a new tool enters the stack, somebody ends up re-implementing business logic that already exists somewhere else.

This is why we're excited to announce that Preset is participating in the Snowflake-led Open Semantic Interchange (OSI) initiative. The goal is to create an open, vendor-agnostic specification for describing and exchanging semantic models — metrics, dimensions, and relationships — across tools. We'll be contributing to the workgroups defining the spec, helping shape what semantic interoperability looks like in practice.

But joining an industry initiative is only part of the story. We're also making fundamental changes to how Apache Superset handles semantic layers. These changes should make semantic layer support in general, and OSI interoperability in particular, work much better than it does today.

Why semantic layers have been awkward in Superset

If you've tried integrating a semantic layer with Superset, you've probably noticed the friction. Current integrations (like dbt or Cube) typically represent the semantic layer in Superset as a pseudo-database, which creates confusion. Users can select combinations of metrics and dimensions that aren't actually compatible. Features that work great with regular datasets, like adding custom metrics on the fly, don't make sense when the semantics are already defined elsewhere.

The core issue is that Superset datasets are also a semantic layer. When you add an external semantic layer as a dataset, you're stacking one semantic layer on top of another. That's awkward at best, broken at worst.

First-class semantic layer support in Superset

To address this, I've proposed SIP-182, a Superset Improvement Proposal that introduces first-class semantic layer support. The key idea is simple: instead of forcing semantic layers to pretend they're databases, we'll let them be what they are.

This means a new type of connection specifically for semantic layers, separate from database connections. It means our chart builder will respect the capabilities of the underlying system; if a semantic layer doesn't support custom metrics, you won't see that option. And it means the UI will be smarter about showing you which metrics and dimensions actually work together.
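To make the idea concrete, here is a minimal sketch of what a capability-aware connection could look like. It is an illustration only and is not taken from SIP-182: the `SemanticLayer` class, the `supports_custom_metrics` flag, the `compatible_dimensions` mapping, and the `chart_builder_options` helper are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticLayer:
    """Hypothetical stand-in for a semantic layer connection (not Superset's API)."""

    name: str
    supports_custom_metrics: bool
    # Maps each metric name to the dimensions it can be grouped by.
    compatible_dimensions: dict[str, frozenset[str]] = field(default_factory=dict)

    def valid_dimensions(self, selected_metrics: list[str]) -> frozenset[str]:
        """Return the dimensions that work with every selected metric."""
        if not selected_metrics:
            # Nothing selected yet: every known dimension is still a candidate.
            return frozenset().union(*self.compatible_dimensions.values())
        per_metric = [self.compatible_dimensions[m] for m in selected_metrics]
        return frozenset.intersection(*per_metric)


def chart_builder_options(layer: SemanticLayer) -> dict[str, bool]:
    """Decide which (hypothetical) UI controls to show for a given layer."""
    return {"allow_custom_metrics": layer.supports_custom_metrics}


# A governed layer that does not accept ad hoc metrics.
layer = SemanticLayer(
    name="governed_layer",
    supports_custom_metrics=False,
    compatible_dimensions={
        "revenue": frozenset({"region", "order_date"}),
        "active_users": frozenset({"region", "signup_date"}),
    },
)

# Only dimensions shared by both metrics remain selectable.
shared = layer.valid_dimensions(["revenue", "active_users"])
options = chart_builder_options(layer)
```

In this sketch the UI never has to guess: it asks the connection which dimensions remain valid for the current selection and hides controls the layer says it cannot support.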

The goal is to let users choose between the flexibility of Superset's native semantic layer (Datasets, where you can define your own metrics and computed columns) or the governance of an external system (where definitions are managed centrally and users consume curated metrics). Both workflows have their place.

Where OSI fits in

The Open Semantic Interchange initiative addresses a different but complementary problem. While SIP-182 is about how Superset consumes semantic layers, OSI is about how semantic layers exchange definitions with each other (at least for now).

Today, if you want to move metric definitions between tools, you're doing it manually or building custom integrations. OSI aims to change that by establishing a common format that any tool can read and write. Instead of building point-to-point integrations between every semantic layer and every consumer, we get a shared interchange format.
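As a rough illustration of why a shared format helps, the sketch below round-trips one metric between two made-up tools through a neutral JSON document. The OSI specification is still being written, so nothing here reflects its actual schema: `ToolAMetric`, `ToolBMeasure`, the neutral field names, and the helper functions are all assumptions chosen for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolAMetric:
    """How a hypothetical 'tool A' stores a metric."""
    metric_name: str
    sql: str
    dims: list[str]


@dataclass
class ToolBMeasure:
    """How a hypothetical 'tool B' stores the same concept."""
    id: str
    expression: str
    group_by: list[str]


def tool_a_to_neutral(metric: ToolAMetric) -> dict:
    """Export tool A's metric to a made-up neutral format (not the OSI schema)."""
    return {
        "name": metric.metric_name,
        "expression": metric.sql,
        "dimensions": sorted(metric.dims),
    }


def neutral_to_tool_b(doc: dict) -> ToolBMeasure:
    """Import the neutral document into tool B's representation."""
    return ToolBMeasure(
        id=doc["name"],
        expression=doc["expression"],
        group_by=list(doc["dimensions"]),
    )


def converters_needed(n_tools: int) -> tuple[int, int]:
    """Return (point-to-point converters, adapters with a shared format)."""
    # Point-to-point: one converter per ordered pair of distinct tools.
    # Shared format: each tool needs one exporter and one importer.
    return n_tools * (n_tools - 1), 2 * n_tools


revenue = ToolAMetric("revenue", "SUM(amount)", ["region", "order_date"])
payload = json.dumps(tool_a_to_neutral(revenue))   # what travels between tools
imported = neutral_to_tool_b(json.loads(payload))  # tool B's view of the metric
```

Under these assumptions, five tools would need 20 one-directional converters to cover every pair, but only 10 adapters (one reader and one writer each) once they agree on a common format.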

For teams using Preset or Apache Superset, this means a few things:

  1. Defining a metric once will actually mean defining it once. When OSI-compatible tools can exchange semantic definitions reliably, you won't need to recreate the same business logic in multiple places.
  2. Migrations become less painful. Moving between tools today often involves exporting what you can, manually recreating what you can't, and hoping nothing got lost in translation. A standard interchange format makes this process mechanical rather than archaeological.
  3. AI applications get better context. One of the challenges with LLM-powered analytics is that the AI needs to understand what your data actually means. When business definitions are explicit and portable rather than implicit and tool-specific, AI agents can work with them more reliably.

What we're contributing

Preset's participation in OSI builds on our existing work integrating Superset with semantic layers like dbt MetricFlow and Cube. We've learned a lot about where the impedance mismatches are — the differences between exploration workflows and presentation workflows, the limitations around how different systems handle dimensional joins — and that experience is directly relevant to designing an interchange format.

We'll be contributing to the specification workgroups, building open adapters between OSI and Superset, and advocating for the kinds of interactive exploration workflows that Superset excels at.

What happens next

On the Superset side, SIP-182 is currently up for a vote within the community. On the OSI side, the specification work is just getting started. As things progress, we'll share more details about how Superset users can take advantage of these changes.

We're grateful to Snowflake for convening this effort. Interoperability rarely happens by accident; it requires vendors to agree that customers benefit more from open standards than from lock-in. We're glad to be part of that conversation, and we look forward to improving the experience of exploring data with Superset.

Considering which semantic layer to use? Book a call to learn about Preset's current semantic layer support and our plans for the future, and to tell us what you want to see on our roadmap!

