DEEP DIVE

Apache Superset MCP Service: A Technical Deep Dive

Amin Ghadersohi
8 min read
1,584 words

This is the technical reference for Apache Superset's MCP service. Bookmark it. Come back when you're integrating, deploying, or extending.

If you want the narrative first — what MCP is, why we built it, and what it unlocks for analytics — start with Apache Superset MCP Service: A Library-First Architecture with FastMCP.

Four design decisions shape everything that follows:

Library-first architecture — The MCP service imports Superset as a Python library rather than proxying its web API. Independent scaling, clean async support, and unit tests without a full web context.

Preview-first workflow — generate_chart and update_chart default to non-destructive previews. LLM agents explore and iterate freely without polluting the database. Persistence is always explicit.

Zero privilege escalation — The MCP service cannot grant permissions that don't exist in Superset. Every tool call flows through the same DAOs and Security Manager as the web UI — RBAC, RLS, and column restrictions apply identically.

Token budget management — Responses are bounded at 25K tokens by default. The middleware warns at 80% capacity and blocks at the limit, keeping LLM context windows from being overwhelmed.

Everything else in this post is detail. If you have any questions, please reach out.


Framework: FastMCP 3.1+ · Transport: Streamable HTTP · Protocol: JSON-RPC 2.0 · Tools: 20 · Auth: JWT + RBAC · Python: 3.10+


How MCP Connects AI to Superset

The Model Context Protocol is an open standard released by Anthropic that defines how AI applications discover and invoke external tools. Think of it as a USB-C port for AI — a universal interface that lets any compliant LLM client talk to any compliant tool server.

The Superset MCP service exposes the full analytics capability — dashboards, charts, datasets, SQL execution — as 20 discrete tools that LLM agents can discover, understand, and invoke through structured JSON-RPC calls.
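Concretely, here is a hedged sketch of what one of those structured JSON-RPC calls looks like on the wire. The tools/call method comes from the MCP specification; the tool arguments shown are illustrative, not a required shape:

```python
import json

# JSON-RPC 2.0 envelope for invoking the list_charts tool via MCP's
# "tools/call" method. The argument values are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "list_charts",
        "arguments": {"page": 1, "page_size": 10, "search": "revenue"},
    },
}
wire_payload = json.dumps(request)  # sent over Streamable HTTP
```

Any compliant LLM client produces this envelope; the server replies with a matching JSON-RPC result containing the tool's output.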


System Architecture

The Library-First Decision

The most important architectural decision: Superset runs as an imported library, not as a Flask web app. The MCP service creates its own minimal Flask context and imports Superset's DAOs, models, and commands directly.

Original Approach

Embed MCP inside Flask via ASGI middleware.

  • Tight coupling to request lifecycles
  • Framework conflicts between sync and async
  • Scaling required scaling the whole web app
  • Testing required full web context

Current Approach

Standalone FastMCP service that imports Superset as a library.

  • Independent process and scaling
  • Clean async support via FastMCP
  • Unit testing without web framework
  • Separate deployment and monitoring

Full Architecture Diagram

Request Lifecycle

Every tool invocation follows this precise path through the system:

Every request passes through the auth hook, which loads the authenticated user into Flask's `g.user`. All subsequent DAO operations automatically apply that user's RBAC permissions and row-level security rules — the same permissions enforced in the web UI.
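A minimal sketch of that lifecycle step — here `g` stands in for Flask's request-scoped `g`, and `resolve_user_from_jwt` for the real JWT verification; only the shape of the flow is taken from the post:

```python
from types import SimpleNamespace

g = SimpleNamespace(user=None)  # stand-in for flask.g

def resolve_user_from_jwt(token: str) -> SimpleNamespace:
    # Real code verifies the JWT and loads the Superset user row;
    # we fake a Viewer-role user for illustration.
    return SimpleNamespace(username="alice", roles=["Viewer"])

def auth_hook(token: str) -> None:
    # Runs before the tool function; later DAO calls read g.user.
    g.user = resolve_user_from_jwt(token)

def dao_list_dashboards() -> str:
    # Stand-in DAO call: results are scoped to whoever is on g.user.
    return f"dashboards visible to {g.user.username} ({g.user.roles[0]})"

auth_hook("eyJhbGciOi...")  # placeholder token
summary = dao_list_dashboards()
```

Because the user is resolved once per request and every DAO call reads the same identity, the MCP path cannot see more than the web UI would show that user.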

Complete Tool Catalog

| Tool | Domain | Purpose | Key Parameters |
| --- | --- | --- | --- |
| list_charts | Chart | List charts with filtering, search, pagination | page, page_size, filters, search, select_columns |
| get_chart_info | Chart | Get chart details by ID, UUID, or slug | identifier, form_data_key |
| get_chart_data | Chart | Retrieve chart query results as text | identifier, format, limit, use_cache |
| get_chart_preview | Chart | Generate visual PNG preview of chart | identifier, format, width, height |
| generate_chart | Chart | Create chart with optional save | dataset_id, config, save_chart, chart_name, generate_preview |
| update_chart | Chart | Modify existing chart configuration | identifier, config, chart_name, generate_preview |
| update_chart_preview | Chart | Preview changes without saving | form_data_key, dataset_id, config, generate_preview |
| list_dashboards | Dashboard | List dashboards with filtering and search | page, page_size, filters, search |
| get_dashboard_info | Dashboard | Get dashboard details and chart positions | identifier |
| generate_dashboard | Dashboard | Create new dashboard from chart IDs | dashboard_title, chart_ids |
| add_chart_to_existing_dashboard | Dashboard | Add a chart to an existing dashboard | dashboard_id, chart_id |
| list_datasets | Dataset | List datasets with metadata control | page, page_size, filters, search |
| get_dataset_info | Dataset | Get dataset schema, columns, metrics | identifier |
| execute_sql | SQL Lab | Execute SQL with RLS enforcement | database_id, sql, schema_name, limit |
| open_sql_lab_with_context | SQL Lab | Generate SQL Lab URL with pre-filled query | database_connection_id, sql, schema_name, title |
| save_sql_query | SQL Lab | Save a SQL query to Saved Queries | database_id, label, sql, description |
| generate_explore_link | Explore | Generate interactive Explore URL | dataset_id, config |
| get_instance_info | System | Instance metadata, user info, entity counts | none |
| get_schema | System | Discover filterable and sortable columns | model_type |
| health_check | System | Connectivity and version check | none |

Chart Tools — 7 tools

7 tools covering the complete chart lifecycle: discovery, retrieval, creation, modification, and preview.

The generate_chart tool implements a 4-layer validation pipeline:

Schema → Dataset → Runtime → Column Normalization

Key patterns:

  • get_chart_info supports two identifier types: integer ID and UUID string
  • get_chart_data returns CSV-like text optimized for LLM consumption, not raw JSON
  • get_chart_preview uses a WebDriver pool for PNG screenshot generation
  • generate_chart runs a compilation check with row_limit=2 before saving
  • update_chart_preview is non-destructive — original chart remains unchanged
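The compilation check called out above can be sketched as follows — this is an illustration of the idea, not Superset's actual code; `fake_run_query` is a stand-in for the real query path:

```python
# Probe the chart config with row_limit=2 so a broken config fails fast,
# before anything is persisted to the database.
def compile_check(run_query, config: dict) -> bool:
    probe = {**config, "row_limit": 2}  # tiny, cheap probe query
    try:
        run_query(probe)
        return True
    except Exception:
        return False

def fake_run_query(cfg: dict):
    # Stand-in query runner: rejects configs without metrics.
    if not cfg.get("metrics"):
        raise ValueError("chart config has no metrics")
    return [{"value": 1}, {"value": 2}]

ok = compile_check(fake_run_query, {"metrics": ["count"], "row_limit": 10000})
bad = compile_check(fake_run_query, {"row_limit": 10000})
```

The probe costs at most two rows regardless of the user's configured row limit, which is what makes validating before save cheap.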

Dashboard Tools — 4 tools

4 tools for dashboard discovery, inspection, creation, and modification.

  • get_dashboard_info returns chart positions in the grid layout alongside chart metadata
  • generate_dashboard auto-generates layout with proper grid positioning
  • add_chart_to_existing_dashboard auto-calculates the new chart position in the grid

Dataset Tools — 2 tools

2 tools for dataset discovery and schema inspection.

  • list_datasets includes cache control flags: use_cache, refresh_metadata
  • get_dataset_info returns full column metadata including types, descriptions, and available metrics

SQL Lab and Explore Tools — 4 tools

4 tools bridging MCP and Superset's query engine.

  • execute_sql enforces Row-Level Security automatically — the same RLS rules apply as in the web UI
  • execute_sql supports Jinja2 template parameters for dynamic queries
  • open_sql_lab_with_context generates URLs with pre-filled query state
  • save_sql_query persists a query to Saved Queries with an optional label and description
  • generate_explore_link creates interactive Explore URLs for visual chart building
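As an illustration of a templated execute_sql call: database_id, schema_name, sql, and limit come from the tool catalog above, while the `template_params` key is an assumption about how template values might be passed:

```python
# Hypothetical argument shape for execute_sql with a Jinja2-templated
# query. Superset renders {{ ... }} placeholders server-side.
call = {
    "name": "execute_sql",
    "arguments": {
        "database_id": 1,
        "schema_name": "public",
        "sql": "SELECT region, SUM(sales) FROM orders "
               "WHERE country = {{ country }} GROUP BY region",
        "limit": 100,
        "template_params": {"country": "'US'"},  # assumed parameter name
    },
}
```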

Core Framework Patterns

The MCP service is built on four reusable core classes that standardize tool behavior and eliminate boilerplate.

ModelListCore — The Workhorse

The most-used core class, powering all list_* tools. A single instantiation handles:

  • Pagination: 0-based internally, 1-based in requests
  • Filtering: JSON filter objects or structured filter types
  • Full-text search: Across configured columns
  • Column selection: Return only requested columns
  • Sorting: Any sortable column, asc or desc
  • RBAC enforcement: Automatic via the DAO layer
# How list_dashboards uses ModelListCore
tool = ModelListCore(
    dao_class=DashboardDAO,
    output_schema=DashboardInfo,
    filter_type=DashboardFilter,
    list_field_name="result",
    search_columns=["dashboard_title", "slug", "uuid"],
    default_columns=["id", "dashboard_title", "slug", "uuid"],
    sortable_columns=["id", "dashboard_title", "slug", "changed_on"],
    item_serializer=serialize_dashboard_object,
    output_list_schema=DashboardList,
)
return tool.run_tool(page=request.page - 1, page_size=request.page_size)

App Factory Pattern

The MCP service uses a factory function to create FastMCP instances:

def create_mcp_app(
    name: str | None = None,
    instructions: str | None = None,
    auth: Any | None = None,
    middleware: list[Middleware] | None = None,
    config: dict | None = None,
    **kwargs,
) -> FastMCP:
    """Create FastMCP instance with custom auth, middleware, etc."""
    build_kwargs = {
        "name": name,
        "instructions": instructions,
        "auth": auth,
        "middleware": middleware,
        **kwargs,
    }
    # Drop unset options so FastMCP applies its own defaults; merging
    # `config` with Superset's app.config is elided here.
    return FastMCP(**{k: v for k, v in build_kwargs.items() if v is not None})

# Module-level singleton
mcp = create_mcp_app()

The factory enables:

  • Custom auth providers for different environments
  • Middleware composition for rate limiting, size guards, logging
  • Configuration inheritance from Superset's app.config

Flask Singleton — Smart Initialization

The MCP service needs a Flask app context for database access but runs as a separate process. The singleton uses intelligent detection:
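A minimal sketch of the detection idea — the function and stand-in objects below are illustrative, not Superset's actual API:

```python
# Reuse an active Flask app if one exists; otherwise build a minimal one.
def get_flask_app(active_app, create_minimal_app):
    if active_app is not None:   # embedded mode: inside the Superset web app
        return active_app
    return create_minimal_app()  # standalone mode: minimal app of our own

existing = object()  # stands in for a running Superset Flask app
minimal = object()   # stands in for the minimal standalone app

standalone_choice = get_flask_app(None, lambda: minimal)
embedded_choice = get_flask_app(existing, lambda: minimal)
```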

This enables the MCP service to work correctly whether started:

  • Standalone: Creates its own minimal Flask app
  • Embedded: Reuses the existing Superset Flask app

Authentication and Security

Authentication Flow

Authentication Modes

| Mode | Config | Use Case | Security |
| --- | --- | --- | --- |
| Development | MCP_DEV_USERNAME = "admin" | Local development, testing | Single user, no token |
| RS256 + JWKS | MCP_JWKS_URI · MCP_JWT_ISSUER | Production with IdP | Key rotation, standard OAuth2 |
| RS256 + Static Key | MCP_JWT_PUBLIC_KEY | Production with fixed key pair | Asymmetric, no shared secret |
| HS256 | MCP_JWT_SECRET | Simple deployments | Shared secret required |

The MCP service cannot grant permissions that don't exist in Superset. If a user is a Viewer in the web UI, they remain a Viewer through MCP. Row-level security, column restrictions, and RBAC all apply identically — because the same DAOs and Security Manager are used.
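For the RS256 + JWKS mode, a superset_config.py fragment might look like the sketch below. The setting names come from the configuration reference later in this post; the URLs and audience value are placeholders:

```python
# superset_config.py — hypothetical RS256 + JWKS setup
MCP_AUTH_ENABLED = True
MCP_JWT_ALGORITHM = "RS256"
MCP_JWKS_URI = "https://idp.example.com/.well-known/jwks.json"  # placeholder
MCP_JWT_ISSUER = "https://idp.example.com/"                     # placeholder
MCP_JWT_AUDIENCE = "superset-mcp"                               # placeholder
```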

Middleware Architecture

The middleware pipeline processes every request in order. Each layer can inspect, modify, or reject requests before they reach the tool function. There are 7 layers in total.

| Middleware | Purpose | Key Behavior | Default |
| --- | --- | --- | --- |
| RateLimitMiddleware | Prevent abuse | Sliding window algorithm, Redis or in-memory | Disabled |
| ResponseSizeGuardMiddleware | Protect LLM context | Estimates token count, warns at 80%, blocks at 25K | Enabled |
| FieldPermissionsMiddleware | Column-level security | Strips fields the user cannot access | Enabled |
| LoggingMiddleware | Audit trail | Logs tool calls with user, params, duration | Enabled |
| GlobalErrorHandlerMiddleware | Consistent errors | Sanitizes sensitive data from error messages | Enabled |
| PrivateToolMiddleware | Hide internal tools | Blocks tools tagged as private | Enabled |
| StructuredContentStripperMiddleware | FastMCP 3.x compat | Strips structuredContent from responses for Claude.ai bridge compatibility | Enabled |

Error Sanitization

The GlobalErrorHandlerMiddleware automatically redacts sensitive information before it reaches the LLM client:

  • Database connection strings — replaced with generic messages
  • API keys and tokens — redacted from error traces
  • File system paths — stripped to prevent information disclosure
  • IP addresses — removed from error context

All regex patterns are bounded to prevent ReDoS attacks.
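To make the bounding concrete, here is a hedged sketch of what such redaction patterns might look like — these are not the service's actual regexes:

```python
import re

# Note the bounded quantifiers ({1,256} instead of + or *): that is the
# property the post calls out for avoiding ReDoS.
PATTERNS = [
    (re.compile(r"postgresql://[^\s]{1,256}"), "[REDACTED DB URI]"),
    (re.compile(r"(?i)api[_-]?key[=:]\s?[\w-]{1,128}"), "[REDACTED KEY]"),
]

def sanitize(message: str) -> str:
    for pattern, replacement in PATTERNS:
        message = pattern.sub(replacement, message)
    return message

cleaned = sanitize("connect failed: postgresql://admin:hunter2@db:5432/prod")
```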

Token Budget Management

The ResponseSizeGuardMiddleware prevents overwhelming LLM context windows:

# Configuration in superset_config.py
MCP_RESPONSE_SIZE_CONFIG = {
    "enabled": True,
    "token_limit": 25000,
    "warn_threshold_pct": 80,
    "excluded_tools": ["health_check", "get_chart_preview", "generate_explore_link", "open_sql_lab_with_context"],
}

When a response exceeds the limit, the middleware returns a structured error suggesting the client use pagination or column selection to reduce the response size.
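A hedged sketch of the guard's decision logic — the ~4-characters-per-token heuristic below is a common rule of thumb, an assumption here rather than the middleware's documented estimator:

```python
TOKEN_LIMIT = 25_000      # "token_limit" from the config above
WARN_PCT = 80             # "warn_threshold_pct"

def check_response(payload: str) -> str:
    est_tokens = len(payload) // 4        # rough chars-per-token estimate
    if est_tokens >= TOKEN_LIMIT:
        return "block"                    # structured error back to client
    if est_tokens >= TOKEN_LIMIT * WARN_PCT // 100:
        return "warn"                     # warning attached to response
    return "ok"

print(check_response("x" * 1_000))    # ok
print(check_response("x" * 90_000))   # warn  (22,500 estimated tokens)
print(check_response("x" * 120_000))  # block (30,000 estimated tokens)
```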


Preview-First Workflow

A critical design pattern that respects the iterative, conversational nature of AI interactions.

Traditional API expects precise requirements upfront, persists every call to the database, and assumes minimal iteration. LLM conversations are exploratory — preview first, refine, then save, with no database clutter from exploration.

How it works in practice:

# Step 1: Explore (no database write)
response = generate_chart(dataset_id=42, config={...}, save_chart=False)
# Returns: explore URL + form_data_key for iteration

# Step 2: Iterate (still no database write)
response = update_chart_preview(form_data_key=response["form_data_key"], config={...})
# See changes instantly without committing

# Step 3: Save (explicit persistence)
response = generate_chart(dataset_id=42, config={...}, save_chart=True)
# NOW the chart is persisted to the database

Configuration Reference

| Parameter | Default | Description | Required |
| --- | --- | --- | --- |
| MCP_SERVICE_HOST | localhost | Host to bind the MCP server | No |
| MCP_SERVICE_PORT | 5008 | Port for the MCP server | No |
| MCP_DEV_USERNAME | none | Username for development mode auth | Yes in dev |
| MCP_DEBUG | False | Enable debug logging | No |
| MCP_AUTH_ENABLED | False | Enable JWT authentication | No |
| MCP_JWT_ALGORITHM | RS256 | JWT signing algorithm | No |
| MCP_JWKS_URI | none | JWKS endpoint for key rotation | If RS256 |
| MCP_JWT_PUBLIC_KEY | none | Static RSA public key | If RS256 static |
| MCP_JWT_SECRET | none | HMAC shared secret | If HS256 |
| MCP_JWT_ISSUER | none | Expected JWT issuer claim | Production |
| MCP_JWT_AUDIENCE | none | Expected JWT audience claim | Production |
| MCP_PARSE_REQUEST_ENABLED | True | Enable flexible JSON/object input parsing | No |
| MCP_RESPONSE_SIZE_CONFIG | 25K tokens | Token limit for response size guard | No |
| MCP_STORE_CONFIG | disabled | Redis config for multi-pod sessions | Multi-pod |
| MCP_CACHE_CONFIG | disabled | Response caching with TTL | No |

Deployment Guide

Standalone Process

The simplest deployment — run alongside the Superset web server:

# Terminal 1: Superset web server
superset run -p 9001 --with-threads --reload

# Terminal 2: MCP service
superset mcp run --port 5008 --debug

Docker Compose

Add an MCP service to your Docker Compose setup:

# Add to your docker-compose.yml
mcp:
  image: apache/superset:latest
  command: ["superset", "mcp", "run", "--host", "0.0.0.0", "--port", "5008"]
  ports:
    - "5008:5008"
  environment:
    - PYTHONPATH=/app
    - SUPERSET_CONFIG_PATH=/app/superset_config.py

Multi-Pod with Redis

For high-availability deployments, Redis provides shared session state across pods:

# superset_config.py
MCP_STORE_CONFIG = {
    "enabled": True,
    "CACHE_REDIS_URL": "redis://redis:6379/0",
    "event_store_prefix": "mcp_events_",
    "event_store_max_events": 100,
    "event_store_ttl": 3600,
}

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-mcp
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: mcp-service
        image: apache/superset:latest
        command: ["superset", "mcp", "run", "--host", "0.0.0.0", "--port", "5008"]
        ports:
        - containerPort: 5008
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 5008
          initialDelaySeconds: 30

Extension System

Extensions can register custom MCP tools using the plugin decorator system — new tools automatically inherit the authentication, middleware, and security model.

from superset_core.mcp.decorators import tool

@tool(name="my_extension.custom_analytics")
def custom_analytics(dataset_id: int, metric: str) -> dict:
    """Run custom analytics on a dataset.

    Computes specialized metrics not available
    in standard Superset charts.
    """
    from superset.daos.dataset import DatasetDAO

    dataset = DatasetDAO.find_by_id(dataset_id)
    # Custom logic here...
    return {"result": computed_value}

Checklist for adding a new tool:

  1. Create tool file in superset/mcp_service/{module}/tool/{name}.py
  2. Add Apache license header
  3. Decorate with @tool from superset_core.mcp.decorators
  4. Create Pydantic request/response schemas
  5. Use @parse_request for flexible input handling
  6. Add dataset validation for chart-related tools
  7. Import the tool in app.py — tools won't register otherwise
  8. Write unit tests in tests/unit_tests/mcp_service/

Codebase Map

superset/mcp_service/
├── app.py                 # FastMCP factory + tool registration
├── server.py              # Server runner (uvicorn)
├── auth.py                # JWT auth + user resolution
├── mcp_core.py            # Generic core classes
├── middleware.py          # 7 middleware layers
├── flask_singleton.py     # Smart Flask app initialization
├── mcp_config.py          # Default configuration
├── jwt_verifier.py        # JWT token validation
├── caching.py             # Response caching
├── storage.py             # Redis store management
│
├── chart/
│   ├── schemas.py            # Pydantic models
│   ├── chart_utils.py        # Validation helpers
│   ├── preview_utils.py      # Screenshot generation
│   ├── validation/           # 4-layer validation pipeline
│   └── tool/                 # 7 chart tools
│
├── dashboard/
│   ├── schemas.py
│   └── tool/                 # 4 dashboard tools
│
├── dataset/
│   ├── schemas.py
│   └── tool/                 # 2 dataset tools
│
├── sql_lab/
│   ├── schemas.py
│   └── tool/                 # 3 SQL Lab tools
│
├── explore/
│   └── tool/                 # 1 explore tool
│
├── system/
│   ├── schemas.py
│   └── tool/                 # 3 system tools
│
├── common/                   # Shared error schemas
└── utils/                    # Schema parsing, permissions, caching

Performance and Observability

Connection Pooling

Each MCP service process maintains a SQLAlchemy connection pool. For multi-worker deployments, peak connections = workers × (pool_size + max_overflow), since each worker can open up to max_overflow extra connections beyond its base pool.

# superset_config.py
SQLALCHEMY_POOL_SIZE = 5
SQLALCHEMY_MAX_OVERFLOW = 10
SQLALCHEMY_POOL_TIMEOUT = 30
SQLALCHEMY_POOL_RECYCLE = 3600
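A quick worked example of the peak estimate — each worker can open up to max_overflow burst connections beyond its base pool, and the worker count here is an illustrative assumption:

```python
workers = 4                   # e.g. uvicorn workers per pod (assumption)
pool_size = 5                 # SQLALCHEMY_POOL_SIZE
max_overflow = 10             # SQLALCHEMY_MAX_OVERFLOW
peak_connections = workers * (pool_size + max_overflow)
print(peak_connections)       # 60 — size your database's max_connections accordingly
```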

Response Caching

Optional response caching reduces database load for repeated queries:

  • In-memory: Default backend, single-process only
  • Redis: For multi-pod deployments
  • TTL: 300 seconds default, configurable per operation
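If you enable caching, the configuration might look like the sketch below. Only the MCP_CACHE_CONFIG setting name appears in the configuration reference above; the inner keys mirror Flask-Caching conventions and are assumptions:

```python
# superset_config.py — hypothetical Redis-backed response cache
MCP_CACHE_CONFIG = {
    "enabled": True,
    "CACHE_TYPE": "RedisCache",                 # assumed key names
    "CACHE_REDIS_URL": "redis://redis:6379/1",  # placeholder URL
    "CACHE_DEFAULT_TIMEOUT": 300,               # the 300s default TTL noted above
}
```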

Tool Search Optimization

For context-constrained LLM clients, tools can be loaded in deferred categories:

  • core — Always loaded: list_charts, list_dashboards, list_datasets, get_instance_info, health_check
  • discovery — Load on demand: get info, get schema
  • data — Load on demand: previews, chart data
  • mutate — Load on demand: generate, update, execute
  • explore — Load on demand: URL generators

This yields up to 80% token savings in the initial tool discovery phase, reducing the payload from roughly 15–20K tokens down to 4–5K.


Contact our team to set up a demo and see what AI-native analytics looks like in practice.


For the full narrative on architectural decisions, see the companion article: Apache Superset MCP Service: A Library-First Architecture with FastMCP

Running Preset? See Preset MCP: From Open Source to Enterprise — how Preset extends this architecture with multi-tenant isolation, OAuth 2.0, and built-in Chatbot.

Source code: github.com/apache/superset · MCP Protocol: modelcontextprotocol.io · FastMCP: gofastmcp.com
