DATA LITERACY

The Promise of MCP-Powered Data Workflows

Maxime Beauchemin

At Preset, we’re making a big bet on MCP (Model Context Protocol), and it’s not just us. We’ve been hearing the same excitement from LLM providers, partners, and early adopters: this protocol has the potential to reshape how we work with data, tools, and AI. The old model of copy/pasting between browser tabs and chatbot sessions is giving way to something far more powerful: a mesh of interoperable services where LLMs can reason and act across systems on your behalf.

We're writing this post to share our vision, grounded in the kinds of real-world workflows we believe will soon be not just possible, but expected. With MCP, we’re entering a world where AI can operate fluidly across systems, tools, and services. Where users don’t just ask questions—they co-create, debug, explore, and automate in ways that would’ve been unthinkable just a year ago.

For all the hype around RAG (Retrieval-Augmented Generation), let’s be honest: it’s starting to feel like trying to clean a modern data stack with a dirty rag. It works in limited contexts, but it can’t see across tools or take action beyond the scope of a single interface. It’s constrained by what it can retrieve and by where it’s allowed to act. MCP breaks that boundary.

Below are a series of imagined, but entirely plausible, MCP-enabled sessions. They’re grounded in today’s tools and challenges, and they show what’s coming as Preset and other ecosystem players continue to invest in this shared layer of intelligence.
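For readers new to the protocol: MCP is built on JSON-RPC 2.0, so a tool invocation is just a small structured message sent from a client (the LLM host) to a server (e.g. a BI tool's MCP endpoint). A minimal sketch of what such a request looks like (the tool name and arguments are illustrative, not a real Preset or Snowflake API):

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request, the message shape MCP
    defines for invoking a tool on a server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool exposed by a warehouse's MCP server:
payload = make_tool_call(1, "run_semantic_query", {
    "view": "traffic_acquisition_summary",
    "metrics": ["users"],
    "group_by": ["acquisition_channel", "month"],
})
```

The server replies with a JSON-RPC result carrying the tool's output; every session described below reduces to chains of messages like this one.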


1. Business Question → Insights → Dashboard Creation

Actors: Marketing Analyst, Customer-provided LLM, Snowflake MCP, Preset MCP

How has our paid acquisition been performing compared to organic over the past 6 months?
Scanning for relevant semantic models in Snowflake via MCP...
Found the traffic_acquisition_summary semantic view
Querying the view...
Here's your acquisition performance over the past 6 months:
View Table
Key Trends: Paid acquisition is driving 60-63% of traffic with improving efficiency (CPA down 12%). Organic growth is steady at ~20% month-over-month.
Visualize that split over time
Generating line chart visualization...
Here's your acquisition split visualized over time: [Interactive Line Chart showing Paid vs Organic user acquisition trends with clear trend lines and data points for the past 6 months]
Can I dig deeper in Preset?
Creating new chart in Preset...
Configuring interactive filters...
I've created an interactive chart in Preset for deeper exploration:
Open in Preset
You can now filter by campaign, channel, geography, and time periods to explore the data further.
Make a new dashboard with this and add CPA by campaign and CTR by landing page
Assembling dashboard components...
Adding CPA by campaign chart...
Adding CTR by landing page visualization...
Coordinating with Preset MCP...
Done. Refresh your dashboard tab to see the new comprehensive acquisition analytics dashboard with all three visualizations.
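Under the hood, a session like this is the LLM chaining tool calls across two MCP servers, with each tool's output feeding the next call. A rough Python sketch of that orchestration, with stub functions standing in for the Snowflake and Preset MCP tools (every name and return shape here is hypothetical, not a real API):

```python
# Stubs standing in for tools the Snowflake and Preset MCP servers might expose.
def snowflake_find_semantic_views(keywords):
    # In reality: a `tools/call` against the Snowflake MCP server.
    return ["traffic_acquisition_summary"]

def snowflake_query_view(view, dimensions, months):
    # Returns one row per month with the paid/organic split.
    return [{"month": m, "paid": 0.6, "organic": 0.4} for m in range(months)]

def preset_create_chart(data, chart_type):
    return {"chart_id": "chart-123", "url": "https://example.invalid/chart-123"}

def preset_create_dashboard(chart_ids, title):
    return {"dashboard_id": "dash-456"}

# The chain the transcript above walks through:
views = snowflake_find_semantic_views(["acquisition", "traffic"])
rows = snowflake_query_view(views[0], ["channel"], months=6)
chart = preset_create_chart(rows, chart_type="line")
dashboard = preset_create_dashboard([chart["chart_id"]], "Acquisition Analytics")
```

The point is that no step requires a human to copy results between tabs; the LLM routes each tool's output into the next call on its own.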

2. Root Cause Analysis → Visualization

Continuing the session from the previous example:
Help me dig into why this metric spiked in Q2?
Breaking down metric by region, channel, product...
Analyzing data patterns...
Identifying spike drivers...
Found the root cause! The new 'Professional Services' SKU explains 90% of the Q2 spike. Would you like to see this in a waterfall chart or breakdown bar chart?
Waterfall
Building waterfall chart in Superset...
Embedding visualization in chat...
Here's your waterfall chart showing the Q2 metric breakdown:
View Waterfall Chart
The Professional Services SKU contributed +$2.4M to the metric increase, with other factors contributing minimal changes.
Explore in Superset
Who's selling PS? Can you break it down by sales rep and by customer?
Querying sales data...
Breaking down by sales rep and customer...
Here's the Professional Services breakdown by sales rep and customer:
View Data Table
Top performers: Sarah Chen ($680K), Mike Rodriguez ($520K), Jennifer Park ($445K). Enterprise customers driving 70% of PS revenue.
Open in Preset

3. Semantic Layer Evolution → Collaboration

Actors: Business Analyst, LLM, Preset MCP, Semantic Owner via Slack

Can I break this down by age group?
Checking current semantic model for age_group...
Age group not present in current view
Inspecting upstream tables...
Found customer_details.age_group!
I found that age_group isn't in the current Cube semantic model, but it's available in customer_details.age_group which can easily be joined. I've reverse-engineered the semantic view and built an extended SQL version as a workaround:
View Extended Query
Run Analysis with Age Groups
How do we make this permanent? Can we contribute it back to the semantic view definition?
Analyzing semantic view modification options...
I can help you make this permanent! Here are two paths:

Help me modify the view definition myself
Ping the owner for me

Which approach would you prefer?
Please ping the owner on Slack
Fetching owner from Preset's metadata MCP...
Found owner: @sarah.kim
Crafting Slack DM...
✅ Slack message sent to @sarah.kim:
"Hey @sarah.kim, Alex Chen needs age_group added to customer_perf_view. Context: We found customer_details.age_group can be easily joined to enable age-based breakdowns. I've already built a working extended query as a proof-of-concept. Can you help add this to the semantic model? Happy to share the SQL if useful!"
I've included context about the analysis and offered to share the working SQL. You should hear back soon!

4. Proactive Assistance from the LLM

Actors: Data Analyst, LLM, Preset MCP

How's my dashboard looking?
Dashboard loaded, analyzing metrics...
Detecting anomalies in recent data...
Noticed a 22% drop in conversion rate last week—want me to investigate?
Sure
Opening exploration tab...
Drilling down by segment...
Analyzing funnel performance...
Flagging related anomalies...
Found the root causes! Here's what I discovered:
Mobile segment: 31% conversion drop
Session duration: Down 18% average
Checkout abandonment: Up 45%
View Drill-Down Analysis
Suggested next steps:
• Add device type and browser filters
• Include page load time metrics
• Monitor payment gateway errors
Apply Suggested Filters
Create Alert for These Metrics

5. From SQL to Workflow

Actors: SQL-savvy Analyst, LLM, Preset MCP, dbt MCP

Hey, I have this SQL that I've been working on. Can you check if there are any dbt models with similar logic?
SELECT
  customer_id,
  order_date,
  total_amount,
  payment_method,
  discount_applied,
  customer_lifetime_value,
  acquisition_channel,
  geographic_region
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.id
WHERE order_date >= '2024-01-01'
Analyzing SQL logic...
Scanning dbt models for similar patterns...
Found match: customer_orders_enhanced model
This matches our existing customer_orders_enhanced dbt model, except for a few fields you added. Want to augment the dbt model to add in:
• customer_lifetime_value
• acquisition_channel
• geographic_region
This would make the enhanced logic available to everyone!
Yes
Creating model enhancement...
Generating SQL modifications...
Opening PR in GitHub repo via dbt Cloud MCP...
✅ Pull request created successfully!
View PR on GitHub
Review Changes
Want me to create a temporary model in staging to use until this gets merged?
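The "scan dbt models for similar patterns" step above need not be magic; even a crude first pass can compare the columns a query selects against each model's declared columns and report what's missing. A toy sketch of that heuristic (the model catalog and column sets are invented for illustration):

```python
def selected_columns(sql: str) -> set[str]:
    """Crude extraction of column names from a simple SELECT ... FROM query."""
    select_clause = sql.split("SELECT", 1)[1].split("FROM", 1)[0]
    return {col.strip() for col in select_clause.split(",")}

def best_matching_model(sql: str, catalog: dict[str, set[str]]):
    """Rank models by column overlap with the query; return the best match
    and the columns the query selects that the model lacks."""
    cols = selected_columns(sql)
    name = max(catalog, key=lambda m: len(cols & catalog[m]))
    return name, sorted(cols - catalog[name])

catalog = {  # hypothetical dbt models and the columns they expose
    "customer_orders_enhanced": {"customer_id", "order_date", "total_amount",
                                 "payment_method", "discount_applied"},
    "orders_daily": {"order_date", "total_amount"},
}
sql = ("SELECT customer_id, order_date, total_amount, payment_method, "
       "discount_applied, customer_lifetime_value FROM orders")
model, missing = best_matching_model(sql, catalog)
```

A production version would parse the SQL properly and use the dbt manifest, but the shape of the answer is the same: "this model already covers most of your query; here's what to contribute back."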

6. Embedded LLM → Preset Extension

Actors: Embedded App User, Host LLM, Preset MCP

Who are our highest-value customers in the Midwest over 65?
Querying Superset charts and tables via Preset MCP...
Here's a ranked table of your highest-value customers in the Midwest over 65:
View Ranked Table
Export CSV
Can you notify our account managers?
Integrating with Slack MCP...
DMing relevant team members...
✅ Account managers have been notified via Slack DM with the customer list and analysis.

7. Audit Trail + Governance

Actors: Data Steward, Admin LLM, Preset MCP

Who modified the customer retention dashboard last?
Auditing dashboard history via Preset MCP...
Changes made by @jane.doe on Aug 12, reverted by @john.smith on Aug 15
What changed?
Analyzing differences between versions...
Here's the diff of SQL and visual config changes:
View SQL Diff
View Visual Config Diff

8. Multi-Agent Collaboration

Actors: Product Analyst, LLM, Preset MCP, GitHub MCP, Slack MCP

I want to automate a weekly report of trial signups
Identifying trial signup metrics in Superset...
Creating chart...
Building dashboard...
Setting up report schedule...
Creating PR via GitHub MCP to document report definition...
Using Slack MCP to notify team...
✅ Weekly trial signup report is now automated! Used `/remind` in Slack: "Trial report going live, FYI."
View Dashboard
View Schedule Settings

9. Cross-Stack Intelligence with Expanded MCP Mesh

Actors: LLM, Preset MCP, DataHub MCP, dbt MCP, Airflow MCP, Governance MCP

Can I trust this chart showing a huge spike in churn rate?
Checking Superset chart lineage via Preset MCP...
Tracing back to dbt model...
Querying DataHub MCP for freshness, owners, and upstream sources...
Detecting stale data from delayed Airflow DAG...
Using Airflow MCP to confirm job failure...
Consulting Governance MCP for PII compliance...
This chart is based on stale data due to an ETL delay. Source tables are owned by @data.engineer. DAG `customer_churn_daily` failed due to a missing partition.
View Complete Provenance
View Trustworthiness Summary
Can you ping the owner and file a Jira ticket?
Reaching out via Slack MCP...
Opening ticket using PM tool MCP...
✅ Notified @data.engineer on Slack and created Jira ticket for the missing partition issue.

10. End-to-End Debugging Across GitHub, dbt Cloud, and Superset

Actors: Data Analyst, LLM, Preset MCP, dbt MCP, GitHub MCP, Airflow MCP

Why isn't this dashboard updating anymore?
Checking Superset dashboard metadata via Preset MCP...
Detecting stale data...
Tracing lineage to dbt Cloud job via dbt MCP...
Found failed run, digging into logs...
Identified model dependency failure...
Using GitHub MCP to locate model code in dbt repo...
Found recent change, tagging PR author...
This join condition looks off—want to fix it now?
View Failed Model
View PR Draft
Yes, suggest fix and tag the author
Pushing suggested fix...
Pinging author with context...
✅ Created PR with fix and messaged: "Hey @teammate, looks like this broke the dashboard—mind reviewing this patch?"

Future Possibilities

  • Multi-user sessions: Teams co-pilot with an LLM in shared sessions
  • Audit bots: LLM agents watching for anomalous changes to critical dashboards
  • Compliance watchdogs: Monitoring for PII exposure via semantic metadata + lineage tracking
  • MCP search portal: Global, natural language search across all charts, views, dashboards, and SQL snippets across the org

This is just the beginning. MCP turns chat-based AI from a reactive assistant into a proactive collaborator that understands your tools and workflows. We're just getting started—and with the right mesh of services coming online, a new era of fluid, AI-augmented work is finally within reach.

Our Role: Building the Preset MCP

At Preset, we see our role in the MCP ecosystem as building a robust, high-fidelity interface that lets LLMs do nearly everything a user can do in Preset—securely, auditably, and with full governance. Our goal is to make Preset not just LLM-compatible, but LLM-native.

We’re designing the Preset MCP to:

  • Expose rich functionality across charting, dashboards, metadata, alerts, and lineage—letting LLMs build, update, and navigate Superset objects the way a power user would.
  • Respect user context and security: every interaction is scoped to the user's identity and permissions, with fine-grained audit logs and safety rails in place.
  • Interoperate across the stack: by using global identifiers (e.g. UUIDs) and shared metadata conventions, we enable LLMs to fluidly reference the same dashboards, metrics, and models across Preset, dbt, DataHub, and more.
  • Integrate with your LLM(s) of choice: whether you’re self-hosting, using a cloud provider, or routing via a broker, our vision includes support for plug-and-play LLM connectivity with optional access controls and observability.
  • Embed AI in context: our in-product chatbot knows what you're doing—it sees your dashboard, your filters, your recent error messages, your history—and it can guide, automate, or escalate with full awareness of your workflow.

As governance-focused MCPs emerge (e.g. for access control, PII scanning, policy enforcement), we aim to be a clean endpoint they can interface with—logging actions, enforcing boundaries, and participating in organization-wide decision logic.

Note that with a robust MCP in place, building an in-product chatbot becomes almost trivial. The chatbot can be pre-configured and context-rich—aware of the user’s current view, filters, active SQL query, or even a recent error message. If you're in SQL Lab debugging a failed query, it already knows. If you're exploring a dataset and ask a question, it knows which one.
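Concretely, "it already knows" can be as simple as the host application handing the assistant a snapshot of the user's current state with every prompt. A sketch of what such a context payload might look like (all field names here are illustrative, not an actual Preset payload):

```python
def build_chat_context(user: dict, page: dict) -> dict:
    """Assemble the context an embedded assistant could receive alongside
    each prompt. Field names are made up for illustration."""
    return {
        "user": {"id": user["id"], "roles": user["roles"]},  # scopes permissions
        "view": page["view"],             # e.g. "sql_lab" or "dashboard"
        "active_query": page.get("sql"),  # the SQL being debugged, if any
        "last_error": page.get("error"),  # surfaced so the bot can help
        "filters": page.get("filters", {}),
    }

# A user debugging a failed query in SQL Lab:
ctx = build_chat_context(
    {"id": "u-42", "roles": ["analyst"]},
    {"view": "sql_lab", "sql": "SELECT 1", "error": "syntax error at ..."},
)
```

With this in hand, the assistant never has to ask "which dashboard?" or "what query?"; the answer rides along with every message.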

There are tradeoffs: the in-product bot may not be wired into your broader mesh of MCP services, or it might use a Preset-provided model instead of your org’s preferred one. But that’s the point—it’s about giving you options. Organizations can configure which LLMs are used, which users can access which assistants, and which MCP services are exposed. Whether it’s embedded, external, or both, the foundation is the same.

We’re not trying to own the LLM. We’re trying to be the best possible service an LLM could call when it needs to understand or act within Preset.

Until All MCPs Unite

This post paints a vision where all your tools speak MCP, and your LLMs have seamless access across the full stack. That’s the future—and we’re moving toward it. But even before every system joins the mesh, there’s a ton of value in wiring up just one or two MCPs to your favorite LLM.

Even with limited context, the workflows unlocked by a single, well-integrated MCP are game-changing. In our experiments with superset-mcp, we’ve already seen how a focused integration can unlock rich, tool-contained use cases that feel like magic compared to the old copy/paste world.

Here are just a few powerful examples:

  • Conversational charting: generate charts from scratch using natural language—no menus, no clicks.
  • Dashboard crafting & augmentation: ask to add a new metric, breakdown, or filter, and the LLM updates your dashboard on the fly.
  • Workflow assistance: go straight from a prompt to the right data exploration, a prefilled SQL Lab session, or alerts and reports set up without touching a form.
  • Anomaly detection & root cause analysis: LLMs can spot issues and guide users through exploration to uncover what happened and why.
  • Help & explanations: ask how something works or what you’re looking at, and get a clear, contextual answer.
  • Semantic search: find the chart, dashboard, or dataset you saw last week—even if you forgot the name. Ask open-ended questions like “Do we have any reliable dashboards around customer satisfaction?” and let the LLM surface the most relevant content with a reliability assessment based on the related metadata.
  • Onboarding assistant: walk new users through your workspace, highlight what matters, and help them self-serve insights from day one.

Even one good MCP turns your data app into something far more powerful: a programmable, assistive surface. And the more MCPs join the mesh, the more that surface becomes a canvas for true cross-system reasoning and automation.

At Preset, we’re taking MCP seriously, and we’re already in the prototyping phase with many of the above features. If you’re excited about the above, or have other ideas you want to share, reach out to our team and let us know!
