Stop Forking Around - The Hidden Dangers of “Fork Drift” in Open Source Adoption

Evan Rusackas

When organizations discover an open-source project that almost meets their needs, they often face a critical decision: contribute improvements upstream or fork the codebase and customize it privately. While forking might seem expedient—offering immediate control and customization—it creates a ticking technical debt bomb that we've seen explode time and again.

Avoiding a Terrible Forking Situation

"Fork drift" occurs when your customized codebase diverges so significantly from the upstream project that rebasing becomes increasingly difficult or practically impossible. As time passes, the original project evolves, file structures change, dependencies update, and architectural decisions shift. Meanwhile, your fork accumulates its own parallel history of changes.

The consequences can be severe:

  • Missing critical security patches that protect your data and users
  • Inability to adopt new features that could provide competitive advantages
  • Growing technical debt as workarounds accumulate
  • Increased maintenance burden on your development team, as the cost of cherry-picking and rebasing compounds (possibly exponentially) with each change
  • Difficulty recruiting developers familiar with your customized version
  • Orphaned functionality when upstream deprecates components you rely on
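
One rough way to put a number on the drift described above is to count how many commits each side has that the other lacks. The sketch below is illustrative only: it assumes the upstream project is available as a Git remote named `upstream` with a `master` branch, and the function names are hypothetical. It relies on `git rev-list --left-right --count`, which prints two tab-separated counts for a symmetric difference between two refs.

```python
import subprocess


def parse_left_right_count(output: str) -> tuple[int, int]:
    """Parse the two whitespace-separated counts printed by
    `git rev-list --left-right --count A...B`.

    The first number counts commits reachable only from A,
    the second counts commits reachable only from B.
    """
    left, right = output.split()
    return int(left), int(right)


def fork_divergence(
    upstream_ref: str = "upstream/master",  # assumed remote/branch name
    fork_ref: str = "HEAD",
    repo_dir: str = ".",
) -> tuple[int, int]:
    """Return (behind, ahead) for the fork relative to upstream.

    behind: upstream commits the fork has not incorporated.
    ahead:  fork-only commits that would need rebasing or upstreaming.
    """
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count",
         f"{upstream_ref}...{fork_ref}"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_left_right_count(result.stdout)
```

Tracking both numbers over time gives an early warning: a steadily growing "behind" count means security fixes and features are piling up unmerged, while a growing "ahead" count means more private changes to carry through every future rebase.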

A Common Pattern with Apache Superset

We've witnessed this pattern repeatedly with Apache Superset implementations. Organizations fork the codebase, make extensive modifications to suit their specific requirements, and initially celebrate their customized analytics platform. Fast forward 12-18 months, and these same teams find themselves unable to incorporate important security fixes or leverage exciting new features and improvements from the community.

What begins as a shortcut to customization becomes a long-term maintenance nightmare.

Breaking the Cycle: The Open Source Way

At Preset, we're deeply invested in the quality and future of the open-source codebase. In fact, we are perhaps the leading stewards of the Apache Superset project, having grown or hired more Superset PMC members and Committers than any other organization in the project's history. We want people to be successful with the codebase, benefit from updates, and bring features back to the core codebase, so everyone in the community can benefit.

Instead of forking, consider these alternatives:

  1. Contribute enhancements upstream: Work with the community to incorporate your improvements directly into the main project, ensuring everyone benefits while maintaining upgrade compatibility.
  2. Develop modular extensions: Create plugins or extensions that can be maintained separately without modifying core code. This isn’t yet easy to do, but there’s a plan in place, so you can expect more news on how to do this soon.
  3. Prioritize configuration over customization: Explore whether your requirements can be met through configuration options rather than code changes. Superset already offers many configuration options, and we can expand and enhance these options to make the platform more flexible.
  4. Participate in the roadmap: Engage with the community to influence future development priorities. If you want to propose significant changes, you can contribute through the SIP (Superset Improvement Proposal) process, and subscribe to the community calendar to join events like Superset Town Hall and other operational model working groups.
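
To make the configuration-first option (item 3) concrete: Superset reads overrides from a `superset_config.py` module, so many adjustments can live in a file you own rather than in patched core code. The fragment below is a minimal sketch. The setting names are ones Superset's default configuration defines, but the values and the application name are placeholder assumptions, and which feature flags are available depends on your Superset version, so check the configuration reference for the release you run.

```python
# superset_config.py
# Illustrative overrides only; every value here is an assumption,
# not a recommendation.

# Default row limit applied to chart queries.
ROW_LIMIT = 10000

# Application name shown in the UI (hypothetical company name).
APP_NAME = "Acme Analytics"

# Optional features are toggled via feature flags. Flag availability
# varies by release; ALERT_REPORTS is used here as an example.
FEATURE_FLAGS = {
    "ALERT_REPORTS": True,
}
```

Because this file sits outside the Superset source tree, upgrading to a new release does not require rebasing it; at most, a renamed or retired setting needs a one-line change.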

Rescuing Stranded Forks

If you're already experiencing fork drift with Superset, you're not alone. We frequently hear from organizations in this exact predicament, and we're here to help. Preset offers professional services specifically designed to help teams:

  • Conduct comprehensive "fork audits" to assess divergence
  • Develop strategies to reintegrate valuable customizations into the upstream project
  • Create migration paths to bring your analytics capabilities back into the main ecosystem
  • Contribute your innovations back to the community (when appropriate)
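
For teams that want a first look at their own divergence, a simple starting point is to see where in the tree the fork's changes concentrate. The sketch below is not a description of Preset's audit process; it is an illustrative example with hypothetical function names, and it assumes the upstream project is reachable as `upstream/master`. It uses the three-dot form of `git diff --name-only`, which lists files changed on the fork side since the common ancestor.

```python
import subprocess
from collections import Counter


def changed_paths(
    upstream_ref: str = "upstream/master",  # assumed remote/branch name
    fork_ref: str = "HEAD",
    repo_dir: str = ".",
) -> list[str]:
    """List files the fork has changed since it diverged from upstream."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{upstream_ref}...{fork_ref}"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def summarize_changed_paths(paths: list[str], depth: int = 2) -> Counter:
    """Count changed files per directory prefix, up to `depth` levels.

    Files at the repository root are grouped under ".".
    """
    counts: Counter = Counter()
    for path in paths:
        directories = path.split("/")[:-1][:depth]
        counts["/".join(directories) or "."] += 1
    return counts
```

Changes clustered in a few directories often point to customizations that could be proposed upstream as a unit or moved into configuration, while changes scattered across the tree suggest a harder reintegration.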

With Preset, you can keep building BI and product features while we manage Superset. That lets you concentrate on driving your business forward rather than maintaining an increasingly archaic fork, while enjoying Preset’s numerous differentiators.

Conclusion

While forking provides immediate gratification, the long-term costs often outweigh the benefits. By collaborating with open-source communities rather than diverging from them, organizations can enjoy both customization and sustainability.

If you're struggling with a Superset fork that's drifted too far from the source, reach out to us. As the primary stewards of Apache Superset, we're committed to helping you find a path forward that preserves your innovations while reconnecting you to the benefits of the broader ecosystem.

Together, we can turn a "forking bad idea" into an opportunity for community contribution and sustainable analytics.

If you want to get back on track with the latest features in Superset (or leverage Preset’s differentiators), we’re happy to help you get your features back into the core codebase. Set up a call to learn more!
