2025-07-10

Open source data lakehouse architecture with Spark and Kyuubi – an engineering deep dive

A detailed exploration of an open source data lakehouse architecture and how we implement it at Canonical


About the webinar

As data volumes grow rapidly, data lakehouses have become critical for businesses that want to bring analytics and storage together. However, many lakehouse platforms are closed source, which leads to vendor lock-in and reduced flexibility, and can inhibit customization as well as integration with other tools.

Join us for an engineering deep dive exploring how Apache Spark and Apache Kyuubi work together to power open source data lakehouses. Discover how Spark’s scalable processing engine and Kyuubi’s user-friendly SQL gateway enable efficient, secure, and high-performance analytics on unified data sets. We’ll dig deeper into how this combination simplifies big data storage, interactive analytics, and ETL, all through a single, streamlined open source lakehouse architecture.

In this webinar, you will learn more about:

  • Apache Spark and Apache Kyuubi
  • Practical data lakehouse implementations
  • Our reference architecture for a truly open source data lakehouse

We will also hold a live Q&A where you can ask our experts questions about the architecture.

Interested in exploring further resources?

The CTO’s guide to Big Data and AI solutions: learn how to build a smarter enterprise with an integrated open source stack.