Hello, and welcome to the April 2025 ClickHouse newsletter!
This month, we bring you CloudQuery's compelling experience report after 6 months with ClickHouse, unveil the powerful new query condition cache in 25.3, reflect on our year of Rust development, announce our strategic acquisition of HyperDX, and more!
Featured community member: Julian LaNeve
This month's featured community member is Julian LaNeve, CTO at Astronomer.
Before stepping into the CTO role in November 2023, Julian worked on the product team, focusing on developer experience, data observability, and open-source initiatives. Notably, he led the launch of Astronomer's Cloud IDE - a notebook tool designed for writing data pipelines.
Julian recently wrote a blog post describing why Astronomer chose ClickHouse Cloud to power its new data observability platform, Astro Observe. ClickHouse's ability to handle billions of Airflow workflow events with fast query performance and minimal maintenance requirements made it their database of choice. Julian also presented on the same topic at the ClickHouse New York November 2024 meetup.
Upcoming events
With just over a month until Open House, the ClickHouse User Conference in San Francisco on May 29, we've started announcing our first speakers.
Kevin Weil (CPO at OpenAI) and Martin Casado (Partner at Andreessen Horowitz) will join Aaron Katz (CEO at ClickHouse) for a fireside chat about the future of data infrastructure for AI at scale.
Lukas Biewald (Founder and CEO at Weights & Biases) will also join us to discuss the future of AI and the role high-performance databases like ClickHouse play in powering next-gen AI apps.
Global events
- v25.4 Community Call - April 22
Free training
- ClickHouse Fundamentals - Virtual - April 22
- In-Person BigQuery to ClickHouse - Jakarta - April 22
- Using ClickHouse for Observability - May 7
- ClickHouse Fundamentals - Virtual - May 13
- In-Person ClickHouse Developer Fast Track - Munich - May 14
- ClickHouse Developer Training - Virtual - May 21
Events in AMER
- ClickHouse Meetup in Denver - April 23
Events in EMEA
- AWS Summit 2025, London - April 30
- AWS Summit 2025, Poland - May 6
- ClickHouse Meetup in London - May 13
- ClickHouse Happy Hour Munich - May 14
- ClickHouse Istanbul Meetup - May 14
Events in APAC
- ClickHouse Jakarta Meetup - AI Night! - April 22
- AWS Summit Bengaluru - May 7-8
- AWS Summit Hong Kong - May 8
- Data Engineering Summit, Bengaluru - May 15-16
25.3 release
My favorite feature in the 25.3 release is the query condition cache, which caches the ranges of data that match a WHERE clause. This is useful in dashboarding or observability use cases where multiple queries have a different overall shape but the same filtering condition.
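To make the idea concrete, here is a toy Python model of such a cache. It is only a sketch and not ClickHouse's implementation: the `ConditionCache` class, the granule size of 4, and the sample rows are all made up for illustration. It remembers, per filter and per block of rows, whether anything matched, so a second query with a different shape but the same filter can skip blocks known to be empty.

```python
from typing import Callable, Dict, List, Tuple

GRANULE_SIZE = 4  # toy value chosen for this sketch


class ConditionCache:
    """Toy cache: (filter text, granule index) -> did any row match?"""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[str, int], bool] = {}
        self.granules_scanned = 0

    def matching_rows(
        self,
        rows: List[dict],
        filter_text: str,
        predicate: Callable[[dict], bool],
    ) -> List[dict]:
        result: List[dict] = []
        for start in range(0, len(rows), GRANULE_SIZE):
            key = (filter_text, start // GRANULE_SIZE)
            if self.entries.get(key) is False:
                continue  # known to hold no matching rows: skip the scan
            self.granules_scanned += 1
            hits = [r for r in rows[start:start + GRANULE_SIZE] if predicate(r)]
            self.entries[key] = bool(hits)
            result.extend(hits)
        return result


# Hypothetical log table: 12 rows, two of them errors (both in granule 1).
rows = [{"level": "info", "ms": i} for i in range(12)]
rows[5]["level"] = "error"
rows[6]["level"] = "error"

cache = ConditionCache()


def is_error(row: dict) -> bool:
    return row["level"] == "error"


# Query 1: count the errors. Nothing is cached yet, so every granule is scanned.
error_count = len(cache.matching_rows(rows, "level = 'error'", is_error))
scanned_first = cache.granules_scanned

# Query 2: different shape (max instead of count), same filter.
slowest = max(r["ms"] for r in cache.matching_rows(rows, "level = 'error'", is_error))
scanned_second = cache.granules_scanned - scanned_first

print(error_count, scanned_first, slowest, scanned_second)
```

In this toy run the first query scans all three granules, while the second scans only the one that contained matches.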
This release adds read support for the AWS Glue and Unity catalogs, new array functions, and automatic parallelization for external data. Finally, the JSON data type is now production-ready!
Six Months with ClickHouse at CloudQuery (The Good, The Bad, and the Unexpected)
Herman Schaaf and Joe Karlsson shared their six-month experience using ClickHouse as their database backend for cloud asset inventory.
Their key insights include understanding when to use JOINs versus dictionaries for reference data, the critical importance of properly designing sorting keys for query performance, limitations of Materialized Views that led them to create custom snapshot tables, and ClickHouse's surprising versatility for logging and observability data.
Despite some challenges, CloudQuery found that ClickHouse delivered on its promises of speed and scalability for its cloud governance platform.
A Year of Rust in ClickHouse
Alexey Milovidov, ClickHouse's CTO, has written a blog post about integrating Rust into the ClickHouse codebase.
The initiative began with small components like BLAKE3 and PRQL (with contributions from community members) before implementing more practical features such as Delta Lake support.
Throughout this journey, numerous technical challenges have been tackled, including build system integration, sanitizer compatibility, cross-compilation problems, and symbol size bloat.
Scalable EDR Advanced Agent Analytics with ClickHouse
Huntress has implemented ClickHouse to enhance its EDR analytics capabilities. Using ClickHouse has allowed them to process billions of data points daily across millions of endpoints while maintaining rapid query performance.
The implementation leverages AggregatingMergeTree and Materialized Views to monitor agent health and stability efficiently.
ClickHouse acquires HyperDX: The future of open-source observability
ClickHouse has acquired HyperDX, a fully open-source observability platform built on ClickHouse.
This acquisition strengthens our ability to provide developers and enterprises with efficient and scalable observability solutions. By combining HyperDX's UI and session replay capabilities with ClickHouse's database performance, we're enhancing our open-source observability offerings.
Make Before Break - Faster Scaling Mechanics for ClickHouse Cloud
Jayme Bird and Manish Gill wrote a blog post about the "Make Before Break" (MBB) scaling approach introduced in ClickHouse Cloud to address limitations in the previous scaling method.
Initially, ClickHouse Cloud used a single StatefulSet to manage all server replicas, requiring rolling restarts that could take hours during scaling. The MBB approach creates new pods with desired resources before removing old ones, eliminating downtime during scaling operations.
This required developing a MultiSTS architecture where each pod is managed by its own StatefulSet and custom Kubernetes controllers to orchestrate migrations. Despite technical challenges, the team successfully migrated their entire fleet to this new architecture, significantly improving scaling times and reducing customer disruptions.
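The difference in ordering between the two approaches can be sketched with a small Python simulation. This is an illustration only, not ClickHouse Cloud's controller logic: the function names are hypothetical, and replacing one replica at a time is an assumption made to keep the example short. Each function returns the number of serving replicas after every step.

```python
from typing import List


def rolling_restart(replicas: int) -> List[int]:
    """Break, then make: each old pod stops before its resized replacement is ready."""
    serving = replicas
    timeline = [serving]
    for _ in range(replicas):
        serving -= 1  # old pod is terminated
        timeline.append(serving)
        serving += 1  # resized pod comes back
        timeline.append(serving)
    return timeline


def make_before_break(replicas: int) -> List[int]:
    """Make, then break: the new pod is ready before the old one is removed."""
    serving = replicas
    timeline = [serving]
    for _ in range(replicas):
        serving += 1  # new pod with the desired resources becomes ready
        timeline.append(serving)
        serving -= 1  # old pod is removed
        timeline.append(serving)
    return timeline


old_way = rolling_restart(3)
new_way = make_before_break(3)
print("rolling restart:  ", old_way, "min serving =", min(old_way))
print("make before break:", new_way, "min serving =", min(new_way))
```

In this sketch the rolling restart dips below the starting replica count at every step, while make-before-break never does, at the cost of temporarily running one extra pod.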
Quick reads
- Hossein Kohzadi has written a blog post explaining how to use ClickHouse in .NET applications.
- Roman Ianvarev introduces QuerySight, a command-line tool that analyzes ClickHouse query logs and provides intelligent optimization recommendations for your dbt project.
- Raj Kantaria briefly introduces Anthropic’s Model Context Protocol, using the ClickHouse MCP Server as an example.
- Tom Schreiber walks us through accelerating ClickHouse queries on JSON data with the Bluesky dataset.
- Keshav Agrawal builds a real-time data pipeline with Go, Kafka, ClickHouse, and Apache Superset.
Post of the month
My favorite post this month was by Andi Pangeran, who’s been trying out ClickHouse’s support for reading from Delta Lake catalogs.