Selecting an Insert Strategy

Efficient data ingestion forms the basis of high-performance ClickHouse deployments. Selecting the right insert strategy can dramatically impact throughput, cost, and reliability. This section outlines best practices, tradeoffs, and configuration options to help you make the right decision for your workload.

note

The following assumes you are pushing data to ClickHouse via a client. If you are pulling data into ClickHouse, e.g. using built-in table functions such as s3 and gcs, we recommend our guide "Optimizing for S3 Insert and Read Performance".

Synchronous inserts by default

By default, inserts into ClickHouse are synchronous. Each insert query immediately creates a storage part on disk, including metadata and indexes.

Use synchronous inserts if you can batch the data client-side

If not, see Asynchronous inserts below.

We briefly review ClickHouse's MergeTree insert mechanics below:

Client-side steps

For optimal performance, data must be ① batched, making batch size the first decision.

ClickHouse stores inserted data on disk, ordered by the table's primary key column(s). The second decision is whether to ② pre-sort the data before transmission to the server. If a batch arrives pre-sorted by primary key column(s), ClickHouse can skip the ⑩ sorting step, speeding up ingestion.

If the data to be ingested has no predefined format, the key decision is ③ choosing a format. ClickHouse supports inserting data in over 70 formats. However, when using the ClickHouse command-line client or programming language clients, this choice is often handled automatically. If needed, this automatic selection can also be overridden explicitly.

The next major decision is ④ whether to compress data before transmission to the ClickHouse server. Compression reduces transfer size and improves network efficiency, leading to faster data transfers and lower bandwidth usage, especially for large datasets.

The data is ⑤ transmitted to a ClickHouse network interface—either the native or HTTP interface (which we compare later in this guide).

Server-side steps

After ⑥ receiving the data, ClickHouse ⑦ decompresses it if compression was used, then ⑧ parses it from the originally sent format.

Using the values from that formatted data and the target table's DDL statement, ClickHouse ⑨ builds an in-memory block in the MergeTree format, ⑩ sorts rows by the primary key columns if they are not already pre-sorted, ⑪ creates a sparse primary index, ⑫ applies per-column compression, and ⑬ writes the data as a new ⑭ data part to disk.
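
Each synchronous insert therefore results in at least one new part on disk. A minimal sketch for inspecting them, assuming a hypothetical events table in the current database:

    -- Every synchronous INSERT creates at least one new active part.
    SELECT name, rows, bytes_on_disk
    FROM system.parts
    WHERE database = currentDatabase()
      AND table = 'events'
      AND active
    ORDER BY modification_time DESC;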

Batch inserts if synchronous

The above mechanics illustrate a constant overhead regardless of the insert size, making batch size the single most important optimization for ingest throughput. Batching inserts reduces the overhead as a proportion of total insert time and improves processing efficiency.

We recommend inserting data in batches of at least 1,000 rows, and ideally between 10,000–100,000 rows. Fewer, larger inserts reduce the number of parts written, minimize merge load, and lower overall system resource usage.
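
For illustration, a minimal sketch of a single batched insert, assuming a hypothetical events table; one multi-row INSERT pays the per-insert overhead (part creation, sorting, index build) once for the whole batch:

    -- One insert query carrying many rows (ideally 10,000-100,000 per batch).
    INSERT INTO events (id, ts, message) VALUES
        (1, '2024-01-01 00:00:00', 'login'),
        (2, '2024-01-01 00:00:01', 'click'),
        (3, '2024-01-01 00:00:02', 'logout');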

For a synchronous insert strategy to be effective, this client-side batching is required.

If you're unable to batch data client-side, ClickHouse supports asynchronous inserts that shift batching to the server (see Asynchronous inserts below).

tip

Regardless of the size of your inserts, we recommend keeping the insert rate at around one insert query per second. The reason for this recommendation is that the created parts are merged into larger parts in the background (in order to optimize your data for read queries), and sending too many insert queries per second can lead to situations where the background merging can't keep up with the number of new parts. However, you can use a higher rate of insert queries per second when you use asynchronous inserts (see Asynchronous inserts).

Ensure idempotent retries

Synchronous inserts are also idempotent. When using MergeTree engines, ClickHouse will deduplicate inserts by default. This protects against ambiguous failure cases, such as:

  • The insert succeeded but the client never received an acknowledgment due to a network interruption.
  • The insert failed server-side and timed out.

In both cases, it's safe to retry the insert - as long as the batch contents and order remain identical. For this reason, it's critical that clients retry consistently, without modifying or reordering data.
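
A minimal sketch of a safe retry, assuming a hypothetical events table on ClickHouse Cloud or a replicated deployment where insert deduplication is active by default:

    -- Original insert: the acknowledgment may be lost, or the request may time out.
    INSERT INTO events (id, ts, message) VALUES
        (1, '2024-01-01 00:00:00', 'login'),
        (2, '2024-01-01 00:00:01', 'click');

    -- Safe retry: identical contents in identical order hash to the same block,
    -- so ClickHouse recognizes the duplicate and skips it.
    INSERT INTO events (id, ts, message) VALUES
        (1, '2024-01-01 00:00:00', 'login'),
        (2, '2024-01-01 00:00:01', 'click');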

Choose the right insert target

For sharded clusters, you have two options:

  • Insert directly into a MergeTree or ReplicatedMergeTree table. This is the most efficient option when the client can perform load balancing across shards. With internal_replication = true, ClickHouse handles replication transparently.
  • Insert into a Distributed table. This allows clients to send data to any node and let ClickHouse forward it to the correct shard. This is simpler but slightly less performant due to the extra forwarding step. internal_replication = true is still recommended.

In ClickHouse Cloud all nodes read and write to the same single shard. Inserts are automatically balanced across nodes. Users can simply send inserts to the exposed endpoint.
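
For self-managed sharded clusters, a minimal sketch of both options, using hypothetical names (my_cluster, events_local, events_distributed):

    -- Local table, created on every shard.
    CREATE TABLE events_local ON CLUSTER my_cluster
    (
        id UInt64,
        ts DateTime,
        message String
    )
    ENGINE = MergeTree
    ORDER BY (id, ts);

    -- Distributed table that forwards inserts to the owning shard.
    CREATE TABLE events_distributed ON CLUSTER my_cluster AS events_local
    ENGINE = Distributed(my_cluster, default, events_local, rand());

    -- Option 1: insert directly into the local table on a chosen shard
    -- (the client is responsible for balancing across shards).
    INSERT INTO events_local VALUES (1, now(), 'direct insert');

    -- Option 2: insert into the Distributed table and let ClickHouse forward the rows.
    INSERT INTO events_distributed VALUES (2, now(), 'forwarded insert');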

Choose the right format

Choosing the right input format is crucial for efficient data ingestion in ClickHouse. With over 70 supported formats, selecting the most performant option can significantly impact insert speed, CPU and memory usage, and overall system efficiency.

While flexibility is useful for data engineering and file-based imports, applications should prioritize performance-oriented formats:

  • Native format (recommended): Most efficient. Column-oriented, minimal parsing required server-side. Used by default in Go and Python clients.
  • RowBinary: Efficient row-based format, ideal if columnar transformation is hard client-side. Used by the Java client.
  • JSONEachRow: Easy to use but expensive to parse. Suitable for low-volume use cases or quick integrations.
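
For example, a minimal sketch assuming a hypothetical events table; client libraries normally pick Native or RowBinary automatically, while JSONEachRow can be written by hand for quick integrations:

    -- Explicitly choosing the input format for the insert payload.
    INSERT INTO events FORMAT JSONEachRow
    {"id": 1, "ts": "2024-01-01 00:00:00", "message": "login"}
    {"id": 2, "ts": "2024-01-01 00:00:01", "message": "click"}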

Use compression

Compression plays a critical role in reducing network overhead, speeding up inserts, and lowering storage costs in ClickHouse. Used effectively, it enhances ingestion performance without requiring changes to data format or schema.

Compressing insert data reduces the size of the payload sent over the network, minimizing bandwidth usage and accelerating transmission.

For inserts, compression is especially effective when used with the Native format, which already matches ClickHouse's internal columnar storage model. In this setup, the server can efficiently decompress and directly store the data with minimal transformation.

Use LZ4 for speed, ZSTD for compression ratio

ClickHouse supports several compression codecs during data transmission. Two common options are:

  • LZ4: Fast and lightweight. It reduces data size significantly with minimal CPU overhead, making it ideal for high-throughput inserts and default in most ClickHouse clients.
  • ZSTD: Higher compression ratio but more CPU-intensive. It's useful when network transfer costs are high—such as in cross-region or cloud provider scenarios—though it increases client-side compute and server-side decompression time slightly.

Best practice: Use LZ4 unless you have constrained bandwidth or incur data egress costs - then consider ZSTD.

note

In tests from the FastFormats benchmark, LZ4-compressed Native inserts reduced data size by more than 50%, cutting ingestion time from 150s to 131s for a 5.6 GiB dataset. Switching to ZSTD compressed the same dataset down to 1.69 GiB, but increased server-side processing time slightly.

Compression reduces resource usage

Compression not only reduces network traffic—it also improves CPU and memory efficiency on the server. With compressed data, ClickHouse receives fewer bytes and spends less time parsing large inputs. This benefit is especially important when ingesting from multiple concurrent clients, such as in observability scenarios.

The impact of compression on CPU and memory is modest for LZ4, and moderate for ZSTD. Even under load, server-side efficiency improves due to the reduced data volume.

Combining compression with batching and an efficient input format (like Native) yields the best ingestion performance.

When using the native interface (e.g. clickhouse-client), LZ4 compression is enabled by default. You can optionally switch to ZSTD via settings.

With the HTTP interface, use the Content-Encoding header to apply compression (e.g. Content-Encoding: lz4). The entire payload must be compressed before sending.

Pre-sort if low cost

Pre-sorting data by primary key before insertion can improve ingestion efficiency in ClickHouse, particularly for large batches.

When data arrives pre-sorted, ClickHouse can skip or simplify the internal sorting step during part creation, reducing CPU usage and accelerating the insert process. Pre-sorting also improves compression efficiency, since similar values are grouped together - enabling codecs like LZ4 or ZSTD to achieve a better compression ratio. This is especially beneficial when combined with large batch inserts and compression, as it reduces both the processing overhead and the amount of data transferred.

That said, pre-sorting is an optional optimization—not a requirement. ClickHouse sorts data highly efficiently using parallel processing, and in many cases, server-side sorting is faster or more convenient than pre-sorting client-side.

We recommend pre-sorting only if the data is already nearly ordered or if client-side resources (CPU, memory) are sufficient and underutilized. In latency-sensitive or high-throughput use cases, such as observability, where data arrives out of order or from many agents, it's often better to skip pre-sorting and rely on ClickHouse's built-in performance.

Asynchronous inserts

Asynchronous inserts in ClickHouse provide a powerful alternative when client-side batching isn't feasible. This is especially valuable in observability workloads, where hundreds or thousands of agents send data continuously - logs, metrics, traces - often in small, real-time payloads. Buffering data client-side in these environments increases complexity, requiring a centralized queue to ensure sufficiently large batches can be sent.

note

Sending many small batches in synchronous mode is not recommended, as it leads to many parts being created. This results in poor query performance and "too many parts" errors.

Asynchronous inserts shift batching responsibility from the client to the server by writing incoming data to an in-memory buffer, then flushing it to storage based on configurable thresholds. This approach significantly reduces part creation overhead, lowers CPU usage, and ensures ingestion remains efficient - even under high concurrency.

The core behavior is controlled via the async_insert setting.

When enabled (1), inserts are buffered and only written to disk once one of the flush conditions is met:

  • the buffer reaches a specified size (async_insert_max_data_size)
  • a time threshold elapses (async_insert_busy_timeout_ms)
  • a maximum number of insert queries accumulate (async_insert_max_query_number)

This batching process is invisible to clients and helps ClickHouse efficiently merge insert traffic from multiple sources. However, until a flush occurs, the data cannot be queried. Importantly, there are multiple buffers per insert shape and settings combination, and in clusters, buffers are maintained per node - enabling fine-grained control across multi-tenant environments. Insert mechanics are otherwise identical to those described for synchronous inserts.
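
A minimal sketch of per-query thresholds, assuming a hypothetical logs table; the setting names are those above, the values are illustrative only:

    -- The buffer flushes once roughly 1 MiB accumulates or after 1 second,
    -- whichever comes first.
    INSERT INTO logs SETTINGS
        async_insert = 1,
        async_insert_max_data_size = 1048576,
        async_insert_busy_timeout_ms = 1000
    VALUES ('2024-01-01 00:00:00', 'info', 'buffered log line');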

Choosing a Return Mode

The behavior of asynchronous inserts is further refined using the wait_for_async_insert setting.

When set to 1 (the default), ClickHouse only acknowledges the insert after the data is successfully flushed to disk. This ensures strong durability guarantees and makes error handling straightforward: if something goes wrong during the flush, the error is returned to the client. This mode is recommended for most production scenarios, especially when insert failures must be tracked reliably.

Benchmarks show it scales well with concurrency - whether you're running 200 or 500 clients - thanks to adaptive inserts and stable part creation behavior.

Setting wait_for_async_insert = 0 enables "fire-and-forget" mode. Here, the server acknowledges the insert as soon as the data is buffered, without waiting for it to reach storage.

This offers ultra-low-latency inserts and maximal throughput, ideal for high-velocity, low-criticality data. However, this comes with trade-offs: there's no guarantee the data will be persisted, errors may only surface during flush, and it's difficult to trace failed inserts. Use this mode only if your workload can tolerate data loss.

Benchmarks also demonstrate substantial part reduction and lower CPU usage when buffer flushes are infrequent (e.g. every 30 seconds), but the risk of silent failure remains.

Our strong recommendation is to use async_insert = 1, wait_for_async_insert = 1 if using asynchronous inserts. Using wait_for_async_insert = 0 is very risky because your INSERT client may not be aware of errors, and it can also cause overload if your client continues to write quickly while the ClickHouse server needs to slow down writes and apply backpressure to ensure the reliability of the service.
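
A short sketch of the two modes against a hypothetical logs table:

    -- Recommended: acknowledged only after the buffered data is flushed to storage.
    INSERT INTO logs SETTINGS async_insert = 1, wait_for_async_insert = 1
    VALUES ('2024-01-01 00:00:00', 'info', 'durable insert');

    -- Fire-and-forget: acknowledged as soon as the data reaches the buffer;
    -- flush errors are not reported back to this query.
    INSERT INTO logs SETTINGS async_insert = 1, wait_for_async_insert = 0
    VALUES ('2024-01-01 00:00:01', 'debug', 'may be lost on failure');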

Deduplication and reliability

By default, ClickHouse performs automatic deduplication for synchronous inserts, which makes retries safe in failure scenarios. However, this is disabled for asynchronous inserts unless explicitly enabled (this should not be enabled if you have dependent materialized views - see issue).

In practice, if deduplication is turned on and the same insert is retried - due to, for instance, a timeout or network drop - ClickHouse can safely ignore the duplicate. This helps maintain idempotency and avoids double-writing data. Still, it's worth noting that insert validation and schema parsing happen only during buffer flush - so errors (like type mismatches) will only surface at that point.
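
If deduplication is needed for asynchronous inserts, it is opt-in; a hedged sketch using the async_insert_deduplicate setting against a hypothetical logs table (avoid this if the target table has dependent materialized views, as noted above):

    INSERT INTO logs SETTINGS
        async_insert = 1,
        wait_for_async_insert = 1,
        async_insert_deduplicate = 1  -- opt-in deduplication for asynchronous inserts
    VALUES ('2024-01-01 00:00:00', 'info', 'retriable log line');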

Enabling asynchronous inserts

Asynchronous inserts can be enabled for a particular user, or for a specific query:

  • Enabling asynchronous inserts at the user level (see the first statement in the sketch below). This example uses the user default; if you create a different user, substitute that username.

  • You can specify the asynchronous insert settings by using the SETTINGS clause of insert queries (see the second statement in the sketch below).

  • You can also specify asynchronous insert settings as connection parameters when using a ClickHouse programming language client.

    As an example, these settings can be passed within a JDBC connection string when you use the ClickHouse Java JDBC driver for connecting to ClickHouse Cloud.
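
A minimal sketch of the first two options, assuming the user default and a hypothetical logs table; adapt the names to your schema:

    -- Option 1: enable asynchronous inserts for all queries run by the user default.
    ALTER USER default SETTINGS async_insert = 1;

    -- Option 2: enable them for a single insert via the SETTINGS clause
    -- (query-level settings override the user profile).
    INSERT INTO logs SETTINGS async_insert = 1, wait_for_async_insert = 1
    VALUES ('2024-01-01 00:00:00', 'info', 'example log line');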

Choose an interface - HTTP or Native

Native

ClickHouse offers two main interfaces for data ingestion: the native interface and the HTTP interface - each with trade-offs between performance and flexibility. The native interface, used by clickhouse-client and select language clients like Go and C++, is purpose-built for performance. It always transmits data in ClickHouse's highly efficient Native format, supports block-wise compression with LZ4 or ZSTD, and minimizes server-side processing by offloading work such as parsing and format conversion to the client.

It even enables client-side computation of MATERIALIZED and DEFAULT column values, allowing the server to skip these steps entirely. This makes the native interface ideal for high-throughput ingestion scenarios where efficiency is critical.

HTTP

Unlike many traditional databases, ClickHouse also supports an HTTP interface. This, by contrast, prioritizes compatibility and flexibility. It allows data to be sent in any supported format - including JSON, CSV, Parquet, and others - and is widely supported across most ClickHouse clients, including Python, Java, JavaScript, and Rust.

This is often preferable to ClickHouse's native protocol as it allows traffic to be easily switched with load balancers. We expect small differences in insert performance with the native protocol, which incurs a little less overhead.

However, it lacks the native protocol's deeper integration and cannot perform client-side optimizations like materialized value computation or automatic conversion to Native format. While HTTP inserts can still be compressed using standard HTTP headers (e.g. Content-Encoding: lz4), the compression is applied to the entire payload rather than individual data blocks. This interface is often preferred in environments where protocol simplicity, load balancing, or broad format compatibility is more important than raw performance.

For a more detailed description of these interfaces see here.