Achieving atomic inserts and multi-table consistency in ClickHouse Cloud
Problem
ClickHouse Cloud does not support multi-statement transactions in the traditional RDBMS sense. This creates two common challenges:
- Single-table atomicity for bulk loads: A common approach is to insert into temporary partial keys, then copy records to the actual key and delete the temporary records. This approach performs poorly — particularly the delete step, which can consume over 90% of the total operation time.
- Multi-table consistency: When a pipeline loads Table A successfully but fails on Table B, Table A is already committed and cannot be rolled back. Analysts querying across both tables see data that is out of sync.
Background
ClickHouse guarantees atomicity at the single-insert, single-partition level: if an INSERT succeeds, all rows in that block are visible; if it fails, none are. However, there is no built-in mechanism to atomically commit data across multiple inserts or multiple tables.
The partition manipulation commands (MOVE PARTITION TO TABLE, REPLACE PARTITION, ATTACH PARTITION FROM) operate at the metadata level when the source and destination tables share the same storage policy.
This means they execute almost instantaneously regardless of data size, making them ideal building blocks for atomic-swap patterns.
Recommended solution
Instead of inserting directly into production tables and attempting to clean up on failure, use dedicated staging tables as a landing zone. After validating the data, use partition-level operations to atomically promote the data into production.
Step-by-step for single-table atomicity
Create a staging table
Use the same schema, partition key, ORDER BY, and storage policy as the production table.
Insert data into the staging table
Run your insert against the staging table instead of the production table.
If the insert fails, truncate and retry
Truncate the staging table and run the load again. No production data has been affected.
If the insert succeeds, move the partitions to production
To move the data into production and remove it from the staging table, use MOVE PARTITION.
Alternatively, copy the data into an existing partition in production with ATTACH PARTITION.
Both operations are metadata-level changes on the same storage policy and complete almost instantaneously.
Clean up the staging table
Once all partitions have been moved, truncate the staging table to leave it empty for the next load.
Multi-table consistency
The same pattern solves pipelines where two or more tables must all be loaded before any of them becomes visible to analysts. Load each table's data into its own staging table and validate all of them, then run the partition moves together. Because each move is a near-instant metadata operation, the window in which the tables disagree shrinks from the duration of the full load to the time it takes to swap partitions.
Requirements and constraints
For MOVE PARTITION TO TABLE, REPLACE PARTITION, and ATTACH PARTITION FROM, the source and destination tables must have:
- The same column structure
- The same partition key,
ORDER BYkey, and primary key - The same storage policy
- The destination table must include all indices and projections from the source table
Example
You can try it at fiddle:
References
- Manipulating Partitions and Parts
- To read more about this strategy, see the blog post Supercharging your large ClickHouse data loads - Part 3: Making a large data load resilient.