ClickHouse schema design and optimization. TRIGGERS - ClickHouse schema, compression codecs, MergeTree, ORDER BY tuning, partition key.
Prescriptive schema design, compression selection, and performance optimization for ClickHouse (v24.4+). Covers both ClickHouse Cloud (SharedMergeTree) and self-hosted (ReplicatedMergeTree) deployments.
Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.
Use this skill when:
Follow this sequence when designing or reviewing ClickHouse schemas:
references/schema-documentation.mdThe ORDER BY clause is the most critical decision in ClickHouse schema design.
Rules:
tenant_id before timestamp)Example:
-- Correct: Low cardinality first, 4 columns
CREATE TABLE trades (
exchange LowCardinality(String),
symbol LowCardinality(String),
timestamp DateTime64(3),
trade_id UInt64,
price Float64,
quantity Float64
) ENGINE = MergeTree()
ORDER BY (exchange, symbol, timestamp, trade_id);
-- Wrong: High cardinality first (10x slower queries)
ORDER BY (trade_id, timestamp, symbol, exchange);
| Column Type | Default Codec | Read-Heavy Alternative | Example |
|---|---|---|---|
| DateTime/DateTime64 | CODEC(DoubleDelta, ZSTD) |
CODEC(DoubleDelta, LZ4) |
timestamp DateTime64(3) CODEC(DoubleDelta, ZSTD) |
| Float prices/gauges | CODEC(Gorilla, ZSTD) |
CODEC(Gorilla, LZ4) |
price Float64 CODEC(Gorilla, ZSTD) |
| Integer counters | CODEC(T64, ZSTD) |
— | count UInt64 CODEC(T64, ZSTD) |
| Slowly changing integers | CODEC(Delta, ZSTD) |
CODEC(Delta, LZ4) |
version UInt32 CODEC(Delta, ZSTD) |
| String (low cardinality) | LowCardinality(String) |
— | status LowCardinality(String) |
| General data | CODEC(ZSTD(3)) |
CODEC(LZ4) |
Default compression level 3 |
When to use LZ4 over ZSTD: LZ4 provides 1.76x faster decompression. Use LZ4 for read-heavy workloads with monotonic sequences (timestamps, counters). Use ZSTD (default) when compression ratio matters or data patterns are unknown.
Note on codec combinations:
Delta/DoubleDelta + Gorilla combinations are blocked by default (allow_suspicious_codecs) because Gorilla already performs implicit delta compression internally—combining them is redundant, not dangerous. A historical corruption bug (PR #45615, Jan 2023) was fixed, but the blocking remains as a best practice guardrail.
Use each codec family independently for its intended data type:
-- Correct usage
price Float64 CODEC(Gorilla, ZSTD) -- Floats: use Gorilla
timestamp DateTime64 CODEC(DoubleDelta, ZSTD) -- Timestamps: use DoubleDelta
timestamp DateTime64 CODEC(DoubleDelta, LZ4) -- Read-heavy: use LZ4
PARTITION BY is for data lifecycle management, NOT query optimization.
Rules:
Example:
-- Correct: Monthly partitions for TTL management
PARTITION BY toYYYYMM(timestamp)
-- Wrong: Daily partitions (too many parts)
PARTITION BY toYYYYMMDD(timestamp)
-- Wrong: High-cardinality partition key
PARTITION BY user_id
| Pattern | Severity | Modern Status | Fix |
|---|---|---|---|
| Too many parts (>300/partition) | Critical | Still critical | Reduce partition granularity |
| Small batch inserts (<1000) | Critical | Still critical | Batch to 10k-100k rows |
| High-cardinality first ORDER BY | Critical | Still critical | Reorder: lowest cardinality first |
| No memory limits | High | Still critical | Set max_memory_usage |
| Denormalization overuse | High | Still critical | Use dictionaries + materialized views |
| Large JOINs | Medium | 180x improved | Still avoid for ultra-low-latency |
| Mutations (UPDATE/DELETE) | Medium | 1700x improved | Use lightweight UPDATEs (v24.4+); see DELETE Strategy Guide below |
Choose the right DELETE strategy based on scope. Ranked fastest to slowest:
| Strategy | Syntax | Speed | Use When |
|---|---|---|---|
DROP PARTITION |
ALTER TABLE t DROP PARTITION (key1, key2, keyN) |
Instant (metadata-only) | Purge entire partition ranges (months, corrupt data, test data) |
DELETE IN PARTITION |
ALTER TABLE t DELETE IN PARTITION (...) WHERE condition |
Fast (scans 1 partition) | Targeted row removal within a known partition |
ALTER TABLE DELETE |
ALTER TABLE t DELETE WHERE condition |
Slow (scans all parts) | Fallback when partition is unknown |
DELETE FROM (lightweight) |
DELETE FROM t WHERE condition |
Variable | ANTI-PATTERN for write pipelines — see warning below |
Anti-pattern: Lightweight DELETE FROM before INSERT
DELETE FROM sets _row_exists=0 masks instead of physically removing rows. These ghost rows:
FINAL as phantom dataUse DELETE FROM only for: ad-hoc data correction where ghost rows don't matter (analytics cleanup, dev/test). Never use in write pipelines where INSERT follows DELETE.
All DELETE mutations should use: SETTINGS mutations_sync = 1 to block until completion (prevents INSERT-DELETE race conditions).
Partition-aware DELETE tip: If your partition key includes the columns you're filtering on (e.g., PARTITION BY (symbol, threshold, toYYYYMM(timestamp))), use DELETE IN PARTITION to scope the scan to a single partition instead of scanning all parts.
| Deployment | Engine | Use Case |
|---|---|---|
| ClickHouse Cloud | SharedMergeTree |
Default for cloud deployments |
| Self-hosted cluster | ReplicatedMergeTree |
Multi-node with replication |
| Self-hosted single | MergeTree |
Single-node development/testing |
Cloud (SharedMergeTree):
CREATE TABLE trades (...)
ENGINE = SharedMergeTree('/clickhouse/tables/{shard}/trades', '{replica}')
ORDER BY (exchange, symbol, timestamp);
Self-hosted (ReplicatedMergeTree):
CREATE TABLE trades (...)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/trades', '{replica}')
ORDER BY (exchange, symbol, timestamp);
This skill is the hub for ClickHouse-related tasks. When the user's needs extend beyond schema design, invoke the related skills below.
| User Need | Invoke Skill | Trigger Phrases |
|---|---|---|
| Create database users, manage permissions | devops-tools:clickhouse-cloud-management |
"create user", "GRANT", "permissions", "credentials" |
| Configure DBeaver, generate connection JSON | devops-tools:clickhouse-pydantic-config |
"DBeaver", "client config", "connection setup" |
| Validate schema contracts against live database | quality-tools:schema-e2e-validation |
"validate schema", "Earthly E2E", "schema contract" |
clickhouse-cloud-management (if cloud credentials needed)clickhouse-pydantic-config (generate DBeaver JSON)schema-e2e-validation (CI/CD schema contracts)User: "I need to design a trades table for ClickHouse Cloud and set up DBeaver to query it."
Expected behavior:
clickhouse-cloud-management for creating database userclickhouse-pydantic-config for DBeaver configurationCreate alternative sort orders that ClickHouse automatically selects:
ALTER TABLE trades ADD PROJECTION trades_by_symbol (
SELECT * ORDER BY symbol, timestamp
);
ALTER TABLE trades MATERIALIZE PROJECTION trades_by_symbol;
Pre-compute aggregations for dashboard queries:
CREATE MATERIALIZED VIEW trades_hourly_mv
ENGINE = SummingMergeTree()
ORDER BY (exchange, symbol, hour)
AS SELECT
exchange,
symbol,
toStartOfHour(timestamp) AS hour,
sum(quantity) AS total_volume,
count() AS trade_count
FROM trades
GROUP BY exchange, symbol, hour;
Replace JOINs with O(1) dictionary lookups for large-scale star schemas:
When to use dictionaries (v24.4+):
When JOINs are sufficient (v24.4+):
Benchmark context: 6.6x speedup measured on Star Schema Benchmark (1.4B rows).
CREATE DICTIONARY symbol_info (
symbol String,
name String,
sector String
)
PRIMARY KEY symbol
SOURCE(CLICKHOUSE(TABLE 'symbols'))
LAYOUT(FLAT()) -- Best for <500k entries with monotonic keys
LIFETIME(3600);
-- Use in queries (O(1) lookup)
SELECT
symbol,
dictGet('symbol_info', 'name', symbol) AS symbol_name
FROM trades;
Execute comprehensive schema audit:
clickhouse-client --multiquery < scripts/schema-audit.sql
The audit script checks:
| Reference | Content |
|---|---|
references/schema-design-workflow.md |
Complete workflow with examples |
references/compression-codec-selection.md |
Decision tree + benchmarks |
references/anti-patterns-and-fixes.md |
13 deadly sins + v24.4+ status |
references/audit-and-diagnostics.md |
Query interpretation guide |
references/idiomatic-architecture.md |
Parameterized views, dictionaries, dedup |
references/schema-documentation.md |
COMMENT patterns + naming for AI understanding |
references/cache-schema-evolution.md |
Cache invalidation + schema evolution patterns |
Use clickhouse-connect (official) for all Python integrations.
# ✅ RECOMMENDED: clickhouse-connect (official, HTTP)
import clickhouse_connect
client = clickhouse_connect.get_client(
host='localhost',
port=8123, # HTTP port
username='default',
password=''
)
result = client.query("SELECT * FROM trades LIMIT 1000")
df = client.query_df("SELECT * FROM trades") # Pandas integration
clickhouse-driver| Factor | clickhouse-connect | clickhouse-driver |
|---|---|---|
| Maintainer | ClickHouse Inc. | Solo developer |
| Weekly commits | Yes (active) | Sparse (months) |
| Open issues | 41 (addressed) | 76 (accumulating) |
| Downloads/week | 2.7M | 1.5M |
| Bus factor risk | Low (company) | High (1 person) |
Do NOT use clickhouse-driver despite its ~26% speed advantage for large exports. The maintenance risk outweighs performance gains:
Exception: Only consider clickhouse-driver if you have extreme performance requirements (exporting millions of rows) AND accept the maintenance risk.
Every ClickHouse table and column MUST have a COMMENT that fully documents its meaning, computation method, and constraints. The COMMENT is the SSoT — no external document, skill, or wiki supersedes it.
DESCRIBE table, SHOW CREATE TABLE, system.columnsALTER TABLE t COMMENT COLUMN session_label
'STRICT session label. 8 values: sydney_only, tokyo_only, ...
Only set when ENTIRE bar (open→close) falls within one session.
cross_session = bar spans boundary. Use WHERE is_pure_session=1.
GitHub: https://github.com/org/repo/issues/54
Source: scripts/populate-sessions/populate_v3.py';
NEVER create a ClickHouse column without a COMMENT. A column without documentation is a column that will be misused.
| Skill | Purpose |
|---|---|
devops-tools:clickhouse-cloud-management |
User/permission management |
devops-tools:clickhouse-pydantic-config |
DBeaver connection generation |
quality-tools:schema-e2e-validation |
YAML schema contracts |
quality-tools:multi-agent-e2e-validation |
Database migration validation |
| Issue | Cause | Solution |
|---|---|---|
| Too many parts | Over-partitioned | Reduce partition granularity (monthly not daily) |
| Slow queries | Wrong ORDER BY order | Put lowest cardinality columns first |
| High memory usage | No memory limits set | Configure max_memory_usage setting |
| Codec error on Delta+Gorilla | Suspicious codec combination | Use each codec family independently |
| Projection not used | Optimizer chose different plan | Check EXPLAIN to verify projection selection |
| Dictionary stale | Lifetime expired | Increase LIFETIME or trigger refresh |
| Replication lag | Part merges falling behind | Check merge_tree settings, add resources |
| INSERT too slow | Small batch sizes | Batch to 10k-100k rows per INSERT |
After this skill completes, check before closing:
Only update if the issue is real and reproducible — not speculative.
