Documentation Index
Fetch the complete documentation index at: https://pg-stat-ch.clickhouse.com/llms.txt
Use this file to discover all available pages before exploring further.
The pg_stat_ch ClickHouse schema includes four materialized views that aggregate events_raw data automatically. All views are populated by ClickHouse as events arrive — no manual refresh is needed.
Overview
| View | Purpose | Granularity | Retention |
|---|
events_recent_1h | Real-time debugging | Per event | 1 hour TTL |
query_stats_5m | Query performance dashboards | 5-minute buckets | Unbounded |
db_app_user_1m | Load by application/user | 1-minute buckets | Unbounded |
errors_recent | Error investigation | Per error event | 7-day TTL |
The typical workflow is: use materialized views to find patterns, then drill into events_raw for specific events.
events_recent_1h
A copy of events_raw with a 1-hour TTL. ClickHouse automatically deletes events older than 1 hour from this table.
Use cases:
- “What queries ran in the last 5 minutes?”
- Real-time dashboards with sub-second refresh
- Quick debugging of ongoing issues
SELECT
ts_start,
db,
cmd_type,
duration_us / 1000 AS ms,
rows,
substring(query, 1, 100) AS query_preview
FROM pg_stat_ch.events_recent_1h
WHERE ts_start > now() - INTERVAL 5 MINUTE
ORDER BY ts_start DESC
LIMIT 50;
Because this table has a 1-hour TTL, queries against it are fast even if events_raw has weeks of data.
query_stats_5m
Pre-aggregated query statistics in 5-minute buckets. This is the primary view for dashboards and performance monitoring.
Use cases:
- QPS trends over time
- Latency percentiles (p95/p99) per query
- Identifying slow queries and regressions
- Capacity planning
Schema
| Column | Type | Description |
|---|
bucket | DateTime | 5-minute bucket start time |
db | LowCardinality(String) | Database name |
query_id | Int64 | Normalized query identifier |
cmd_type | LowCardinality(String) | Command type |
calls_state | AggregateFunction(count) | Call count |
duration_sum_state | AggregateFunction(sum, UInt64) | Total duration (microseconds) |
duration_min_state | AggregateFunction(min, UInt64) | Minimum duration |
duration_max_state | AggregateFunction(max, UInt64) | Maximum duration |
duration_q_state | AggregateFunction(quantilesTDigest(0.95, 0.99), UInt64) | Latency percentile digest |
rows_sum_state | AggregateFunction(sum, UInt64) | Total rows |
shared_hit_sum_state | AggregateFunction(sum, Int64) | Total buffer hits |
shared_read_sum_state | AggregateFunction(sum, Int64) | Total buffer reads |
Querying aggregate states
This view uses ClickHouse’s -State / -Merge pattern. Columns store intermediate aggregate states that must be finalized with the corresponding -Merge function:
| To get | Use |
|---|
| Call count | countMerge(calls_state) |
| Total duration | sumMerge(duration_sum_state) |
| Min duration | minMerge(duration_min_state) |
| Max duration | maxMerge(duration_max_state) |
| p95 and p99 | quantilesTDigestMerge(0.95, 0.99)(duration_q_state) |
| Total rows | sumMerge(rows_sum_state) |
| Buffer hits | sumMerge(shared_hit_sum_state) |
| Buffer reads | sumMerge(shared_read_sum_state) |
This pattern allows correct re-aggregation across multiple 5-minute buckets. For example, a 1-hour p99 is computed correctly from 12 five-minute digests, rather than being an average of 12 p99 values.
Top queries by p99 latency
SELECT
query_id,
cmd_type,
countMerge(calls_state) AS calls,
round(sumMerge(duration_sum_state) / countMerge(calls_state) / 1000, 2) AS avg_ms,
round(quantilesTDigestMerge(0.95, 0.99)(duration_q_state)[1] / 1000, 2) AS p95_ms,
round(quantilesTDigestMerge(0.95, 0.99)(duration_q_state)[2] / 1000, 2) AS p99_ms
FROM pg_stat_ch.query_stats_5m
WHERE bucket >= now() - INTERVAL 1 HOUR
GROUP BY query_id, cmd_type
ORDER BY p99_ms DESC
LIMIT 10;
QPS over time
Each bucket is 5 minutes (300 seconds), so divide by 300 for per-second rate:
SELECT
bucket,
countMerge(calls_state) / 300 AS qps
FROM pg_stat_ch.query_stats_5m
WHERE bucket >= now() - INTERVAL 24 HOUR
GROUP BY bucket
ORDER BY bucket;
Cache hit ratio trend
SELECT
bucket,
round(100 * sumMerge(shared_hit_sum_state) /
(sumMerge(shared_hit_sum_state) + sumMerge(shared_read_sum_state) + 1), 2) AS hit_ratio_pct
FROM pg_stat_ch.query_stats_5m
WHERE bucket >= now() - INTERVAL 24 HOUR
GROUP BY bucket
ORDER BY bucket;
db_app_user_1m
Load breakdown by database, application, user, and command type in 1-minute buckets.
Use cases:
- Which application is generating the most load?
- Per-tenant or per-user resource tracking
- Error rates by application
- Identifying misbehaving clients
Schema
| Column | Type | Description |
|---|
bucket | DateTime | 1-minute bucket start time |
db | LowCardinality(String) | Database name |
app | LowCardinality(String) | Application name |
username | LowCardinality(String) | PostgreSQL username |
cmd_type | LowCardinality(String) | Command type |
calls_state | AggregateFunction(count) | Query count |
duration_sum_state | AggregateFunction(sum, UInt64) | Total duration (microseconds) |
duration_q_state | AggregateFunction(quantilesTDigest(0.95, 0.99), UInt64) | Latency percentile digest |
errors_sum_state | AggregateFunction(sum, UInt64) | Error count |
Load by application
SELECT
app,
countMerge(calls_state) AS total_queries,
round(sumMerge(duration_sum_state) / 1000000, 2) AS total_seconds,
round(quantilesTDigestMerge(0.95, 0.99)(duration_q_state)[2] / 1000, 2) AS p99_ms,
sumMerge(errors_sum_state) AS errors
FROM pg_stat_ch.db_app_user_1m
WHERE bucket >= now() - INTERVAL 24 HOUR
GROUP BY app
ORDER BY total_seconds DESC;
Error rate by database and user
SELECT
db,
username,
countMerge(calls_state) AS queries,
sumMerge(errors_sum_state) AS errors,
round(100 * sumMerge(errors_sum_state) / countMerge(calls_state), 2) AS error_pct
FROM pg_stat_ch.db_app_user_1m
WHERE bucket >= now() - INTERVAL 1 HOUR
GROUP BY db, username
HAVING errors > 0
ORDER BY error_pct DESC;
errors_recent
Recent error events with a 7-day TTL. This view filters events_raw to only rows where err_elevel > 0.
Use cases:
- “What errors happened in the last hour?”
- Incident investigation with query context
- Error rate alerting
- Identifying recurring error patterns by SQLSTATE
Schema
This view stores the full event row (not aggregated):
ts_start, db, username, app, client_addr, pid, query_id, err_sqlstate, err_elevel, err_message, query
Recent errors with context
SELECT
ts_start,
db,
username,
app,
err_sqlstate,
err_message,
substring(query, 1, 200) AS query_preview
FROM pg_stat_ch.errors_recent
WHERE ts_start > now() - INTERVAL 1 HOUR
ORDER BY ts_start DESC
LIMIT 100;
Error breakdown by SQLSTATE
SELECT
err_sqlstate,
count() AS occurrences,
uniq(query_id) AS unique_queries,
any(err_message) AS sample_message
FROM pg_stat_ch.errors_recent
WHERE ts_start > now() - INTERVAL 24 HOUR
GROUP BY err_sqlstate
ORDER BY occurrences DESC;
Custom views
You can create your own materialized views on top of events_raw for project-specific analytics. For example, a view that tracks queries by table name or a view that computes per-minute error rates for alerting.
See the ClickHouse documentation on materialized views for details on creating and managing views.