PostgreSQL Partitioning Lessons from 2 Billion Rows
What we learned implementing declarative partitioning on a high-write PostgreSQL cluster processing advertising events at scale.
When your main events table crosses a billion rows, queries that were once instant start to crawl. Vacuum takes longer. Index bloat creeps up. You know it is time to partition, but the devil is in the details.
This post covers the lessons we learned partitioning a PostgreSQL cluster that ingests roughly 50,000 events per second for an advertising platform.
Choosing the Partition Key
The most important decision is your partition key. For time-series event data, the obvious choice is a timestamp column. We partition by day using PostgreSQL’s declarative partitioning:
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY,
occurred_at timestamptz NOT NULL,
event_type smallint NOT NULL,
campaign_id bigint NOT NULL,
payload jsonb
) PARTITION BY RANGE (occurred_at);
Daily partitions give us a good balance: each partition is large enough to be worth the overhead, but small enough that dropping old data is instant — just DROP TABLE on the expired partition.
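Concretely, a daily partition is just a child table with an exclusive upper bound (the partition name here follows our own date-suffix convention):

```sql
-- One partition per day; FROM is inclusive, TO is exclusive.
CREATE TABLE events_20260301 PARTITION OF events
    FOR VALUES FROM ('2026-03-01') TO ('2026-03-02');
```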
Automating Partition Creation
Do not create partitions manually. We use a cron job that runs daily and pre-creates partitions 7 days ahead:
SELECT create_partition(
'events',
current_date + i * interval '1 day',
current_date + (i + 1) * interval '1 day'
) FROM generate_series(0, 6) AS i;
If the partition an incoming row belongs to does not exist, the insert fails with an error. Pre-creating a week of partitions gives you a buffer to notice and fix problems before they affect writes.
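The create_partition helper above is our own function, not a built-in. A minimal PL/pgSQL sketch of what it does (many teams use the pg_partman extension instead of rolling their own):

```sql
CREATE OR REPLACE FUNCTION create_partition(
    parent text, from_ts timestamptz, to_ts timestamptz
) RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    -- IF NOT EXISTS makes the cron job safe to re-run.
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I
             FOR VALUES FROM (%L) TO (%L)',
        parent || '_' || to_char(from_ts, 'YYYYMMDD'),
        parent, from_ts, to_ts
    );
END;
$$;
```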
Index Strategy
Every partition gets its own indexes. This is both a blessing and a curse. The blessing: each index build is smaller and faster. The curse: indexes are defined on the partitioned parent, and adding a new one means building it on every existing partition. Note that CREATE INDEX CONCURRENTLY does not work directly on a partitioned table; to avoid blocking writes, build the index concurrently on each partition and attach the pieces with ALTER INDEX ... ATTACH PARTITION.
We keep indexes minimal:
- Primary key on (occurred_at, id): the partition key must be part of the primary key
- A composite index on (campaign_id, occurred_at) for the most common query pattern
- No other indexes unless a specific query demands it
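Both can be defined once on the parent table, and PostgreSQL creates the matching index on every existing and future partition (the index name below is our own choice):

```sql
ALTER TABLE events ADD PRIMARY KEY (occurred_at, id);

CREATE INDEX events_campaign_time_idx
    ON events (campaign_id, occurred_at);
```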
Vacuum Considerations
Partitioning dramatically improves vacuum performance. Instead of vacuuming one massive table, autovacuum works on individual partitions. Recent partitions (actively being written to) get vacuumed frequently. Old partitions rarely need vacuuming since they are read-only.
We tune autovacuum per-partition for the “hot” partition:
ALTER TABLE events_20260301 SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_analyze_scale_factor = 0.005
);
Query Planning
The query planner needs help to efficiently prune partitions. Always include the partition key in your WHERE clause. A query like SELECT * FROM events WHERE campaign_id = 123 will scan every partition. Add a time range and the planner skips irrelevant partitions entirely:
SELECT * FROM events
WHERE campaign_id = 123
AND occurred_at >= '2026-03-01'
AND occurred_at < '2026-03-02';
The difference can be orders of magnitude in query time.
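You can confirm pruning is happening by checking the plan; with a tight time range, only the matching partition should appear:

```sql
EXPLAIN (COSTS OFF)
SELECT * FROM events
WHERE campaign_id = 123
  AND occurred_at >= '2026-03-01'
  AND occurred_at < '2026-03-02';
-- Expect a scan of events_20260301 only; the other
-- partitions are pruned at plan time.
```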
Results
After migrating to daily partitions:
- Write throughput increased by 15% (smaller indexes to update per insert)
- Vacuum time dropped from hours to minutes per partition
- Query latency for time-bounded queries improved 10-50x depending on the time range
- Data retention became trivial — dropping a month of data takes seconds, not hours
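Retention itself is a pair of metadata operations. A sketch of how we retire an expired day (DETACH ... CONCURRENTLY requires PostgreSQL 14+ and cannot run inside a transaction block):

```sql
-- Detach first so dropping does not take a heavy lock
-- on the parent table, then drop the orphaned table.
ALTER TABLE events DETACH PARTITION events_20260201 CONCURRENTLY;
DROP TABLE events_20260201;
```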
Partitioning is not a silver bullet, but for high-volume time-series workloads in PostgreSQL, it is one of the highest-impact optimizations available.