Trigger-Based Materialized Tables

Overview

Sable uses trigger-based materialized tables to achieve millisecond-scale query performance on complex analytical views. Instead of recalculating millions of rows on every query, we pre-compute results and cache them in indexed tables that refresh themselves automatically when source data changes.

The Problem

Views like gold.v_dietz_daily were taking 117+ seconds per query. Why?

User queries v_dietz_daily


┌─────────────────────────────────────────┐
│ v_dietz_daily                           │
│   └── v_capital_daily                   │
│        └── v_nav_daily                  │
│             └── v_pnl_daily             │
│                  └── pnl_daily_v2       │ ← 1M+ rows scanned
└─────────────────────────────────────────┘


117 seconds later... results

Each query recalculates the entire view hierarchy from scratch. No caching, no indexes on intermediate results.
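You can confirm this yourself with EXPLAIN ANALYZE against the view (a diagnostic sketch; the filter values are illustrative):

```sql
-- Illustrative diagnostic: inspect the plan behind the slow view.
-- The entity_id and date values are placeholders.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM gold.v_dietz_daily
WHERE entity_id = 1
  AND pnl_date >= '2025-01-01';
-- Look for Seq Scan nodes over sable.pnl_daily_v2 in the output:
-- with no indexed intermediate results, every query pays the full cost.
```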

The Solution

Cache the view's output in a regular table. Refresh only what changed.

User queries gold.dietz_daily (TABLE)


┌─────────────────────────────────────────┐
│ Indexed table with 3,310 pre-computed   │
│ rows ready to go                        │
└─────────────────────────────────────────┘


5ms later... results

How It Works: A Concrete Example

Scenario: Daily upload of 1,000 P&L rows for January 15, 2026.

Step 1: Data Arrives

INSERT INTO sable.pnl_daily_v2 (entity_id, pnl_date, daily_pnl, ...)
VALUES
    (1, '2026-01-15', 50000, ...),
    (1, '2026-01-15', -3000, ...),
    (2, '2026-01-15', 12000, ...),
    -- ... 997 more rows

Step 2: Trigger Fires

The AFTER INSERT trigger fires automatically. Through the transition table declared with REFERENCING NEW TABLE AS new_rows (available since PostgreSQL 10), it has access to ALL inserted rows at once:

-- Trigger can see all 1,000 rows at once (not row-by-row)
SELECT * FROM new_rows; -- All 1,000 rows available here

Step 3: Scoped Refresh (The Key Insight)

The trigger doesn't recalculate everything. It asks: "What's the earliest date in this upload?"

SELECT MIN(pnl_date), ARRAY_AGG(DISTINCT entity_id)
  INTO v_min_date, v_entity_ids
  FROM new_rows;

-- Result: v_min_date = '2026-01-15', v_entity_ids = [1, 2]

Then it only refreshes from that date forward for those entities:

-- Delete stale rows (only Jan 15+ for entities 1 and 2)
DELETE FROM gold.dietz_daily
WHERE entity_id = ANY(ARRAY[1, 2])
  AND pnl_date >= '2026-01-15';
-- Deletes maybe 2 rows (today's data if it existed)

-- Recompute from the view (scoped query)
INSERT INTO gold.dietz_daily
SELECT * FROM gold.v_dietz_daily  -- View still exists! We use it here.
WHERE entity_id = ANY(ARRAY[1, 2])
  AND pnl_date >= '2026-01-15';
-- Inserts 2 fresh rows

Result: instead of recalculating all 3,310 cached rows, we touched only 2. The view query still runs, but scoped to two entities and one date forward it finishes in ~2 seconds instead of 117.

Step 4: Trigger Chain

Multiple triggers fire in sequence, each depending on the previous:

INSERT INTO pnl_daily_v2 (1,000 rows for Jan 15)


┌─────────────────────────────────────────────────────┐
│ 1. tr_bake_daily_pnl_v2                             │
│    Computes daily_pnl column on inserted rows       │
│                         │                           │
│                         ▼                           │
│ 2. tr_refresh_nav_v2                                │
│    Updates nav_daily_v2 for Jan 15+                 │
│                         │                           │
│                         ▼                           │
│ 3. tr_refresh_capital_v2                            │
│    Updates capital_daily_v2 for Jan 15+             │
│    (needs NAV from step 2)                          │
│                         │                           │
│                         ▼                           │
│ 4. tr_refresh_dietz_daily                           │
│    Updates gold.dietz_daily for Jan 15+             │
│    (needs capital from step 3)                      │
│                         │                           │
│                         ▼                           │
│ 5. tr_refresh_returns_comparison                    │
│    Updates gold.returns_comparison_entity           │
│    (needs dietz from step 4)                        │
└─────────────────────────────────────────────────────┘


All tables current. Total time: ~5 seconds.

Order matters! Each trigger depends on the previous one's output. If you compute returns before NAV is updated, you get stale data.
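The section above doesn't show how the ordering is enforced; one plausible wiring (a sketch, assuming each refresh function writes into the next table in the chain, whose own trigger then continues it) is a cascade of statement-level triggers. The function name sable.fn_refresh_capital_v2 below is a hypothetical example:

```sql
-- Sketch: chaining via cascading statement-level triggers.
-- Step 2's refresh inserts into nav_daily_v2; that INSERT fires
-- this trigger, which runs step 3, and so on down the chain.
CREATE TRIGGER tr_refresh_capital_v2
AFTER INSERT ON sable.nav_daily_v2              -- fires when step 2 writes NAV rows
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT
EXECUTE FUNCTION sable.fn_refresh_capital_v2(); -- hypothetical step-3 refresh function
```

Because each link fires only when the previous one actually inserts rows, a cascade like this makes it structurally impossible to compute returns from stale NAV data.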

The Code

Trigger Function

CREATE OR REPLACE FUNCTION gold.fn_refresh_dietz_daily()
RETURNS TRIGGER AS $func$
DECLARE
    v_min_date   DATE;
    v_entity_ids BIGINT[];
BEGIN
    -- 1. Get scope from the uploaded batch
    SELECT MIN(pnl_date), ARRAY_AGG(DISTINCT entity_id)
      INTO v_min_date, v_entity_ids
      FROM new_rows
     WHERE entity_id IS NOT NULL;

    -- 2. Early exit if nothing to do
    IF v_min_date IS NULL THEN
        RETURN NULL;
    END IF;

    -- 3. Delete stale rows (scoped)
    DELETE FROM gold.dietz_daily
     WHERE entity_id = ANY(v_entity_ids)
       AND pnl_date >= v_min_date;

    -- 4. Recompute from view (scoped)
    INSERT INTO gold.dietz_daily (
        entity_id, entity_name, pnl_date, daily_return_pct,
        daily_pnl, capital_amount, end_nav, btig_daily_ror,
        daily_diff_bps, return_factor
    )
    SELECT
        entity_id, entity_name, pnl_date, daily_return_pct,
        daily_pnl, capital_amount, end_nav, btig_daily_ror,
        daily_diff_bps, return_factor
      FROM gold.v_dietz_daily
     WHERE entity_id = ANY(v_entity_ids)
       AND pnl_date >= v_min_date;

    RETURN NULL;
END;
$func$ LANGUAGE plpgsql;

Statement-Level Trigger

CREATE TRIGGER tr_refresh_dietz_daily
AFTER INSERT ON sable.pnl_daily_v2
REFERENCING NEW TABLE AS new_rows   -- Key: batch access to all inserted rows
FOR EACH STATEMENT                  -- Fires once per INSERT statement, not per row
EXECUTE FUNCTION gold.fn_refresh_dietz_daily();

Why statement-level? If you upload 1,000 rows:

  • Row-level trigger: fires 1,000 times (slow)
  • Statement-level trigger: fires once with access to all 1,000 rows (fast)
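For contrast, a row-level version of the same trigger would look like this (shown only to illustrate the difference; the trigger and function names are hypothetical, and the function would see one row at a time as NEW instead of the new_rows batch):

```sql
-- Anti-pattern for batch uploads (for contrast only).
-- A row-level trigger fires once per inserted row, so a 1,000-row
-- upload invokes the function 1,000 times.
CREATE TRIGGER tr_refresh_dietz_daily_rowwise        -- hypothetical name
AFTER INSERT ON sable.pnl_daily_v2
FOR EACH ROW                                         -- 1,000 rows => 1,000 firings
EXECUTE FUNCTION gold.fn_refresh_dietz_daily_row();  -- hypothetical row-wise variant
```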

Performance Results

Query                                    Before (View)   After (Table)   Speedup
dietz_daily (single entity, 12mo)        4,479 ms        4.88 ms         917x
returns_comparison_entity (full scan)    ~117,000 ms     0.17 ms         688,000x
returns_comparison_entity (status agg)   ~117,000 ms     0.11 ms         1,000,000x

What Happens on Failure?

PostgreSQL triggers run inside the same transaction as the INSERT:

  • If trigger fails: Entire INSERT rolls back. No partial state.
  • If INSERT fails: Triggers never fire. Tables unchanged.
  • Concurrent uploads: Each gets its own transaction. PostgreSQL handles locking.

This is safer than a manual REFRESH MATERIALIZED VIEW: if a scheduled refresh fails or is forgotten, the materialized view silently keeps serving stale data, whereas here the source INSERT and the cache refresh succeed or fail as one unit.
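A minimal sketch of that atomicity (the column list is abbreviated from the earlier example):

```sql
-- Atomicity sketch: the trigger chain runs inside the INSERT's transaction.
BEGIN;
INSERT INTO sable.pnl_daily_v2 (entity_id, pnl_date, daily_pnl)
VALUES (1, '2026-01-15', 50000);
-- If any trigger in the chain raises an error here, the INSERT and
-- every downstream cache refresh roll back together.
COMMIT;
-- After COMMIT: source rows and all cached tables are consistent.
-- After a failure: none of them changed.
```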

The View Still Exists

Important: We didn't delete gold.v_dietz_daily. The view still exists and contains the calculation logic. The trigger USES the view to compute fresh data:

INSERT INTO gold.dietz_daily
SELECT * FROM gold.v_dietz_daily  -- View does the math
WHERE ...                         -- We just scope it

Benefits:

  • Single source of truth for calculation logic (the view)
  • Table is just a cache
  • If logic changes, update the view → next upload refreshes the cache
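If you change the view and don't want to wait for the next upload, a one-off full rebuild is straightforward (a sketch; DELETE rather than TRUNCATE is assumed here so that under MVCC concurrent readers keep seeing the old rows until COMMIT):

```sql
-- Sketch: one-off full cache rebuild after the view's logic changes.
BEGIN;
DELETE FROM gold.dietz_daily;
INSERT INTO gold.dietz_daily
SELECT * FROM gold.v_dietz_daily;   -- unscoped: recompute every row once
COMMIT;
```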

When to Use This Pattern

Good fit:

  • Views taking > 1 second
  • Data changes in batches (daily uploads)
  • Predictable query patterns
  • Can identify "what changed" for scoped refresh

Bad fit:

  • Frequent small updates (use regular indexes instead)
  • Can't determine scope (need full refresh anyway)
  • Simple queries that don't need caching

Comparison with MATERIALIZED VIEW

Feature        MATERIALIZED VIEW                   Trigger-Based Table
Refresh        Manual: REFRESH MATERIALIZED VIEW   Automatic on INSERT
Scope          Full table every time               Only affected rows
Blocking       Locks table during refresh          No blocking
Complexity     Simple (1 command)                  More code (trigger + function)
Failure mode   Can leave stale data                Transaction rollback (atomic)
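For reference, the built-in alternative would look like this (gold.mv_dietz_daily is a hypothetical name; note that REFRESH ... CONCURRENTLY avoids blocking readers but requires a unique index, and it still recomputes the entire view every time):

```sql
-- For comparison: the built-in alternative we didn't choose.
CREATE MATERIALIZED VIEW gold.mv_dietz_daily AS
SELECT * FROM gold.v_dietz_daily;

-- CONCURRENTLY needs a unique index covering all rows.
CREATE UNIQUE INDEX ON gold.mv_dietz_daily (entity_id, pnl_date);

-- Must be run manually or from a scheduler; always a full recompute.
REFRESH MATERIALIZED VIEW CONCURRENTLY gold.mv_dietz_daily;
```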

We chose trigger-based tables because:

  1. Automatic - No cron job or manual refresh needed
  2. Scoped - 2 rows instead of 3,310 rows
  3. Non-blocking - Queries keep working during refresh
  4. Atomic - Fails cleanly if something goes wrong

Tables

gold.dietz_daily

CREATE TABLE gold.dietz_daily (
    entity_id        BIGINT,
    entity_name      TEXT,
    pnl_date         DATE,
    daily_return_pct NUMERIC,
    daily_pnl        NUMERIC,
    capital_amount   NUMERIC,
    end_nav          NUMERIC,
    btig_daily_ror   NUMERIC,
    daily_diff_bps   NUMERIC,
    return_factor    NUMERIC,
    PRIMARY KEY (entity_id, pnl_date)
);

CREATE INDEX idx_dietz_daily_entity_name
ON gold.dietz_daily(entity_name, pnl_date);
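Typical lookups are served directly by the primary key or the entity_name index (a sketch; the filter values are illustrative):

```sql
-- Single-entity, date-range lookup: served by the
-- (entity_id, pnl_date) primary key.
SELECT pnl_date, daily_return_pct, daily_pnl
FROM gold.dietz_daily
WHERE entity_id = 1
  AND pnl_date BETWEEN '2025-02-01' AND '2026-01-31'
ORDER BY pnl_date;

-- Name-based lookup: served by idx_dietz_daily_entity_name.
SELECT pnl_date, daily_return_pct
FROM gold.dietz_daily
WHERE entity_name = 'Example Entity'   -- illustrative name
ORDER BY pnl_date;
```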

gold.returns_comparison_entity

CREATE TABLE gold.returns_comparison_entity (
    entity_id             BIGINT,
    entity_name           TEXT,
    month_end_date        DATE,
    sable_twr_pct         NUMERIC,
    btig_twr_pct          NUMERIC,
    diff_bps              NUMERIC,
    status                TEXT,
    sable_trading_days    INT,
    btig_trading_days     INT,
    sable_avg_capital     NUMERIC,
    btig_avg_market_value NUMERIC,
    sable_total_pnl       NUMERIC,
    btig_total_cashflow   NUMERIC,
    PRIMARY KEY (entity_id, month_end_date)
);