Trigger-Based Materialized Tables

Overview

Sable uses trigger-based materialized tables to achieve millisecond-scale query performance on complex analytical views. Instead of recalculating millions of rows on every query, we pre-compute results and cache them in indexed tables that refresh themselves automatically when source data changes.

The Problem

Views like gold.v_dietz_daily were taking 117+ seconds per query. Why?

User queries v_dietz_daily


┌─────────────────────────────────────────┐
│ v_dietz_daily                           │
│   └── v_capital_daily                   │
│        └── v_nav_daily                  │
│             └── v_pnl_daily             │
│                  └── pnl_daily_v2       │ ← 1M+ rows scanned
└─────────────────────────────────────────┘


117 seconds later... results

Each query recalculates the entire view hierarchy from scratch. No caching, no indexes on intermediate results.
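You can confirm this yourself with EXPLAIN ANALYZE against the view (a diagnostic sketch; the filter values are illustrative):

```sql
-- Illustrative diagnostic: inspect the plan behind the slow view.
-- The entity_id and date values are placeholders.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM gold.v_dietz_daily
WHERE entity_id = 1
  AND pnl_date >= '2025-01-01';
-- Look for Seq Scan nodes over sable.pnl_daily_v2 in the output:
-- with no indexed intermediate results, every query pays the full cost.
```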

The Solution

Cache the view's output in a regular table. Refresh only what changed.

User queries gold.dietz_daily (TABLE)


┌─────────────────────────────────────────┐
│ Indexed table with 3,310 pre-computed   │
│ rows ready to go                        │
└─────────────────────────────────────────┘


5ms later... results

How It Works: A Concrete Example

Scenario: Daily upload of 1,000 P&L rows for January 15, 2026.

Step 1: Data Arrives

INSERT INTO sable.pnl_daily_v2 (entity_id, pnl_date, daily_pnl, ...)
VALUES
    (1, '2026-01-15', 50000, ...),
    (1, '2026-01-15', -3000, ...),
    (2, '2026-01-15', 12000, ...),
    -- ... 997 more rows

Step 2: Trigger Fires

The AFTER INSERT trigger fires automatically. Through the transition table declared with REFERENCING NEW TABLE AS new_rows (available since PostgreSQL 10), it has access to ALL inserted rows at once:

-- Trigger can see all 1,000 rows at once (not row-by-row)
SELECT * FROM new_rows; -- All 1,000 rows available here

Step 3: Scoped Refresh (The Key Insight)

The trigger doesn't recalculate everything. It asks: "What's the earliest date in this upload?"

SELECT MIN(pnl_date), ARRAY_AGG(DISTINCT entity_id)
  INTO v_min_date, v_entity_ids
  FROM new_rows;

-- Result: v_min_date = '2026-01-15', v_entity_ids = [1, 2]

Then it only refreshes from that date forward for those entities:

-- Delete stale rows (only Jan 15+ for entities 1 and 2)
DELETE FROM gold.dietz_daily
WHERE entity_id = ANY(ARRAY[1, 2])
  AND pnl_date >= '2026-01-15';
-- Deletes maybe 2 rows (today's data if it existed)

-- Recompute from the view (scoped query)
INSERT INTO gold.dietz_daily
SELECT * FROM gold.v_dietz_daily  -- View still exists! We use it here.
WHERE entity_id = ANY(ARRAY[1, 2])
  AND pnl_date >= '2026-01-15';
-- Inserts 2 fresh rows

Result: instead of recalculating all 3,310 cached rows, we touched only 2. The view query still runs, but scoped to two entities and one date forward it finishes in ~2 seconds instead of 117.

Step 4: Trigger Chain

Multiple triggers fire in sequence, each depending on the previous:

INSERT INTO pnl_daily_v2 (1,000 rows for Jan 15)


┌─────────────────────────────────────────────────────┐
│ 1. tr_bake_daily_pnl_v2                             │
│    Computes daily_pnl column on inserted rows       │
│                         │                           │
│                         ▼                           │
│ 2. tr_refresh_nav_v2                                │
│    Updates nav_daily_v2 for Jan 15+                 │
│                         │                           │
│                         ▼                           │
│ 3. tr_refresh_capital_v2                            │
│    Updates capital_daily_v2 for Jan 15+             │
│    (needs NAV from step 2)                          │
│                         │                           │
│                         ▼                           │
│ 4. tr_refresh_dietz_daily                           │
│    Updates gold.dietz_daily for Jan 15+             │
│    (needs capital from step 3)                      │
│                         │                           │
│                         ▼                           │
│ 5. tr_refresh_returns_comparison                    │
│    Updates gold.returns_comparison_entity           │
│    (needs dietz from step 4)                        │
└─────────────────────────────────────────────────────┘


All tables current. Total time: ~5 seconds.

Order matters! Each trigger depends on the previous one's output. If you compute returns before NAV is updated, you get stale data.
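The section above doesn't show how the ordering is enforced; one plausible wiring (a sketch, assuming each refresh function writes into the next table in the chain, whose own trigger then continues it) is a cascade of statement-level triggers. The function name sable.fn_refresh_capital_v2 below is a hypothetical example:

```sql
-- Sketch: chaining via cascading statement-level triggers.
-- Step 2's refresh inserts into nav_daily_v2; that INSERT fires
-- this trigger, which runs step 3, and so on down the chain.
CREATE TRIGGER tr_refresh_capital_v2
AFTER INSERT ON sable.nav_daily_v2              -- fires when step 2 writes NAV rows
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT
EXECUTE FUNCTION sable.fn_refresh_capital_v2(); -- hypothetical step-3 refresh function
```

Because each link fires only when the previous one actually inserts rows, a cascade like this makes it structurally impossible to compute returns from stale NAV data.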

The Code

Trigger Function

CREATE OR REPLACE FUNCTION gold.fn_refresh_dietz_daily()
RETURNS TRIGGER AS $func$
DECLARE
    v_min_date   DATE;
    v_entity_ids BIGINT[];
BEGIN
    -- 1. Get scope from the uploaded batch
    SELECT MIN(pnl_date), ARRAY_AGG(DISTINCT entity_id)
      INTO v_min_date, v_entity_ids
      FROM new_rows
     WHERE entity_id IS NOT NULL;

    -- 2. Early exit if nothing to do
    IF v_min_date IS NULL THEN
        RETURN NULL;
    END IF;

    -- 3. Delete stale rows (scoped)
    DELETE FROM gold.dietz_daily
     WHERE entity_id = ANY(v_entity_ids)
       AND pnl_date >= v_min_date;

    -- 4. Recompute from view (scoped)
    INSERT INTO gold.dietz_daily (
        entity_id, entity_name, pnl_date, daily_return_pct,
        daily_pnl, capital_amount, end_nav, btig_daily_ror,
        daily_diff_bps, return_factor
    )
    SELECT
        entity_id, entity_name, pnl_date, daily_return_pct,
        daily_pnl, capital_amount, end_nav, btig_daily_ror,
        daily_diff_bps, return_factor
      FROM gold.v_dietz_daily
     WHERE entity_id = ANY(v_entity_ids)
       AND pnl_date >= v_min_date;

    RETURN NULL;
END;
$func$ LANGUAGE plpgsql;

Statement-Level Trigger

CREATE TRIGGER tr_refresh_dietz_daily
AFTER INSERT ON sable.pnl_daily_v2
REFERENCING NEW TABLE AS new_rows   -- Key: batch access to all inserted rows
FOR EACH STATEMENT                  -- Fires once per INSERT statement, not per row
EXECUTE FUNCTION gold.fn_refresh_dietz_daily();

Why statement-level? If you upload 1,000 rows:

  • Row-level trigger: fires 1,000 times (slow)
  • Statement-level trigger: fires once with access to all 1,000 rows (fast)
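For contrast, a row-level version of the same trigger would look like this (shown only to illustrate the difference; the trigger and function names are hypothetical, and the function would see one row at a time as NEW instead of the new_rows batch):

```sql
-- Anti-pattern for batch uploads (for contrast only).
-- A row-level trigger fires once per inserted row, so a 1,000-row
-- upload invokes the function 1,000 times.
CREATE TRIGGER tr_refresh_dietz_daily_rowwise        -- hypothetical name
AFTER INSERT ON sable.pnl_daily_v2
FOR EACH ROW                                         -- 1,000 rows => 1,000 firings
EXECUTE FUNCTION gold.fn_refresh_dietz_daily_row();  -- hypothetical row-wise variant
```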

Performance Results

Query                                    Before (View)   After (Table)   Speedup
dietz_daily (single entity, 12mo)        4,479 ms        4.88 ms         917x
returns_comparison_entity (full scan)    ~117,000 ms     0.17 ms         688,000x
returns_comparison_entity (status agg)   ~117,000 ms     0.11 ms         1,000,000x

What Happens on Failure?

PostgreSQL triggers run inside the same transaction as the INSERT:

  • If trigger fails: Entire INSERT rolls back. No partial state.
  • If INSERT fails: Triggers never fire. Tables unchanged.
  • Concurrent uploads: Each gets its own transaction. PostgreSQL handles locking.

This is safer than a manual REFRESH MATERIALIZED VIEW: if a scheduled refresh fails or is forgotten, the materialized view silently keeps serving stale data, whereas here the source INSERT and the cache refresh succeed or fail as one unit.
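A minimal sketch of that atomicity (the column list is abbreviated from the earlier example):

```sql
-- Atomicity sketch: the trigger chain runs inside the INSERT's transaction.
BEGIN;
INSERT INTO sable.pnl_daily_v2 (entity_id, pnl_date, daily_pnl)
VALUES (1, '2026-01-15', 50000);
-- If any trigger in the chain raises an error here, the INSERT and
-- every downstream cache refresh roll back together.
COMMIT;
-- After COMMIT: source rows and all cached tables are consistent.
-- After a failure: none of them changed.
```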

The View Still Exists

Important: We didn't delete gold.v_dietz_daily. The view still exists and contains the calculation logic. The trigger USES the view to compute fresh data:

INSERT INTO gold.dietz_daily
SELECT * FROM gold.v_dietz_daily  -- View does the math
WHERE ...                         -- We just scope it

Benefits:

  • Single source of truth for calculation logic (the view)
  • Table is just a cache
  • If logic changes, update the view → next upload refreshes the cache
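If you change the view and don't want to wait for the next upload, a one-off full rebuild is straightforward (a sketch; DELETE rather than TRUNCATE is assumed here so that under MVCC concurrent readers keep seeing the old rows until COMMIT):

```sql
-- Sketch: one-off full cache rebuild after the view's logic changes.
BEGIN;
DELETE FROM gold.dietz_daily;
INSERT INTO gold.dietz_daily
SELECT * FROM gold.v_dietz_daily;   -- unscoped: recompute every row once
COMMIT;
```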

When to Use This Pattern

Good fit:

  • Views taking > 1 second
  • Data changes in batches (daily uploads)
  • Predictable query patterns
  • Can identify "what changed" for scoped refresh

Bad fit:

  • Frequent small updates (use regular indexes instead)
  • Can't determine scope (need full refresh anyway)
  • Simple queries that don't need caching

Comparison with MATERIALIZED VIEW

Feature        MATERIALIZED VIEW                   Trigger-Based Table
Refresh        Manual: REFRESH MATERIALIZED VIEW   Automatic on INSERT
Scope          Full table every time               Only affected rows
Blocking       Locks table during refresh          No blocking
Complexity     Simple (1 command)                  More code (trigger + function)
Failure mode   Can leave stale data                Transaction rollback (atomic)
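For reference, the built-in alternative would look like this (gold.mv_dietz_daily is a hypothetical name; note that REFRESH ... CONCURRENTLY avoids blocking readers but requires a unique index, and it still recomputes the entire view every time):

```sql
-- For comparison: the built-in alternative we didn't choose.
CREATE MATERIALIZED VIEW gold.mv_dietz_daily AS
SELECT * FROM gold.v_dietz_daily;

-- CONCURRENTLY needs a unique index covering all rows.
CREATE UNIQUE INDEX ON gold.mv_dietz_daily (entity_id, pnl_date);

-- Must be run manually or from a scheduler; always a full recompute.
REFRESH MATERIALIZED VIEW CONCURRENTLY gold.mv_dietz_daily;
```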

We chose trigger-based tables because:

  1. Automatic - No cron job or manual refresh needed
  2. Scoped - 2 rows instead of 3,310 rows
  3. Non-blocking - Queries keep working during refresh
  4. Atomic - Fails cleanly if something goes wrong

Tables

gold.dietz_daily

CREATE TABLE gold.dietz_daily (
    entity_id        BIGINT,
    entity_name      TEXT,
    pnl_date         DATE,
    daily_return_pct NUMERIC,
    daily_pnl        NUMERIC,
    capital_amount   NUMERIC,
    end_nav          NUMERIC,
    btig_daily_ror   NUMERIC,
    daily_diff_bps   NUMERIC,
    return_factor    NUMERIC,
    PRIMARY KEY (entity_id, pnl_date)
);

CREATE INDEX idx_dietz_daily_entity_name
ON gold.dietz_daily(entity_name, pnl_date);
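Typical lookups are served directly by the primary key or the entity_name index (a sketch; the filter values are illustrative):

```sql
-- Single-entity, date-range lookup: served by the
-- (entity_id, pnl_date) primary key.
SELECT pnl_date, daily_return_pct, daily_pnl
FROM gold.dietz_daily
WHERE entity_id = 1
  AND pnl_date BETWEEN '2025-02-01' AND '2026-01-31'
ORDER BY pnl_date;

-- Name-based lookup: served by idx_dietz_daily_entity_name.
SELECT pnl_date, daily_return_pct
FROM gold.dietz_daily
WHERE entity_name = 'Example Entity'   -- illustrative name
ORDER BY pnl_date;
```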

gold.returns_comparison_entity

CREATE TABLE gold.returns_comparison_entity (
    entity_id             BIGINT,
    entity_name           TEXT,
    month_end_date        DATE,
    sable_twr_pct         NUMERIC,
    btig_twr_pct          NUMERIC,
    diff_bps              NUMERIC,
    status                TEXT,
    sable_trading_days    INT,
    btig_trading_days     INT,
    sable_avg_capital     NUMERIC,
    btig_avg_market_value NUMERIC,
    sable_total_pnl       NUMERIC,
    btig_total_cashflow   NUMERIC,
    PRIMARY KEY (entity_id, month_end_date)
);