Skip to main content

Filter-First Pattern

The filter-first pattern is the core optimization strategy used in Sable to achieve dramatic query speedups (up to 66,000x).

The Problem

PostgreSQL views don't push down filters efficiently through complex CTEs and JOINs:

-- SLOW: View with post-filter (scans ALL data first)
SELECT * FROM gold.v_nav_daily WHERE entity_id = 1; -- 10+ seconds

The view scans the entire table, performs all joins, then filters. This is backwards.

The Solution

Create functions that accept filter parameters and apply them FIRST, before any joins or aggregations:

-- FAST: Filter-first function (filters before aggregation)
SELECT * FROM gold.f_get_nav_daily(org_id, entity_id, start_date, end_date); -- 60ms

Performance Results

QueryBeforeAfterSpeedup
pnl_daily_filtered9.8s0.15ms66,000x
f_get_nav_daily10s60ms165x
f_get_capital_daily10s70ms143x
f_get_return_outliers32.7s545ms60x

Function Template

CREATE OR REPLACE FUNCTION gold.f_get_<entity>_data(
p_org_id UUID,
p_entity_id BIGINT,
p_start_date DATE,
p_end_date DATE
)
RETURNS TABLE (...)
LANGUAGE plpgsql STABLE
AS $$
BEGIN
RETURN QUERY
SELECT ...
FROM gold.source_table p
WHERE p.org_id = p_org_id -- Filter FIRST
AND p.entity_id = p_entity_id -- Filter FIRST
AND p.pnl_date BETWEEN p_start_date AND p_end_date
AND p.deleted_at IS NULL -- Required for partial index
...
END;
$$;

Key Principles

1. Filter Early, Join Late

-- BAD: Join everything, then filter
WITH all_data AS (
SELECT * FROM table_a
JOIN table_b ON ...
JOIN table_c ON ...
)
SELECT * FROM all_data WHERE entity_id = 1;

-- GOOD: Filter first, then join
WITH filtered AS (
SELECT * FROM table_a WHERE entity_id = 1
)
SELECT * FROM filtered
JOIN table_b ON ...
JOIN table_c ON ...;

2. Avoid View Chains

-- BAD: View chain where filters don't propagate
v_capital_daily → v_nav_daily → v_pnl_daily → pnl_daily

-- GOOD: Direct function calls with filters
f_get_capital_daily(org, entity)
→ f_get_nav_daily(org, entity)
→ pnl_daily (with index)

3. Use Partial Indexes

Queries MUST include deleted_at IS NULL to use partial indexes:

CREATE INDEX idx_pnl_daily_entity_org_date
ON gold.pnl_daily
USING btree (entity_id, org_id, pnl_date DESC)
WHERE deleted_at IS NULL;

-- Query MUST include the partial index condition
SELECT * FROM gold.pnl_daily
WHERE entity_id = 1
AND deleted_at IS NULL; -- REQUIRED!

Fast Functions Available

FunctionPurposeSpeedup
gold.f_get_nav_daily(org_id, entity_id, start_date, end_date)Direct NAV calculation165x
gold.f_get_capital_daily(org_id, entity_id, start_date, end_date)Capital with cash flows143x
gold.f_get_return_outliers(org_id, entity_id, threshold_bps, limit)Outlier detection60x

Benchmark API

Test query performance at:

GET https://sable-data.jettaintelligence.com/api/benchmark/

Performance thresholds:

  • Fast: < 50ms
  • Good: 50-200ms
  • Slow: 200-1000ms
  • Critical: > 1000ms