Skip to main content

SQL Best Practices

Patterns and anti-patterns discovered while optimizing Sable queries.

Good Patterns

1. CTE Column Aliasing

PL/pgSQL functions with RETURNS TABLE create implicit variables. Always alias CTE columns to avoid conflicts:

-- BAD: Ambiguous column names
CREATE FUNCTION example()
RETURNS TABLE (entity_id BIGINT, value NUMERIC)
AS $$
BEGIN
RETURN QUERY
WITH cte AS (
SELECT entity_id, value FROM source_table -- AMBIGUOUS!
)
SELECT entity_id, value FROM cte;
END;
$$;

-- GOOD: Prefixed aliases
CREATE FUNCTION example()
RETURNS TABLE (entity_id BIGINT, value NUMERIC)
AS $$
BEGIN
RETURN QUERY
WITH cte AS (
SELECT
t.entity_id AS cte_entity_id,
t.value AS cte_value
FROM source_table t
)
SELECT cte_entity_id, cte_value FROM cte;
END;
$$;

2. When to Use What

StructureUse WhenCreation Method
ViewSimple transforms, no filter params neededdbt model
Table Function1-5 filter dimensionsdbt macro
Materialized ViewHeavy computation, staleness OKdbt model
TableAnalytics datadbt model

3. Qualify All Columns

-- BAD: Unqualified
SELECT entity_id FROM some_table;

-- GOOD: Always qualify
SELECT t.entity_id FROM some_table t;

Anti-Patterns

1. Views That Scan Everything

-- BAD: View scans everything, then you filter
CREATE VIEW v_daily_returns AS
SELECT e.entity_id, p.report_date, p.daily_return
FROM entities e
JOIN pnl p ON e.entity_id = p.entity_id;

SELECT * FROM v_daily_returns WHERE entity_id = 123; -- SLOW

Solution: Convert to parameterized function.

2. CROSS JOIN with generate_series

-- BAD: Creates cartesian product
WITH date_spine AS (
SELECT generate_series('2020-01-01'::date, CURRENT_DATE, '1 day')
)
SELECT * FROM entities CROSS JOIN date_spine; -- Explosion!

Solution: Use LATERAL joins or filter the date range in parameters.

3. SELECT * in Production

-- BAD: Unknown columns, order, breaking changes
SELECT * FROM gold.entity_pnl;

-- GOOD: Explicit columns
SELECT entity_id, report_date, daily_return_bps FROM gold.entity_pnl;

Diagnosing Slow Queries

EXPLAIN ANALYZE

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM gold.v_daily_returns WHERE entity_id = 123 LIMIT 1;

What to Look For

SymptomProblemSolution
Seq Scan on large tableMissing index or no filter pushdownAdd index or use function
Nested Loop with high rowsCartesian productCheck JOIN conditions
High buffersScanning too much dataFilter earlier
Sort with high memorySorting large resultAdd index for ORDER BY

Quick Reference

ProblemSolution
View is slowConvert to parameterized function
Column ambiguous errorAlias all CTE columns with prefix
Full table scanAdd WHERE clause or use function
Query plan shows Seq ScanCheck indexes, use filter-first