Advanced SQL Mastery: From Window Functions to Full-Text Search

A comprehensive guide to advanced SQL query optimization covering window functions, CTEs, execution plans, indexing strategies, transaction isolation, partitioning, JSON data, and full-text search. Master SQL performance with Table Plus, PostgreSQL CLI, Go + pgx, and DBeaver.

By Omar Flores

Master Query Optimization with Production Tools and Real-World Examples


🎯 Introduction: Why Most Developers Never Master SQL

Let me be direct: Most developers treat SQL as a necessary evil.

They can write basic queries. They know SELECT, JOIN, and WHERE. But they’ve never explored the powerful features that transform SQL from β€œa database query language” into β€œa tool for analyzing, transforming, and understanding data.”

This is a tragedy, because SQL mastery is what separates these three tiers of developers:

Tier 1 Developer: "I wrote a query that works"
Tier 2 Developer: "I wrote a query that's correct, efficient, and maintainable"
Tier 3 Developer: "I wrote a query that solves the problem better than the application layer could"

The difference? 8 techniques that most developers never learn.

What This Guide Covers

This is not β€œSQL 101.” You already know basic queries.

This is: 8 advanced SQL techniques that make you dangerous (in the best way), with practical tools, real benchmarks, and Go integration.

We will cover:

βœ… Window Functions - Ranking, running totals, time-series analysis
βœ… CTEs - Recursive queries, hierarchical data, readability
βœ… Execution Plans - Reading, understanding, and optimizing
βœ… Advanced Indexing - Beyond B-Trees, index strategies
βœ… Transaction Isolation - ACID properties, deadlocks, locking
βœ… Partitioning - Scaling large tables, time-series data
βœ… JSON Data - Structured and semi-structured queries
βœ… Full-Text Search - Beyond LIKE, relevance ranking

The tools you’ll use:

  • PostgreSQL CLI - Direct SQL execution, EXPLAIN ANALYZE
  • Table Plus - Visual query analysis, result exploration
  • DBeaver - Index inspection, transaction monitoring
  • Go + pgx - High-performance benchmarking, production code examples

πŸ“Š Part 1: Window Functions - The Gateway Drug to Advanced SQL

Window functions are the entry point to advanced SQL. Once you understand them, you’ll use them constantly.

What Window Functions Are

A window function is a function that operates on a set of rows (the β€œwindow”) while maintaining the row identity.

Compare:

-- Aggregate function (loses row identity)
SELECT COUNT(*) FROM orders;
-- Result: 1 row with count 42

-- Window function (keeps row identity)
SELECT id, amount, COUNT(*) OVER () as total_count FROM orders;
-- Result: 42 rows, each with total_count = 42

The difference is subtle but profound. With window functions, you can:

  • Add aggregate results to each row
  • Rank rows without losing information
  • Calculate moving averages
  • Detect trends
  • All in SQL, without application logic

The Three Categories of Window Functions

1. Ranking Functions

-- Using Table Plus to visualize:
SELECT
    user_id,
    order_amount,
    ROW_NUMBER() OVER (ORDER BY order_amount DESC) as rank,
    RANK() OVER (ORDER BY order_amount DESC) as rank_with_ties,
    DENSE_RANK() OVER (ORDER BY order_amount DESC) as dense_rank
FROM orders
ORDER BY order_amount DESC;

Real difference:

amount | ROW_NUMBER | RANK | DENSE_RANK
-------|-----------|------|----------
$500   |     1     |  1   |    1
$500   |     2     |  1   |    1      ← Same rank with RANK/DENSE_RANK
$400   |     3     |  3   |    2      ← RANK skips, DENSE_RANK doesn't
$300   |     4     |  4   |    3

When to use each:

  • ROW_NUMBER() - When you need unique row numbers (pagination - see the Go sketch below)
  • RANK() - When ties should share a rank and the following ranks are skipped (leaderboards)
  • DENSE_RANK() - When ties share a rank with no gaps afterwards (medal positions)
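
Because a window function can't appear directly in a WHERE clause, ROW_NUMBER() pagination wraps the ranked query in a subquery and filters on the row number. Here is a minimal Go sketch using database/sql and lib/pq (matching the integration examples later in this guide); the orders(id, amount) table and the pageByAmount helper are assumptions for illustration:

package main

import (
	"database/sql"

	_ "github.com/lib/pq"
)

// Order is a hypothetical row type for an orders(id, amount) table.
type Order struct {
	ID     int
	Amount float64
	Rank   int
}

// pageByAmount returns one page of orders ranked by amount.
// ROW_NUMBER() assigns a unique position to every row (ties broken by id),
// so pages never overlap even when several orders share the same amount.
func pageByAmount(db *sql.DB, pageSize, pageNum int) ([]Order, error) {
	query := `
		SELECT id, amount, rn
		FROM (
			SELECT id, amount,
			       ROW_NUMBER() OVER (ORDER BY amount DESC, id) AS rn
			FROM orders
		) ranked
		WHERE rn > $1 AND rn <= $2
		ORDER BY rn`

	low := (pageNum - 1) * pageSize
	rows, err := db.Query(query, low, low+pageSize)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var page []Order
	for rows.Next() {
		var o Order
		if err := rows.Scan(&o.ID, &o.Amount, &o.Rank); err != nil {
			return nil, err
		}
		page = append(page, o)
	}
	return page, rows.Err()
}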

2. Offset Functions

-- LAG: Access previous row
-- LEAD: Access next row
SELECT
    date,
    revenue,
    LAG(revenue) OVER (ORDER BY date) as previous_day_revenue,
    LEAD(revenue) OVER (ORDER BY date) as next_day_revenue,
    revenue - LAG(revenue) OVER (ORDER BY date) as day_over_day_change
FROM daily_revenue
ORDER BY date;

Why this matters: Time-series analysis becomes trivial.

date    | revenue | previous | next | change
--------|---------|----------|------|--------
2026-01 | $1000   | NULL     | $900 | NULL
2026-02 | $900    | $1000    | $950 | -$100
2026-03 | $950    | $900     | $800 | +$50
2026-04 | $800    | $950     | NULL | -$150

With LAG/LEAD, you can detect:

  • Trends (is revenue going up or down?)
  • Anomalies (sudden drops)
  • Momentum (acceleration)
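
One practical note when consuming LAG/LEAD results from application code: the first and last rows come back NULL, so scan them into nullable types. A minimal Go sketch with database/sql and lib/pq, assuming the daily_revenue table above (printDayOverDay is an illustrative helper):

package main

import (
	"database/sql"
	"fmt"
	"time"

	_ "github.com/lib/pq"
)

// printDayOverDay prints the day-over-day revenue change.
// LAG() returns NULL for the first row, so we scan into sql.NullFloat64.
func printDayOverDay(db *sql.DB) error {
	rows, err := db.Query(`
		SELECT date,
		       revenue,
		       revenue - LAG(revenue) OVER (ORDER BY date) AS change
		FROM daily_revenue
		ORDER BY date`)
	if err != nil {
		return err
	}
	defer rows.Close()

	for rows.Next() {
		var (
			day     time.Time
			revenue float64
			change  sql.NullFloat64 // NULL on the first row
		)
		if err := rows.Scan(&day, &revenue, &change); err != nil {
			return err
		}
		if change.Valid {
			fmt.Printf("%s  %.2f  (%+.2f)\n", day.Format("2006-01-02"), revenue, change.Float64)
		} else {
			fmt.Printf("%s  %.2f  (no previous day)\n", day.Format("2006-01-02"), revenue)
		}
	}
	return rows.Err()
}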

3. Aggregate Functions with OVER

-- Running total (cumulative sum)
SELECT
    date,
    revenue,
    SUM(revenue) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as cumulative_revenue,
    AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_avg_7day
FROM daily_revenue
ORDER BY date;

This is powerful because:

  • No subqueries needed
  • No temporary tables
  • No application-side computation
  • All in pure SQL

Partitioning: The Secret Weapon

Window functions become truly powerful when you add PARTITION BY:

-- Rank users by spending, WITHIN each country
SELECT
    country,
    user_id,
    total_spent,
    RANK() OVER (PARTITION BY country ORDER BY total_spent DESC) as rank_in_country
FROM users
ORDER BY country, rank_in_country;

Output:

country | user_id | spent | rank
--------|---------|-------|-----
US      | 123     | $5000 | 1
US      | 456     | $4000 | 2
US      | 789     | $3000 | 3
UK      | 321     | $6000 | 1      ← Different ranking per country
UK      | 654     | $4500 | 2

Real-World Example: Detecting Churn

-- Find users whose spending is declining
WITH user_spending_trends AS (
    SELECT
        user_id,
        DATE_TRUNC('month', order_date) as month,
        SUM(amount) as monthly_revenue,
        LAG(SUM(amount)) OVER (
            PARTITION BY user_id
            ORDER BY DATE_TRUNC('month', order_date)
        ) as previous_month_revenue
    FROM orders
    GROUP BY user_id, DATE_TRUNC('month', order_date)
)
SELECT
    user_id,
    month,
    monthly_revenue,
    previous_month_revenue,
    ROUND(
        (previous_month_revenue - monthly_revenue) / previous_month_revenue * 100, 2
    ) as percent_decline
FROM user_spending_trends
WHERE previous_month_revenue IS NOT NULL
  AND (previous_month_revenue - monthly_revenue) > 0
ORDER BY percent_decline DESC;

This finds users with declining spend. No application code needed.

Benchmarking Window Functions vs Subqueries

Using PostgreSQL CLI and Go:

# Test with EXPLAIN ANALYZE
\timing on

-- Subquery approach (slow)
EXPLAIN ANALYZE
SELECT
    u.user_id,
    (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id) as order_count
FROM users u;

-- Join + window function approach (fast)
EXPLAIN ANALYZE
SELECT DISTINCT
    u.user_id,
    COUNT(o.id) OVER (PARTITION BY u.user_id) as order_count
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id;
-- COUNT(o.id) keeps users with no orders at 0; DISTINCT collapses to one row per user

Typical results:

Subquery approach: 2,340 ms (with 10k users)
Window function:    85 ms
Speedup: 27.5x faster
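
To time the same comparison from Go, run both statements and measure wall-clock time; a minimal sketch with database/sql and lib/pq (the connection string is a placeholder, and absolute timings depend entirely on your data and hardware):

package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/lib/pq"
)

// timeQuery runs a query once, drains the results, and reports wall-clock time.
func timeQuery(db *sql.DB, label, query string) {
	start := time.Now()

	rows, err := db.Query(query)
	if err != nil {
		log.Fatal(err)
	}
	count := 0
	for rows.Next() {
		count++
	}
	rows.Close()
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("%-20s %6d rows in %v\n", label, count, time.Since(start))
}

func main() {
	// Placeholder connection string - point it at your own database.
	db, err := sql.Open("postgres", "postgres://localhost/testdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	timeQuery(db, "correlated subquery", `
		SELECT u.user_id,
		       (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id) AS order_count
		FROM users u`)

	timeQuery(db, "join + window", `
		SELECT DISTINCT u.user_id,
		       COUNT(o.id) OVER (PARTITION BY u.user_id) AS order_count
		FROM users u
		LEFT JOIN orders o ON u.user_id = o.user_id`)
}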

Go Integration: Processing Results

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

type RankedUser struct {
	UserID   int
	Name     string
	Spending float64
	Rank     int
	Country  string
}

func getTopSpendersByCountry(ctx context.Context, db *sql.DB) ([]RankedUser, error) {
	// A window function can't be referenced in WHERE, so rank in a
	// subquery and filter on the computed column.
	query := `
		SELECT user_id, name, spending, rank, country
		FROM (
			SELECT
				user_id,
				name,
				total_spent as spending,
				RANK() OVER (PARTITION BY country ORDER BY total_spent DESC) as rank,
				country
			FROM users
		) ranked
		WHERE rank <= 10
		ORDER BY country, rank
	`

	rows, err := db.QueryContext(ctx, query)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var results []RankedUser
	for rows.Next() {
		var r RankedUser
		err := rows.Scan(&r.UserID, &r.Name, &r.Spending, &r.Rank, &r.Country)
		if err != nil {
			return nil, err
		}
		results = append(results, r)
	}

	return results, rows.Err()
}

πŸ”„ Part 2: CTEs - Making Complex Queries Readable

CTEs (Common Table Expressions) look like subqueries, but they’re about readability and maintainability.

Non-Recursive CTEs: Turning Spaghetti into Structure

Instead of this (hard to read):

SELECT
    u.user_id,
    u.name,
    (
        SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id
    ) as total_orders,
    (
        SELECT SUM(amount) FROM orders o WHERE o.user_id = u.user_id
    ) as total_spent,
    (
        SELECT AVG(amount) FROM orders o WHERE o.user_id = u.user_id
    ) as avg_order_value
FROM users u
WHERE (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id) > 5;

Write this (crystal clear):

WITH user_order_stats AS (
    SELECT
        user_id,
        COUNT(*) as total_orders,
        SUM(amount) as total_spent,
        AVG(amount) as avg_order_value
    FROM orders
    GROUP BY user_id
)
SELECT
    u.user_id,
    u.name,
    s.total_orders,
    s.total_spent,
    s.avg_order_value
FROM users u
JOIN user_order_stats s ON u.user_id = s.user_id
WHERE s.total_orders > 5;

The structure is immediately clear:

  1. Calculate user stats
  2. Join with users
  3. Filter

Recursive CTEs: Hierarchical Data

-- Find organizational hierarchy starting from a CEO
WITH RECURSIVE org_hierarchy AS (
    -- Base case: Start with the CEO
    SELECT
        id,
        name,
        manager_id,
        1 as depth
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case: Find all subordinates
    SELECT
        e.id,
        e.name,
        e.manager_id,
        oh.depth + 1
    FROM employees e
    JOIN org_hierarchy oh ON e.manager_id = oh.id
)
SELECT
    REPEAT('  ', depth - 1) || name as organization_tree,
    id,
    depth
FROM org_hierarchy
ORDER BY depth, id;

Output visualizes the hierarchy:

organization_tree | id | depth
------------------|----|-----
CEO               | 1  | 1
  VP Engineering  | 2  | 2
    Developer 1   | 3  | 3
    Developer 2   | 4  | 3
  VP Sales        | 5  | 2
    Sales Rep 1   | 6  | 3
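
When the application needs the hierarchy as a data structure rather than an indented listing, the same recursive CTE can feed a small tree builder. A minimal Go sketch with database/sql and lib/pq, assuming the employees table above (OrgNode and loadOrgTree are illustrative names):

package main

import (
	"database/sql"

	_ "github.com/lib/pq"
)

// OrgNode is one employee plus their direct reports.
type OrgNode struct {
	ID       int
	Name     string
	Children []*OrgNode
}

// loadOrgTree walks the recursive CTE results (parents always appear before
// their children because we order by depth) and wires up the tree in memory.
func loadOrgTree(db *sql.DB) (*OrgNode, error) {
	rows, err := db.Query(`
		WITH RECURSIVE org_hierarchy AS (
			SELECT id, name, manager_id, 1 AS depth
			FROM employees
			WHERE manager_id IS NULL
			UNION ALL
			SELECT e.id, e.name, e.manager_id, oh.depth + 1
			FROM employees e
			JOIN org_hierarchy oh ON e.manager_id = oh.id
		)
		SELECT id, name, manager_id, depth
		FROM org_hierarchy
		ORDER BY depth, id`)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	byID := make(map[int]*OrgNode)
	var root *OrgNode
	for rows.Next() {
		var (
			node      OrgNode
			managerID sql.NullInt64 // NULL for the CEO
			depth     int
		)
		if err := rows.Scan(&node.ID, &node.Name, &managerID, &depth); err != nil {
			return nil, err
		}
		byID[node.ID] = &node
		if !managerID.Valid {
			root = &node
		} else if parent, ok := byID[int(managerID.Int64)]; ok {
			parent.Children = append(parent.Children, &node)
		}
	}
	return root, rows.Err()
}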

Visualizing in Table Plus

Table Plus is handy for eyeballing recursive CTE results: the indentation column makes the hierarchy easy to scan, and you can sort or filter the result grid to drill into a particular subtree.

Performance: Recursive CTEs Have Limits

# Using PostgreSQL CLI to test recursion depth
EXPLAIN ANALYZE
WITH RECURSIVE deep_hierarchy AS (
    SELECT 1 as depth
    UNION ALL
    SELECT depth + 1 FROM deep_hierarchy WHERE depth < 1000
)
SELECT COUNT(*) FROM deep_hierarchy;

Gotcha: PostgreSQL does not cap recursion depth (unlike SQL Server, whose MAXRECURSION defaults to 100). A recursive CTE whose recursive term never stops runs until it exhausts memory or hits statement_timeout, so the terminating WHERE condition in the recursive branch is your only safety net.

SET work_mem = '256MB';  -- Let a large recursion stay in memory instead of spilling to disk

WITH RECURSIVE deep_hierarchy AS (
    SELECT 1 as depth
    UNION ALL
    SELECT depth + 1 FROM deep_hierarchy WHERE depth < 10000
)
SELECT COUNT(*) FROM deep_hierarchy;

πŸ” Part 3: Query Execution Plans - Reading What the Database Thinks

Understanding execution plans is the bridge between β€œmy query works” and β€œmy query is optimal.”

Reading EXPLAIN ANALYZE Output

# In PostgreSQL CLI
EXPLAIN ANALYZE
SELECT u.user_id, u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name
ORDER BY order_count DESC;

Output (simplified):

Sort  (cost=1000.00..1000.10 rows=40)  (actual time=25.432..25.450 rows=40)
  Sort Key: (count(o.id)) DESC
  ->  HashAggregate  (cost=900.00..950.00 rows=40)
        Group Key: u.user_id, u.name
        ->  Hash Left Join  (cost=100.00..500.00 rows=1000)
              Hash Cond: (u.user_id = o.user_id)
              ->  Seq Scan on users u  (cost=0.00..50.00 rows=40)
              ->  Hash  (cost=75.00..75.00 rows=1000)
                    ->  Seq Scan on orders o  (cost=0.00..75.00 rows=1000)

What to read:

  • Seq Scan = Sequential scan (reading every row). Bad if you can index it.
  • Index Scan = Using an index. Good.
  • Hash Join = Putting one side in memory, scanning the other. Usually fast.
  • Nested Loop = For each row in left, scan right. Usually slow for large datasets.

Cost Numbers

cost=startup..total rows=estimated
actual time=startup..total rows=actual
  • cost=0.00..50.00 = Estimated cost: 0 units before the first row, 50 to finish
  • rows=40 = Planner estimated 40 rows
  • actual time=25.432..25.450 = Really finished after roughly 25 ms

Red flags:

rows=40 estimated, rows=8000 actual  ← Way off estimate!

This means the query planner made a bad guess, leading to suboptimal plan choice.

Using Table Plus for Visual Plans

Table Plus displays execution plans visually:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚      SORT (DESC)    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 1000ms, 40 rows    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
    β”‚ HASH AGG    β”‚
    β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
    β”‚ 500ms, 40r  β”‚
    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
           β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
    β”‚HASH L JOIN  β”‚
    β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
    β”‚ 100ms, 1kr  β”‚
    β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”˜
         β”‚     β”‚
    β”Œβ”€β”€β”€β”€β”΄β” β”Œβ”€β”€β”΄β”€β”€β”€β”€β”
    β”‚SCAN β”‚ β”‚HASH   β”‚
    β”‚USER β”‚ β”‚ORDERS β”‚
    β”‚25ms β”‚ β”‚75ms   β”‚
    β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”˜

Visually, you see:

  • How long each step takes
  • Row counts at each step
  • Where time is spent

Go Integration: Analyzing Plans Programmatically

package main

import (
	"database/sql"
	"encoding/json"
	"log"

	_ "github.com/lib/pq"
)

type PlanNode struct {
	NodeType    string     `json:"Node Type"`
	StartupCost float64    `json:"Startup Cost"`
	TotalCost   float64    `json:"Total Cost"`
	PlanRows    float64    `json:"Plan Rows"`
	ActualRows  float64    `json:"Actual Rows"`
	Plans       []PlanNode `json:"Plans"`
}

func analyzePlan(db *sql.DB, query string) error {
	explainQuery := "EXPLAIN (FORMAT JSON, ANALYZE) " + query

	var jsonPlan string
	if err := db.QueryRow(explainQuery).Scan(&jsonPlan); err != nil {
		return err
	}

	// EXPLAIN (FORMAT JSON) returns a one-element array wrapping the root plan.
	var explained []struct {
		Plan PlanNode `json:"Plan"`
	}
	if err := json.Unmarshal([]byte(jsonPlan), &explained); err != nil {
		return err
	}

	// Compare the planner's estimate against reality at the root node.
	for _, e := range explained {
		log.Printf("node=%s cost=%.2f estimated_rows=%.0f actual_rows=%.0f",
			e.Plan.NodeType, e.Plan.TotalCost, e.Plan.PlanRows, e.Plan.ActualRows)
	}

	return nil
}

πŸ“‡ Part 4: Advanced Indexing - Beyond B-Trees

Most developers only know B-Tree indexes. But PostgreSQL offers 6+ index types, each optimized for different scenarios.

Index Types at a Glance

B-Tree (default)        β†’ General purpose, supports all comparisons
Hash                    β†’ Only equality, very fast lookups
GiST (Generalized)      β†’ Geometric data, full-text search
GIN (Inverted)          β†’ Array data, JSONB, full-text
BRIN                    β†’ Very large tables, sequential data
SP-GiST                 → Space-partitioned data (quadtrees, tries)

Strategy 1: Composite Indexes Matter

Column order in a composite index affects performance dramatically.

-- Query: Find orders by user_id, then filter by status
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Less useful index (status first)
CREATE INDEX idx_orders_status_userid ON orders(status, user_id);
-- Works for this query, but low-cardinality status is a poor leading column
-- and the index can't serve queries that filter on user_id alone

-- Better index (most selective column first)
CREATE INDEX idx_orders_userid_status ON orders(user_id, status);
-- Leads with the highly selective user_id, and also covers queries
-- that filter on user_id alone

Test with DBeaver’s index analyzer to see the difference.

Strategy 2: Partial Indexes for Filtered Data

-- Instead of indexing all orders:
CREATE INDEX idx_all_orders ON orders(user_id);  -- 10M rows

-- Index only active orders (90% smaller):
CREATE INDEX idx_active_orders ON orders(user_id)
WHERE status = 'active' OR status = 'pending';  -- 1M rows

Queries that match the WHERE clause use the smaller index:

SELECT * FROM orders WHERE user_id = 123 AND status = 'active';
-- Uses the partial index! Much faster.

Strategy 3: Covering Indexes

Include non-key columns to enable index-only scans:

-- Without covering:
CREATE INDEX idx_orders_user ON orders(user_id);
SELECT order_id, amount FROM orders WHERE user_id = 123;
-- Index used to find rows, then table accessed for amount

-- With covering:
CREATE INDEX idx_orders_user_covering ON orders(user_id) INCLUDE (order_id, amount);
SELECT order_id, amount FROM orders WHERE user_id = 123;
-- Everything in index! Zero table access.

Spotting Index Bloat and Unused Indexes with PostgreSQL CLI

# Inspect the table and its indexes
\d+ orders
# Lists every index defined on the table

# Show all indexes with their on-disk sizes (suspiciously large ones hint at bloat)
\di+

# Query to find unused indexes
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
# idx_scan = 0 means never used since statistics were last reset!
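
The same check is easy to automate from Go as part of a maintenance script; a minimal sketch with database/sql and lib/pq (reportUnusedIndexes is an illustrative helper name):

package main

import (
	"database/sql"
	"fmt"

	_ "github.com/lib/pq"
)

// reportUnusedIndexes prints user indexes with zero recorded scans.
// "Never used" means "never used since pg_stat statistics were last reset".
func reportUnusedIndexes(db *sql.DB) error {
	rows, err := db.Query(`
		SELECT schemaname, relname, indexrelname, idx_scan
		FROM pg_stat_user_indexes
		WHERE idx_scan = 0
		ORDER BY schemaname, relname`)
	if err != nil {
		return err
	}
	defer rows.Close()

	for rows.Next() {
		var schema, table, index string
		var scans int64
		if err := rows.Scan(&schema, &table, &index, &scans); err != nil {
			return err
		}
		fmt.Printf("unused index: %s.%s on table %s.%s\n", schema, index, schema, table)
	}
	return rows.Err()
}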

πŸ” Part 5: Transaction Isolation - ACID in Practice

The Four Isolation Levels

Each level allows different anomalies (inconsistencies):

READ UNCOMMITTED   → Dirty reads allowed (reading uncommitted changes)
READ COMMITTED     → Non-repeatable reads allowed (a value changes between reads)
REPEATABLE READ    → Phantom reads allowed (new rows appear between reads)
SERIALIZABLE       → No anomalies (slowest, most restrictive)

PostgreSQL note: READ UNCOMMITTED behaves exactly like READ COMMITTED (dirty reads never actually happen), and PostgreSQL's REPEATABLE READ also prevents phantom reads.

Testing Isolation Levels with Table Plus

Open two sessions in Table Plus:

Session 1:

BEGIN;
SELECT balance FROM accounts WHERE id = 1;
-- Balance is $1000
-- Keep the transaction open; read again after Session 2 commits

Session 2 (while Session 1 is open):

UPDATE accounts SET balance = 500 WHERE id = 1;
COMMIT;

Session 1 (second read, still inside the open transaction - what does it see?):

Isolation Level        Read 1    Read 2 After Session 2 Commits
─────────────────────────────────────────────────────────────
READ UNCOMMITTED       $1000     $500  (PostgreSQL treats it as READ COMMITTED)
READ COMMITTED         $1000     $500  (non-repeatable read)
REPEATABLE READ        $1000     $1000 (consistent snapshot)
SERIALIZABLE           $1000     $1000 (consistent snapshot)

Setting Isolation Levels

-- Session 1
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT balance FROM accounts WHERE id = 1;

-- Session 2
UPDATE accounts SET balance = 500 WHERE id = 1;

-- Session 1 (unaffected)
SELECT balance FROM accounts WHERE id = 1;  -- Still $1000!
COMMIT;
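
From Go, the isolation level is set per transaction through sql.TxOptions. A minimal sketch using database/sql (readBalanceTwice is an illustrative helper, assuming the accounts table from this example):

package main

import (
	"context"
	"database/sql"
	"fmt"

	_ "github.com/lib/pq"
)

// readBalanceTwice opens a REPEATABLE READ transaction and reads the same
// row twice; concurrent committed updates will not change what it sees.
func readBalanceTwice(ctx context.Context, db *sql.DB, accountID int) error {
	tx, err := db.BeginTx(ctx, &sql.TxOptions{
		Isolation: sql.LevelRepeatableRead,
	})
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op after a successful Commit

	var first, second float64
	if err := tx.QueryRowContext(ctx,
		"SELECT balance FROM accounts WHERE id = $1", accountID).Scan(&first); err != nil {
		return err
	}

	// ... another session may UPDATE and COMMIT here ...

	if err := tx.QueryRowContext(ctx,
		"SELECT balance FROM accounts WHERE id = $1", accountID).Scan(&second); err != nil {
		return err
	}

	fmt.Printf("first read: %.2f, second read: %.2f (identical under REPEATABLE READ)\n",
		first, second)
	return tx.Commit()
}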

Deadlock Detection and Resolution

-- Transaction 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
SELECT pg_sleep(5);  -- Wait 5 seconds
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Transaction 2 (at same time)
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
SELECT pg_sleep(5);  -- Wait 5 seconds
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- DEADLOCK!
COMMIT;

Both transactions wait for each other. Deadlock detected!

PostgreSQL automatically aborts one transaction:

ERROR: deadlock detected

Resolution in Go:

func transferWithRetry(db *sql.DB, fromID, toID int, amount float64) error {
	maxRetries := 3
	for attempt := 0; attempt < maxRetries; attempt++ {
		err := transfer(db, fromID, toID, amount)

		if err != nil && strings.Contains(err.Error(), "deadlock") {
			log.Printf("Deadlock detected, retry %d/%d", attempt+1, maxRetries)
			time.Sleep(time.Millisecond * time.Duration(100*(attempt+1)))
			continue
		}

		return err
	}
	return fmt.Errorf("deadlock after %d retries", maxRetries)
}
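
Retrying works, but the cleaner fix is to acquire locks in a consistent order so the deadlock never happens. Below is one possible shape of the transfer helper referenced above; the original doesn't show it, so treat this as an assumption built on the same accounts(id, balance) table:

// One possible shape of the transfer helper used above (an assumption - the
// original does not show it). It locks both account rows in ascending id
// order, so two concurrent transfers always acquire locks in the same order
// and the deadlock from the earlier example cannot occur.
func transfer(db *sql.DB, fromID, toID int, amount float64) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op after a successful Commit

	// Lock the two rows in a deterministic (ascending id) order.
	first, second := fromID, toID
	if second < first {
		first, second = second, first
	}
	for _, id := range []int{first, second} {
		if _, err := tx.Exec("SELECT 1 FROM accounts WHERE id = $1 FOR UPDATE", id); err != nil {
			return err
		}
	}

	if _, err := tx.Exec(
		"UPDATE accounts SET balance = balance - $1 WHERE id = $2", amount, fromID); err != nil {
		return err
	}
	if _, err := tx.Exec(
		"UPDATE accounts SET balance = balance + $1 WHERE id = $2", amount, toID); err != nil {
		return err
	}
	return tx.Commit()
}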

πŸ—‚οΈ Part 6: Partitioning - Scaling to Billions

Partitioning divides large tables into smaller, manageable pieces.

Range Partitioning (Most Common)

-- Create parent table (not holding data)
CREATE TABLE orders (
    id BIGSERIAL,
    user_id INT,
    order_date DATE,
    amount DECIMAL,
    PRIMARY KEY (id, order_date)
) PARTITION BY RANGE (order_date);  -- range-partition directly on the date column

-- Create partitions by year
CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- New orders automatically routed to correct partition
INSERT INTO orders (user_id, order_date, amount)
VALUES (123, '2025-06-15', 299.99);
-- Goes into orders_2025 automatically!

Automatic Partition Maintenance with pg_partman

-- Install pg_partman extension
CREATE EXTENSION pg_partman;

-- Auto-create new partitions
SELECT partman.create_parent(
    'public.orders',
    'order_date',
    'native',
    'monthly'  -- Create new partition monthly
);

-- pg_partman now pre-creates future partitions each time its maintenance runs
-- (schedule partman.run_maintenance() via cron or the background worker)
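
If you would rather not depend on an extension, the same monthly maintenance can be scripted from a scheduled job; a minimal Go sketch assuming the orders table above (the orders_YYYY_MM naming and the createMonthlyPartition helper are arbitrary choices for illustration):

package main

import (
	"database/sql"
	"fmt"
	"time"
)

// createMonthlyPartition creates the orders partition for the month containing t,
// if it does not already exist. Run it from a scheduled job ahead of time so the
// next month's partition exists before the first row arrives.
func createMonthlyPartition(db *sql.DB, t time.Time) error {
	start := time.Date(t.Year(), t.Month(), 1, 0, 0, 0, 0, time.UTC)
	end := start.AddDate(0, 1, 0)

	ddl := fmt.Sprintf(
		`CREATE TABLE IF NOT EXISTS orders_%s PARTITION OF orders
		 FOR VALUES FROM ('%s') TO ('%s')`,
		start.Format("2006_01"),
		start.Format("2006-01-02"),
		end.Format("2006-01-02"),
	)
	_, err := db.Exec(ddl)
	return err
}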

Query Planning with Partitions

EXPLAIN ANALYZE
SELECT * FROM orders WHERE order_date BETWEEN '2025-01-01' AND '2025-03-31';

Output shows partition elimination:

Append  (cost=0.00..500.00)
  ->  Seq Scan on orders_2025  (cost=0.00..500.00)  ← Only scans 2025!
        Filter: (order_date BETWEEN 2025-01-01 AND 2025-03-31)

-- Orders_2024, orders_2023, etc. are skipped!

This is the power of partitioning: scanning only relevant partitions.


πŸ“¦ Part 7: JSON and Document Data

Modern applications often store JSON in relational databases. SQL provides powerful tools for querying it.

JSONB vs JSON

CREATE TABLE user_profiles (
    id INT,
    metadata JSON,      -- Text storage, slower queries
    data JSONB          -- Binary storage, faster queries
);

-- JSONB is almost always better

Querying JSON

INSERT INTO user_profiles (id, data) VALUES (1, '{"name": "Alice", "age": 30, "tags": ["admin", "developer"]}');

-- Extract values
SELECT
    data->>'name' as name,               -- Text value
    (data->>'age')::INT as age,          -- Cast to integer
    data->'tags' as tags_array,          -- JSON array
    data->'tags'->>0 as first_tag        -- First array element
FROM user_profiles;

-- Results:
-- name: "Alice"
-- age: 30
-- tags_array: ["admin", "developer"]
-- first_tag: "admin"
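
On the Go side, a JSONB column scans as text or bytes, so the usual pattern is to json.Unmarshal it into a struct. A minimal sketch assuming the user_profiles table above (the Profile struct and loadProfile helper are illustrative):

package main

import (
	"database/sql"
	"encoding/json"

	_ "github.com/lib/pq"
)

// Profile mirrors the JSONB document stored in user_profiles.data.
type Profile struct {
	Name string   `json:"name"`
	Age  int      `json:"age"`
	Tags []string `json:"tags"`
}

// loadProfile fetches one row and decodes the JSONB column into a struct.
func loadProfile(db *sql.DB, id int) (*Profile, error) {
	var raw []byte
	err := db.QueryRow(
		"SELECT data FROM user_profiles WHERE id = $1", id).Scan(&raw)
	if err != nil {
		return nil, err
	}

	var p Profile
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, err
	}
	return &p, nil
}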

Indexing JSON for Performance

-- GIN index over the whole JSONB column (accelerates @> containment queries)
CREATE INDEX idx_user_data ON user_profiles USING GIN (data);

-- This query can use the GIN index:
SELECT * FROM user_profiles WHERE data @> '{"name": "Alice"}';

-- For equality on a single key, a B-Tree expression index is the better fit:
CREATE INDEX idx_user_name ON user_profiles ((data->>'name'));
SELECT * FROM user_profiles WHERE data->>'name' = 'Alice';

-- Targeted GIN index for one key (e.g. tag containment):
CREATE INDEX idx_user_tags ON user_profiles USING GIN ((data->'tags'));

Real-World: API Response Storage

type UserEvent struct {
    ID        int
    UserID    int
    EventType string
    Payload   json.RawMessage // Store raw JSON
    CreatedAt time.Time
}

-- In the database:
INSERT INTO user_events (user_id, event_type, payload)
VALUES (
    123,
    'purchase',
    '{"amount": 99.99, "product_id": 456, "currency": "USD"}'::jsonb
);

-- Query events with specific payload values:
SELECT * FROM user_events
WHERE payload->>'currency' = 'USD'
  AND (payload->>'amount')::DECIMAL > 50;

πŸ”€ Part 8: Full-Text Search - Beyond LIKE

LIKE searches are slow. Full-text search is fast and intelligent.

The Problem with LIKE

-- This is slow on large datasets
SELECT * FROM products WHERE name LIKE '%laptop%';
-- Scans every row, checks substring match

-- This is even slower
SELECT * FROM products WHERE name LIKE '%lap%top%';
-- Multiple substring matches per row

Full-Text Search Solution

-- Create full-text index
CREATE INDEX idx_products_search ON products USING GIN (
    to_tsvector('english', name || ' ' || description)
);

-- Query is fast
SELECT * FROM products
WHERE to_tsvector('english', name || ' ' || description) @@
      plainto_tsquery('english', 'laptop');

-- Uses index, returns instantly

Relevance Ranking

SELECT
    id,
    name,
    ts_rank(
        to_tsvector('english', name || ' ' || description),
        plainto_tsquery('english', 'laptop')
    ) as relevance
FROM products
WHERE to_tsvector('english', name || ' ' || description) @@
      plainto_tsquery('english', 'laptop')
ORDER BY relevance DESC;

-- Results ranked by relevance:
-- "Gaming Laptop 17 Inch" (relevance: 0.95)
-- "Laptop Cooling Pad" (relevance: 0.65)
-- "Backpack for Laptop" (relevance: 0.48)

Fuzzy Matching with Trigrams

-- Requires the pg_trgm extension:
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Typo tolerance with similarity
SELECT * FROM products
WHERE similarity(name, 'lapto') > 0.3;  -- Similar to 'lapto'

-- Returns "laptop", "lapto", etc.

-- Ranked by similarity
SELECT
    name,
    similarity(name, 'lapto') as match_score
FROM products
WHERE similarity(name, 'lapto') > 0.3
ORDER BY match_score DESC;

πŸš€ Part 9: Putting It All Together - The Advanced SQL Developer

Now you understand the 8 techniques. Here’s how they combine in real applications:

Example: E-Commerce Analytics Platform

WITH product_metrics AS (
    SELECT
        p.product_id,
        p.name,
        DATE_TRUNC('month', o.order_date) as month,
        COUNT(DISTINCT o.id) as total_orders,
        SUM(oi.quantity) as total_sold,
        AVG(oi.unit_price) as avg_price
    FROM products p
    JOIN order_items oi ON p.product_id = oi.product_id
    JOIN orders o ON oi.order_id = o.id
    WHERE o.order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
    GROUP BY p.product_id, p.name, DATE_TRUNC('month', o.order_date)
),
trending_products AS (
    SELECT
        product_id,
        name,
        month,
        total_orders,
        total_sold,
        avg_price,
        RANK() OVER (PARTITION BY month ORDER BY total_sold DESC) as popularity_rank,
        ROUND(
            (total_sold::NUMERIC / NULLIF(LAG(total_sold) OVER (
                PARTITION BY product_id ORDER BY month
            ), 0) - 1) * 100, 2
        ) as growth_percent
    FROM product_metrics
)
SELECT * FROM trending_products
WHERE month = DATE_TRUNC('month', CURRENT_DATE)
  AND popularity_rank <= 100
  AND growth_percent > 50
ORDER BY growth_percent DESC;

This single query combines:

  • Window functions (RANK, LAG)
  • CTEs (product_metrics, trending_products)
  • Aggregation (GROUP BY)
  • Joins (multiple tables)

All executed efficiently by the database.


πŸŽ“ Conclusion: The Path Forward

You now understand the 8 advanced SQL techniques that separate tier-2 from tier-3 developers.

Next steps:

  1. Practice with Table Plus - Visualize execution plans for your queries
  2. Benchmark in Go - Compare different approaches with real performance data
  3. Use DBeaver - Explore index statistics and optimization opportunities
  4. Build projects - Apply these techniques to real problems

The developers who master SQL become database efficiency experts in their organizations. They solve in SQL what others would solve in application code. They understand where the real work happens.

That’s the power of advanced SQL.


Recommended Reading:

  • PostgreSQL Official Documentation (Chapters on Queries, Indexes, Performance)
  • β€œPostgreSQL 14 Internals” by Egor Rogov
  • Dimitri Fontaine’s “The Art of PostgreSQL”

Tools to Master:

  • PostgreSQL CLI (psql) - Raw power
  • Table Plus - Visual exploration
  • DBeaver - Enterprise features
  • pgAdmin - Web-based management
  • Go + pgx - Application integration

Start with one technique. Master it. Move to the next. Six months from now, you’ll be the SQL expert on your team.

Tags

#SQL #PostgreSQL #Query Optimization #Performance #Go #Table Plus #DBeaver #Database Design #Window Functions #CTEs #Indexing #JSON #Full-Text Search