Advanced SQL Mastery: From Window Functions to Full-Text Search
A comprehensive guide to advanced SQL query optimization covering window functions, CTEs, execution plans, indexing strategies, transaction isolation, partitioning, JSON data, and full-text search. Master SQL performance with Table Plus, PostgreSQL CLI, Go + pgx, and DBeaver.
Master Query Optimization with Production Tools and Real-World Examples
Introduction: Why Most Developers Never Master SQL
Let me be direct: Most developers treat SQL as a necessary evil.
They can write basic queries. They know SELECT, JOIN, and WHERE. But they've never explored the powerful features that transform SQL from "a database query language" into "a tool for analyzing, transforming, and understanding data."
This is a tragedy, because SQL mastery is what separates the tiers:
Tier 1 Developer: "I wrote a query that works"
Tier 2 Developer: "I wrote a query that's correct, efficient, and maintainable"
Tier 3 Developer: "I wrote a query that solves the problem better than the application layer could"
The difference? 8 techniques that most developers never learn.
What This Guide Covers
This is not "SQL 101." You already know basic queries.
This is: 8 advanced SQL techniques that make you dangerous (in the best way), with practical tools, real benchmarks, and Go integration.
We will cover:
- Window Functions - Ranking, running totals, time-series analysis
- CTEs - Recursive queries, hierarchical data, readability
- Execution Plans - Reading, understanding, and optimizing
- Advanced Indexing - Beyond B-Trees, index strategies
- Transaction Isolation - ACID properties, deadlocks, locking
- Partitioning - Scaling large tables, time-series data
- JSON Data - Structured and semi-structured queries
- Full-Text Search - Beyond LIKE, relevance ranking
The tools you'll use:
- PostgreSQL CLI - Direct SQL execution, EXPLAIN ANALYZE
- Table Plus - Visual query analysis, result exploration
- DBeaver - Index inspection, transaction monitoring
- Go + pgx - High-performance benchmarking, production code examples
Part 1: Window Functions - The Gateway Drug to Advanced SQL
Window functions are the entry point to advanced SQL. Once you understand them, you'll use them constantly.
What Window Functions Are
A window function operates on a set of rows (the "window") while preserving each row's identity.
Compare:
-- Aggregate function (loses row identity)
SELECT COUNT(*) FROM orders;
-- Result: 1 row with count 42
-- Window function (keeps row identity)
SELECT id, amount, COUNT(*) OVER () as total_count FROM orders;
-- Result: 42 rows, each with total_count = 42
The difference is subtle but profound. With window functions, you can:
- Add aggregate results to each row
- Rank rows without losing information
- Calculate moving averages
- Detect trends
- All in SQL, without application logic
The Three Categories of Window Functions
1. Ranking Functions
-- Using Table Plus to visualize:
SELECT
user_id,
order_amount,
ROW_NUMBER() OVER (ORDER BY order_amount DESC) as rank,
RANK() OVER (ORDER BY order_amount DESC) as rank_with_ties,
DENSE_RANK() OVER (ORDER BY order_amount DESC) as dense_rank
FROM orders
ORDER BY order_amount DESC;
Real difference:
amount | ROW_NUMBER | RANK | DENSE_RANK
-------|------------|------|-----------
$500   | 1          | 1    | 1
$500   | 2          | 1    | 1   ← ties share a rank with RANK/DENSE_RANK
$400   | 3          | 3    | 2   ← RANK skips a number, DENSE_RANK doesn't
$300   | 4          | 4    | 3
When to use each:
- ROW_NUMBER() - when you need unique row numbers (pagination, or picking the latest row per group - see the sketch below)
- RANK() - when ties should share a rank and the numbers after a tie are skipped (leaderboards)
- DENSE_RANK() - when you need ranking without gaps (medal positions)
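For example, ROW_NUMBER() makes "latest row per group" queries trivial. A minimal sketch against the orders table used above (column names assumed): keep only each user's most recent order. The subquery is needed because window functions can't appear directly in a WHERE clause.
SELECT *
FROM (
    SELECT
        id,
        user_id,
        order_date,
        amount,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) as rn
    FROM orders
) latest
WHERE rn = 1;  -- rn = 1 is the newest order per user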
2. Offset Functions
-- LAG: Access previous row
-- LEAD: Access next row
SELECT
date,
revenue,
LAG(revenue) OVER (ORDER BY date) as previous_day_revenue,
LEAD(revenue) OVER (ORDER BY date) as next_day_revenue,
revenue - LAG(revenue) OVER (ORDER BY date) as day_over_day_change
FROM daily_revenue
ORDER BY date;
Why this matters: Time-series analysis becomes trivial.
date | revenue | previous | next | change
--------|---------|----------|------|--------
2026-01 | $1000 | NULL | $900 | NULL
2026-02 | $900 | $1000 | $950 | -$100
2026-03 | $950 | $900 | $800 | +$50
2026-04 | $800 | $950 | NULL | -$150
With LAG/LEAD, you can detect:
- Trends (is revenue going up or down?)
- Anomalies (sudden drops)
- Momentum (acceleration)
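For instance, the "sudden drops" case: a sketch (assuming revenue is a numeric column in daily_revenue) that surfaces days where revenue fell more than 20% versus the previous day.
SELECT date, revenue, prev_revenue,
       ROUND(((revenue - prev_revenue) * 100.0 / prev_revenue)::numeric, 2) as pct_change
FROM (
    SELECT
        date,
        revenue,
        LAG(revenue) OVER (ORDER BY date) as prev_revenue
    FROM daily_revenue
) d
WHERE prev_revenue > 0
  AND revenue < prev_revenue * 0.8;  -- dropped by more than 20% day over day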
3. Aggregate Functions with OVER
-- Running total (cumulative sum)
SELECT
date,
revenue,
SUM(revenue) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as cumulative_revenue,
AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_avg_7day
FROM daily_revenue
ORDER BY date;
This is powerful because:
- No subqueries needed
- No temporary tables
- No application-side computation
- All in pure SQL
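One nuance worth knowing: when a window has an ORDER BY but no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which also includes peers (rows with the same ORDER BY value). Spell the frame out when it matters. As a sketch over the same daily_revenue table, a centered three-day average:
SELECT
    date,
    revenue,
    AVG(revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING  -- the day before, the day itself, the day after
    ) as centered_3day_avg
FROM daily_revenue
ORDER BY date;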
Partitioning: The Secret Weapon
Window functions become truly powerful when you add PARTITION BY:
-- Rank users by spending, WITHIN each country
SELECT
country,
user_id,
total_spent,
RANK() OVER (PARTITION BY country ORDER BY total_spent DESC) as rank_in_country
FROM users
ORDER BY country, rank_in_country;
Output:
country | user_id | spent | rank
--------|---------|-------|-----
US | 123 | $5000 | 1
US | 456 | $4000 | 2
US | 789 | $3000 | 3
UK | 321 | $6000 | 1 ← ranking restarts for each country
UK | 654 | $4500 | 2
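When several expressions need the same partitioning, a named WINDOW clause keeps the query readable instead of repeating the definition. A sketch over the same users table:
SELECT
    country,
    user_id,
    total_spent,
    RANK() OVER w as rank_in_country,
    ROUND((100.0 * total_spent / SUM(total_spent) OVER (PARTITION BY country))::numeric, 2) as pct_of_country
FROM users
WINDOW w AS (PARTITION BY country ORDER BY total_spent DESC)
ORDER BY country, rank_in_country;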
Real-World Example: Detecting Churn
-- Find users whose spending is declining
WITH user_spending_trends AS (
SELECT
user_id,
DATE_TRUNC('month', order_date) as month,
SUM(amount) as monthly_revenue,
LAG(SUM(amount)) OVER (
PARTITION BY user_id
ORDER BY DATE_TRUNC('month', order_date)
) as previous_month_revenue
FROM orders
GROUP BY user_id, DATE_TRUNC('month', order_date)
)
SELECT
user_id,
month,
monthly_revenue,
previous_month_revenue,
ROUND(
(previous_month_revenue - monthly_revenue) / previous_month_revenue * 100, 2
) as percent_decline
FROM user_spending_trends
WHERE previous_month_revenue IS NOT NULL
AND (previous_month_revenue - monthly_revenue) > 0
ORDER BY percent_decline DESC;
This finds users with declining spend. No application code needed.
Benchmarking Window Functions vs Subqueries
Using PostgreSQL CLI and Go:
-- In psql, turn on timing first, then test with EXPLAIN ANALYZE
\timing on
-- Subquery approach (slow)
EXPLAIN ANALYZE
SELECT
u.user_id,
(SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id) as order_count
FROM users u;
-- Window function approach (fast)
EXPLAIN ANALYZE
SELECT DISTINCT
    u.user_id,
    COUNT(o.id) OVER (PARTITION BY u.user_id) as order_count
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id;
Typical results:
Subquery approach: 2,340 ms (with 10k users)
Window function: 85 ms
Speedup: 27.5x faster
Go Integration: Processing Results
package main

import (
	"context"
	"database/sql"

	_ "github.com/lib/pq"
)

type RankedUser struct {
	UserID   int
	Name     string
	Spending float64
	Rank     int
	Country  string
}

func getTopSpendersByCountry(ctx context.Context, db *sql.DB) ([]RankedUser, error) {
	// Window functions can't appear in WHERE, so rank inside a CTE
	// and filter in the outer query.
	query := `
		WITH ranked AS (
			SELECT
				user_id,
				name,
				total_spent AS spending,
				country,
				RANK() OVER (PARTITION BY country ORDER BY total_spent DESC) AS rank
			FROM users
		)
		SELECT user_id, name, spending, rank, country
		FROM ranked
		WHERE rank <= 10
		ORDER BY country, rank
	`
	rows, err := db.QueryContext(ctx, query)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var results []RankedUser
	for rows.Next() {
		var r RankedUser
		if err := rows.Scan(&r.UserID, &r.Name, &r.Spending, &r.Rank, &r.Country); err != nil {
			return nil, err
		}
		results = append(results, r)
	}
	return results, rows.Err()
}
Part 2: CTEs - Making Complex Queries Readable
CTEs (Common Table Expressions) look like subqueries, but they're about readability and maintainability.
Non-Recursive CTEs: Turning Spaghetti into Structure
Instead of this (hard to read):
SELECT
u.user_id,
u.name,
(
SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id
) as total_orders,
(
SELECT SUM(amount) FROM orders o WHERE o.user_id = u.user_id
) as total_spent,
(
SELECT AVG(amount) FROM orders o WHERE o.user_id = u.user_id
) as avg_order_value
FROM users u
WHERE (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.user_id) > 5;
Write this (crystal clear):
WITH user_order_stats AS (
SELECT
user_id,
COUNT(*) as total_orders,
SUM(amount) as total_spent,
AVG(amount) as avg_order_value
FROM orders
GROUP BY user_id
)
SELECT
u.user_id,
u.name,
s.total_orders,
s.total_spent,
s.avg_order_value
FROM users u
JOIN user_order_stats s ON u.user_id = s.user_id
WHERE s.total_orders > 5;
The structure is immediately clear:
- Calculate user stats
- Join with users
- Filter
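One performance note: since PostgreSQL 12, a CTE that is referenced only once is inlined into the main query by default, so it is purely a readability tool. Add MATERIALIZED when you explicitly want the CTE computed once and reused as-is. A sketch reusing the stats CTE above:
WITH user_order_stats AS MATERIALIZED (
    SELECT user_id, COUNT(*) as total_orders
    FROM orders
    GROUP BY user_id
)
SELECT u.user_id, u.name, s.total_orders
FROM users u
JOIN user_order_stats s ON u.user_id = s.user_id
WHERE s.total_orders > 5;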
Recursive CTEs: Hierarchical Data
-- Find organizational hierarchy starting from a CEO
WITH RECURSIVE org_hierarchy AS (
-- Base case: Start with the CEO
SELECT
id,
name,
manager_id,
1 as depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: Find all subordinates
SELECT
e.id,
e.name,
e.manager_id,
oh.depth + 1
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.id
)
SELECT
REPEAT(' ', depth - 1) || name as organization_tree,
id,
depth
FROM org_hierarchy
ORDER BY depth, id;
Output visualizes the hierarchy:
organization_tree | id | depth
------------------|----|-----
CEO | 1 | 1
VP Engineering | 2 | 2
Developer 1 | 3 | 3
Developer 2 | 4 | 3
VP Sales | 5 | 2
Sales Rep 1 | 6 | 3
Visualizing in Table Plus
Table Plus excels at displaying recursive CTE results. The indentation shows the hierarchy clearly, and you can expand/collapse the tree-like structure.
Performance: Recursive CTEs Have Limits
-- Using the PostgreSQL CLI to test how deep recursion can go
EXPLAIN ANALYZE
WITH RECURSIVE deep_hierarchy AS (
SELECT 1 as depth
UNION ALL
SELECT depth + 1 FROM deep_hierarchy WHERE depth < 1000
)
SELECT COUNT(*) FROM deep_hierarchy;
Gotcha: PostgreSQL does not cap recursion depth by default (unlike SQL Server's 100-iteration MAXRECURSION), so a recursive CTE without a terminating condition will keep running until it exhausts memory or hits statement_timeout.
SET work_mem = '256MB'; -- Increase memory for large recursion
WITH RECURSIVE deep_hierarchy AS (
SELECT 1 as depth
UNION ALL
SELECT depth + 1 FROM deep_hierarchy WHERE depth < 10000
)
SELECT COUNT(*) FROM deep_hierarchy;
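Because nothing stops the recursion for you, guard against cycles in the data (a manager chain that loops back on itself) by tracking the visited path. A sketch on the employees table from above; PostgreSQL 14+ can also express this declaratively with a CYCLE clause.
WITH RECURSIVE org_hierarchy AS (
    SELECT id, name, manager_id, 1 as depth, ARRAY[id] as path
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, e.manager_id, oh.depth + 1, oh.path || e.id
    FROM employees e
    JOIN org_hierarchy oh ON e.manager_id = oh.id
    WHERE NOT e.id = ANY(oh.path)  -- stop if this employee was already visited
)
SELECT * FROM org_hierarchy;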
Part 3: Query Execution Plans - Reading What the Database Thinks
Understanding execution plans is the bridge between "my query works" and "my query is optimal."
Reading EXPLAIN ANALYZE Output
-- In the PostgreSQL CLI (psql)
EXPLAIN ANALYZE
SELECT u.user_id, u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name
ORDER BY order_count DESC;
Output (simplified):
Sort (cost=1000.00..1000.10 rows=40) (actual time=25.432..25.450 rows=40)
Sort Key: (count(o.id)) DESC
-> HashAggregate (cost=900.00..950.00 rows=40)
Group By: u.user_id, u.name
-> Hash Left Join (cost=100.00..500.00 rows=1000)
Hash Cond: (u.user_id = o.user_id)
-> Seq Scan on users u (cost=0.00..50.00 rows=40)
-> Hash (cost=75.00..75.00 rows=1000)
-> Seq Scan on orders o (cost=0.00..75.00 rows=1000)
What to read:
- Seq Scan = Sequential scan (reading every row). Bad if you can index it.
- Index Scan = Using an index. Good.
- Hash Join = Putting one side in memory, scanning the other. Usually fast.
- Nested Loop = For each row in left, scan right. Usually slow for large datasets.
Cost Numbers
cost=start..end rows=estimated
actual time=start..end rows=actual
- cost=0.00..50.00 = estimated startup cost 0.00, estimated total cost 50.00 (planner units, not time)
- rows=40 = the planner estimated 40 rows
- actual time=25.432..25.450 = the node actually took about 25 ms
Red flags:
rows=40 estimated, rows=8000 actual ← the estimate is way off!
This means the query planner made a bad guess, leading to suboptimal plan choice.
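The usual fix is to refresh or improve the planner's statistics. A sketch (table and column names assumed from the earlier examples):
-- Refresh statistics for the table
ANALYZE orders;
-- For skewed columns, sample more values (the default statistics target is 100)
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
-- Extended statistics help when columns are correlated (PostgreSQL 10+)
CREATE STATISTICS orders_user_status (dependencies) ON user_id, status FROM orders;
ANALYZE orders;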
Using Table Plus for Visual Plans
Table Plus displays execution plans visually:
SORT (DESC)                1000 ms, 40 rows
  HASH AGGREGATE            500 ms, 40 rows
    HASH LEFT JOIN          100 ms, 1,000 rows
      SEQ SCAN users         25 ms
      HASH orders            75 ms
Visually, you see:
- How long each step takes
- Row counts at each step
- Where time is spent
Go Integration: Analyzing Plans Programmatically
package main
import (
"database/sql"
"encoding/json"
"log"
_ "github.com/lib/pq"
)
type PlanNode struct {
NodeType string `json:"Node Type"`
StartupCost float64 `json:"Startup Cost"`
TotalCost float64 `json:"Total Cost"`
EstimatedRows float64 `json:"Plan Rows"`
ActualRows int64 `json:"Actual Rows"`
Plans []PlanNode `json:"Plans"`
}
func analyzePlan(db *sql.DB, query string) error {
explainQuery := "EXPLAIN (FORMAT JSON, ANALYZE) " + query
var jsonPlan string
err := db.QueryRow(explainQuery).Scan(&jsonPlan)
if err != nil {
return err
}
var plans []map[string]interface{}
err = json.Unmarshal([]byte(jsonPlan), &plans)
if err != nil {
return err
}
// Parse and analyze
for _, plan := range plans {
if planNode, ok := plan["Plan"].(map[string]interface{}); ok {
log.Printf("Cost: %v, Rows: %v",
planNode["Total Cost"],
planNode["Actual Rows"])
}
}
return nil
}
Part 4: Advanced Indexing - Beyond B-Trees
Most developers only know B-Tree indexes. But PostgreSQL offers 6+ index types, each optimized for different scenarios.
Index Types at a Glance
- B-Tree (default) → general purpose, supports equality and range comparisons
- Hash → equality only, very fast lookups
- GiST (Generalized Search Tree) → geometric data, full-text search
- GIN (Generalized Inverted Index) → arrays, JSONB, full-text
- BRIN (Block Range Index) → very large tables with naturally ordered data
- SP-GiST → space-partitioned data structures (quadtrees, tries)
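Each non-default type is created with USING. A few sketches (table and column names, such as users.email, are assumed for illustration):
-- B-Tree (default): equality and range predicates
CREATE INDEX idx_orders_date ON orders (order_date);
-- Hash: equality-only lookups
CREATE INDEX idx_users_email_hash ON users USING HASH (email);
-- BRIN: huge, append-mostly tables where values follow physical row order
CREATE INDEX idx_orders_date_brin ON orders USING BRIN (order_date);
-- GIN: JSONB documents and arrays (see Part 7)
CREATE INDEX idx_profiles_data ON user_profiles USING GIN (data);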
Strategy 1: Composite Indexes Matter
Column order in a composite index affects performance dramatically.
-- Query: find one user's orders with a given status
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- Less useful index (status first)
CREATE INDEX idx_orders_status_userid ON orders(status, user_id);
-- Works for this exact query, but can't help queries that filter on user_id alone
-- Better index (lead with the more selective column you also query on its own)
CREATE INDEX idx_orders_userid_status ON orders(user_id, status);
-- Narrows by user_id first, then status; also serves WHERE user_id = 123 by itself
Compare the two plans with EXPLAIN ANALYZE, or inspect the indexes in DBeaver, to see the difference.
Strategy 2: Partial Indexes for Filtered Data
-- Instead of indexing all orders:
CREATE INDEX idx_all_orders ON orders(user_id); -- 10M rows
-- Index only active orders (90% smaller):
CREATE INDEX idx_active_orders ON orders(user_id)
WHERE status = 'active' OR status = 'pending'; -- 1M rows
Queries that match the WHERE clause use the smaller index:
SELECT * FROM orders WHERE user_id = 123 AND status = 'active';
-- Uses the partial index! Much faster.
Strategy 3: Covering Indexes
Include non-key columns to enable index-only scans:
-- Without covering:
CREATE INDEX idx_orders_user ON orders(user_id);
SELECT order_id, amount FROM orders WHERE user_id = 123;
-- Index used to find rows, then table accessed for amount
-- With covering:
CREATE INDEX idx_orders_user_covering ON orders(user_id) INCLUDE (order_id, amount);
SELECT order_id, amount FROM orders WHERE user_id = 123;
-- Everything in index! Zero table access.
Detecting Index Bloat with PostgreSQL CLI
-- In psql, \d+ orders lists the table's indexes and their sizes
\d+ orders
-- Query to find unused indexes
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
-- idx_scan = 0 means the index has never been used
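To see how much space those unused indexes waste, the same statistics view combines with PostgreSQL's size functions. A sketch:
SELECT
    relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
-- Candidates for DROP INDEX CONCURRENTLY (check replicas and rare queries first)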
Part 5: Transaction Isolation - ACID in Practice
The Four Isolation Levels
Each level allows different anomalies (inconsistencies):
- READ UNCOMMITTED → dirty reads allowed (PostgreSQL silently treats this level as READ COMMITTED)
- READ COMMITTED → non-repeatable reads allowed (a value can change between two reads)
- REPEATABLE READ → phantom reads allowed by the SQL standard (PostgreSQL's implementation actually prevents them)
- SERIALIZABLE → no anomalies (slowest, most restrictive)
Testing Isolation Levels with Table Plus
Open two sessions in Table Plus:
Session 1:
BEGIN;
SELECT balance FROM accounts WHERE id = 1;
-- Read 1: balance is $1000
-- (keep this transaction open while Session 2 runs)
Session 2 (while Session 1 is open):
UPDATE accounts SET balance = 500 WHERE id = 1;
COMMIT;
Session 1 (reads again before committing - what does it see?):
Isolation Level   | Read 1 | Read 2 (after Session 2 commits)
------------------|--------|----------------------------------
READ UNCOMMITTED  | $1000  | $500   (PostgreSQL treats this level as READ COMMITTED)
READ COMMITTED    | $1000  | $500   (non-repeatable read)
REPEATABLE READ   | $1000  | $1000  (consistent snapshot)
SERIALIZABLE      | $1000  | $1000  (consistent snapshot)
Setting Isolation Levels
-- Session 1
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; -- must be the first statement after BEGIN
SELECT balance FROM accounts WHERE id = 1;       -- $1000
-- Session 2 (concurrently)
UPDATE accounts SET balance = 500 WHERE id = 1;
-- Session 1 still sees its snapshot
SELECT balance FROM accounts WHERE id = 1;       -- Still $1000!
COMMIT;
Deadlock Detection and Resolution
-- Transaction 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
SELECT pg_sleep(5); -- wait 5 seconds
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- Transaction 2 (at the same time)
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
SELECT pg_sleep(5); -- wait 5 seconds
UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- DEADLOCK!
COMMIT;
Both transactions wait for each other. Deadlock detected!
PostgreSQL automatically aborts one transaction:
ERROR: deadlock detected
Resolution in Go:
func transferWithRetry(db *sql.DB, fromID, toID int, amount float64) error {
maxRetries := 3
for attempt := 0; attempt < maxRetries; attempt++ {
err := transfer(db, fromID, toID, amount)
if err != nil && strings.Contains(err.Error(), "deadlock") {
log.Printf("Deadlock detected, retry %d/%d", attempt+1, maxRetries)
time.Sleep(time.Millisecond * time.Duration(100*(attempt+1)))
continue
}
return err
}
return fmt.Errorf("deadlock after %d retries", maxRetries)
}
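Retries are the safety net, but the cleaner fix is to avoid the deadlock in the first place by always acquiring locks in a consistent order. A sketch: if every transfer touches account rows in ascending id order, the circular wait above cannot occur.
-- Both transactions update the lower id first, then the higher id
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- lower id first
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- higher id second
COMMIT;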
Part 6: Partitioning - Scaling to Billions
Partitioning divides large tables into smaller, manageable pieces.
Range Partitioning (Most Common)
-- Create parent table (not holding data)
CREATE TABLE orders (
id BIGSERIAL,
user_id INT,
order_date DATE,
amount DECIMAL,
PRIMARY KEY (id, order_date)
) PARTITION BY RANGE (order_date);
-- Create partitions by year
CREATE TABLE orders_2024 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE orders_2025 PARTITION OF orders
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
-- New orders automatically routed to correct partition
INSERT INTO orders (user_id, order_date, amount)
VALUES (123, '2025-06-15', 299.99);
-- Goes into orders_2025 automatically!
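Two housekeeping details worth adding (both PostgreSQL 11+): a DEFAULT partition catches rows that fall outside every declared range, and an index created on the parent is automatically created on each partition.
-- Catch rows whose order_date doesn't match any declared partition
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
-- Indexes on the parent cascade to every partition (existing and future)
CREATE INDEX idx_orders_user_id ON orders (user_id);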
Automatic Partition Maintenance with pg_partman
-- Install the pg_partman extension (typically into its own schema)
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;
-- Register the table for automatic partition management
SELECT partman.create_parent(
    'public.orders',
    'order_date',
    'native',
    'monthly'  -- create a new partition every month
);
-- pg_partman's maintenance job (run_maintenance, scheduled via its background
-- worker or cron) now creates future partitions for you
Query Planning with Partitions
EXPLAIN ANALYZE
SELECT * FROM orders WHERE order_date BETWEEN '2025-01-01' AND '2025-03-31';
Output shows partition elimination:
Append (cost=0.00..500.00)
-> Seq Scan on orders_2025 (cost=0.00..500.00) β Only scans 2025!
Filter: (order_date BETWEEN 2025-01-01 AND 2025-03-31)
-- Orders_2024, orders_2023, etc. are skipped!
This is the power of partitioning: scanning only relevant partitions.
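Partitioning also makes data retention cheap: instead of deleting millions of rows, detach and drop a whole partition. A sketch:
-- Remove 2024 data without a massive DELETE (and without bloating the table)
ALTER TABLE orders DETACH PARTITION orders_2024;
-- orders_2024 is now a standalone table: archive it, or drop it
DROP TABLE orders_2024;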
Part 7: JSON and Document Data
Modern applications often store JSON in relational databases. SQL provides powerful tools for querying it.
JSONB vs JSON
CREATE TABLE user_profiles (
id INT,
metadata JSON, -- Text storage, slower queries
data JSONB -- Binary storage, faster queries
);
-- JSONB is almost always better
Querying JSON
INSERT INTO user_profiles (id, data) VALUES (1, '{"name": "Alice", "age": 30, "tags": ["admin", "developer"]}');
-- Extract values
SELECT
data->>'name' as name, -- Text value
(data->>'age')::INT as age, -- Cast to integer
data->'tags' as tags_array, -- JSON array
data->'tags'->>0 as first_tag -- First array element
FROM user_profiles;
-- Results:
-- name: Alice
-- age: 30
-- tags_array: ["admin", "developer"]
-- first_tag: admin
Indexing JSON for Performance
-- A GIN index on the whole document speeds up containment (@>) and existence (?) operators
CREATE INDEX idx_user_profiles_data ON user_profiles USING GIN (data);
SELECT * FROM user_profiles WHERE data @> '{"name": "Alice"}';
-- For plain equality on one key, use a B-Tree expression index instead:
CREATE INDEX idx_user_name ON user_profiles ((data->>'name'));
SELECT * FROM user_profiles WHERE data->>'name' = 'Alice';
-- Targeted GIN index for just the tags array:
CREATE INDEX idx_user_tags ON user_profiles USING GIN ((data->'tags'));
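Beyond extraction, JSONB supports containment and existence operators (which the GIN index accelerates) and in-place updates. A few sketches against the same user_profiles table:
-- Containment: profiles whose tags array includes 'admin'
SELECT * FROM user_profiles WHERE data @> '{"tags": ["admin"]}';
-- Key existence: profiles that have an age field at all
SELECT * FROM user_profiles WHERE data ? 'age';
-- Expand a JSON array into one row per element
SELECT id, jsonb_array_elements_text(data->'tags') as tag FROM user_profiles;
-- Update a single key without rewriting the document in the application
UPDATE user_profiles SET data = jsonb_set(data, '{age}', '31') WHERE id = 1;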
Real-World: API Response Storage
type UserEvent struct {
ID int
UserID int
EventType string
Payload json.RawMessage // Store raw JSON
CreatedAt time.Time
}
-- In the database:
INSERT INTO user_events (user_id, event_type, payload)
VALUES (
123,
'purchase',
'{"amount": 99.99, "product_id": 456, "currency": "USD"}'::jsonb
);
-- Query events with specific payload values:
SELECT * FROM user_events
WHERE payload->>'currency' = 'USD'
AND (payload->>'amount')::DECIMAL > 50;
Part 8: Full-Text Search - Beyond LIKE
LIKE searches are slow. Full-text search is fast and intelligent.
The Problem with LIKE
-- This is slow on large datasets
SELECT * FROM products WHERE name LIKE '%laptop%';
-- Scans every row, checks substring match
-- This is even slower
SELECT * FROM products WHERE name LIKE '%lap%top%';
-- Multiple substring matches per row
Full-Text Search Solution
-- Create full-text index
CREATE INDEX idx_products_search ON products USING GIN (
to_tsvector('english', name || ' ' || description)
);
-- Query is fast
SELECT * FROM products
WHERE to_tsvector('english', name || ' ' || description) @@
plainto_tsquery('english', 'laptop');
-- Uses index, returns instantly
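Writing out to_tsvector(...) in every query is verbose, and ranking recomputes it for each matched row. On PostgreSQL 12+, a stored generated column keeps the tsvector precomputed; a sketch (the column name search_vector is an assumption):
ALTER TABLE products
    ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, ''))
    ) STORED;
CREATE INDEX idx_products_search_vector ON products USING GIN (search_vector);
-- Queries get shorter and stay index-backed:
SELECT * FROM products
WHERE search_vector @@ plainto_tsquery('english', 'laptop');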
Relevance Ranking
SELECT
id,
name,
ts_rank(
to_tsvector('english', name || ' ' || description),
plainto_tsquery('english', 'laptop')
) as relevance
FROM products
WHERE to_tsvector('english', name || ' ' || description) @@
plainto_tsquery('english', 'laptop')
ORDER BY relevance DESC;
-- Results ranked by relevance:
-- "Gaming Laptop 17 Inch" (relevance: 0.95)
-- "Laptop Cooling Pad" (relevance: 0.65)
-- "Backpack for Laptop" (relevance: 0.48)
Fuzzy Matching with Trigrams
-- Typo tolerance via trigram similarity (requires the pg_trgm extension)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
SELECT * FROM products
WHERE similarity(name, 'lapto') > 0.3; -- rows whose name is similar to 'lapto'
-- Returns "laptop", "laptops", etc.
-- Ranked by similarity
SELECT
name,
similarity(name, 'lapto') as match_score
FROM products
WHERE similarity(name, 'lapto') > 0.3
ORDER BY match_score DESC;
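To make fuzzy matching index-backed, add a trigram GIN index and filter with the % operator, which uses the pg_trgm.similarity_threshold setting (0.3 by default) instead of computing similarity() for every row:
CREATE INDEX idx_products_name_trgm ON products USING GIN (name gin_trgm_ops);
SELECT
    name,
    similarity(name, 'lapto') as match_score
FROM products
WHERE name % 'lapto'  -- index-assisted fuzzy match
ORDER BY match_score DESC;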
Part 9: Putting It All Together - The Advanced SQL Developer
Now you understand the 8 techniques. Here's how they combine in real applications:
Example: E-Commerce Analytics Platform
WITH product_monthly AS (
    SELECT
        p.product_id,
        p.name,
        DATE_TRUNC('month', o.order_date) as month,
        COUNT(DISTINCT o.id) as total_orders,
        SUM(oi.quantity) as total_sold,
        AVG(oi.unit_price) as avg_price
    FROM products p
    JOIN order_items oi ON p.product_id = oi.product_id
    JOIN orders o ON oi.order_id = o.id
    WHERE o.order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
    GROUP BY p.product_id, p.name, DATE_TRUNC('month', o.order_date)
),
trending_products AS (
    SELECT
        product_id,
        name,
        month,
        total_orders,
        total_sold,
        avg_price,
        RANK() OVER (PARTITION BY month ORDER BY total_sold DESC) as popularity_rank,
        ROUND(
            (total_sold::NUMERIC / NULLIF(LAG(total_sold) OVER (
                PARTITION BY product_id ORDER BY month
            ), 0) - 1) * 100, 2
        ) as growth_percent
    FROM product_monthly
)
SELECT * FROM trending_products
WHERE month = DATE_TRUNC('month', CURRENT_DATE)
  AND popularity_rank <= 100
  AND growth_percent > 50
ORDER BY growth_percent DESC;
This single query combines:
- Window functions (RANK, LAG)
- CTEs (product_monthly, trending_products)
- Aggregation (GROUP BY)
- Joins (multiple tables)
All executed efficiently by the database.
Conclusion: The Path Forward
You now understand the 8 advanced SQL techniques that separate tier-2 from tier-3 developers.
Next steps:
- Practice with Table Plus - Visualize execution plans for your queries
- Benchmark in Go - Compare different approaches with real performance data
- Use DBeaver - Explore index statistics and optimization opportunities
- Build projects - Apply these techniques to real problems
The developers who master SQL become database efficiency experts in their organizations. They solve in SQL what others would solve in application code. They understand where the real work happens.
That's the power of advanced SQL.
Recommended Reading:
- PostgreSQL Official Documentation (Chapters on Queries, Indexes, Performance)
- "PostgreSQL 14 Internals" by Egor Rogov
- Dimitri Fontaine's "The Art of PostgreSQL"
Tools to Master:
- PostgreSQL CLI (psql) - Raw power
- Table Plus - Visual exploration
- DBeaver - Enterprise features
- pgAdmin - Web-based management
- Go + pgx - Application integration
Start with one technique. Master it. Move to the next. Six months from now, you'll be the SQL expert on your team.