Skip to main content

Chapter 7 · Performance Optimization

Performance work is a loop, not a finish line. Each tuning pass starts by observing the workload, forming a hypothesis, testing a change, and measuring again. This chapter mirrors the style of Chapters 4 and 5: we anchor the discussion in a shared dataset, we run multiple variations of every tuning technique, and we surface the before/after impact so you can see what actually changed deep inside the engine.


7.0. Lab Dataset and Baseline Metrics

Unless a scenario explicitly changes data or configuration, assume the system resets to the state below.

Customers

CustomerIDRegionSegmentCreatedAt
1001NA-WestPremium2021-01-04
1002NA-EastStandard2021-02-18
1003EMEAPremium2021-03-22
1004APACStartup2021-04-09
1005NA-WestStandard2021-05-15

Orders (≈ 5 million rows in the simulated workload)

OrderIDCustomerIDOrderedAtStatusTotalAmount
900000110012024-06-01 10:14:03SHIPPED185.40
900000210022024-06-01 10:15:11PENDING42.50
900000310052024-06-01 10:15:36SHIPPED79.99
900000410032024-06-01 10:16:02CANCELLED349.00

OrderLineItems (≈ 20 million rows)

LineIDOrderIDProductIDQtyUnitPrice
7000019000001P-4432292.70
7000029000001P-7788164.00

SlowQueryLog (captured over 24 hours)

QueryHashSampleTextExecCountAvgTimeMsMaxTimeMs
H1SELECT * FROM Orders WHERE CustomerID = ?48,51283.40612.52
H2SELECT SUM(TotalAmount) FROM Orders WHERE Status=?6,14365.18190.04
H3SELECT … FROM Orders JOIN OrderLineItems …2,901214.72841.88

Baseline metrics (no tuning applied yet)

MetricValue
P95 latency for Orders lookup612 ms
Total logical reads per minute1.8 million
Buffer cache hit ratio92%
Write journal stall rate0.8%

We will gradually improve these numbers and document the internal effects (access path, logical reads, CPU time) for every change.


7.1. Index Tuning in Practice

Indexes shape the access path. Every variation below shows the plan summary before and after a change so you can see how the optimizer responds.

Variation A – Missing index vs selective single-column index

Query:

SELECT OrderID, OrderedAt, TotalAmount
FROM Orders
WHERE CustomerID = 1003;
StateAccess PathRows ScannedAvgTimeMsNotes
BeforeSeq Scan on Orders5,000,00083.4Touches entire table, high I/O
After CREATE INDEX idx_orders_customer ON Orders(CustomerID);Index Seek + Key Lookup83.1Optimizer jumps directly to matching customer rows

Inside the buffer cache after indexing

MetricBeforeAfter
Logical reads per execution18,45032
Physical reads per execution2201
CPU time per execution (ms)422

The huge drop comes from eliminating the full table scan. Notice the remaining key lookup—each matching row still hops back to the clustered data pages. We remove that next.

Variation B – Turning the index into a covering index

Extended query:

SELECT OrderID, OrderedAt, TotalAmount
FROM Orders
WHERE CustomerID = 1003;
CREATE INDEX idx_orders_customer_cover
ON Orders(CustomerID)
INCLUDE (OrderedAt, TotalAmount);
DROP INDEX idx_orders_customer;
StateAccess PathKey LookupsAvgTimeMsNotes
With non-covering indexIndex Seek + Lookup33.1Each lookup fetches data page
With covering indexIndex-only Seek01.7Plan never touches base table

Inside the storage engine: the seek reads only leaf pages of the nonclustered index. Because the requested columns live in the INCLUDE list, no extra lookups are needed.

Variation C – Composite index order matters (left-prefix rule)

Typical workload: filter by Status, then order by OrderedAt DESC.

-- Option 1 (incorrect order)
CREATE INDEX idx_orders_status_amount ON Orders(Status, TotalAmount);

-- Option 2 (matches workload)
CREATE INDEX idx_orders_status_orderedat ON Orders(Status, OrderedAt DESC);
IndexFilter + ORDER BY supportPlan ShapeAvgTimeMsObservation
(Status, TotalAmount)Poor (OrderedAt missing)Index Scan + Sort74.2Sort spills to tempdb
(Status, OrderedAt)ExcellentIndex Seek, no Sort8.6Plan streams rows already sorted

Because the leading column in an index is the entry point, reordering columns to match filter → sort order removes the expensive sort step.

Variation D – Partial/filtered index

Scenario: 90% of historical orders are closed; only recent Status = 'PENDING' orders need fast access.

CREATE INDEX idx_orders_pending_recent
ON Orders(OrderedAt DESC)
WHERE Status = 'PENDING' AND OrderedAt >= CURRENT_DATE - INTERVAL '30 days';
QueryAccess PathPages TouchedAvgTimeMs
Fetch recent pending ordersBitmap Heap Scan (before)120,00096.0
Partial Index Seek (after)1,2404.2

Filtered indexes dramatically reduce index size and maintenance overhead while still accelerating the hot subset of data.

Variation E – When an index hurts

Adding an index is not always a win. On a heavy-write table, every insert must maintain the index.

OperationNo Extra IndexWith Extra IndexDelta
Batch insert 10k orders1.2 s2.8 s+133% latency
Transaction log growth140 MB280 MB+100%

Rule of thumb: Keep only the indexes that demonstrably reduce read cost by more than they increase write cost.


7.2. Query Rewrites and Execution Plans

The optimizer chooses a plan based on the SQL you hand it. Small syntactic tweaks can unlock better plans. We’ll inspect EXPLAIN summaries (simplified) and key metrics.

Variation A – SELECT * vs targeted projection

-- Original
SELECT * FROM Orders WHERE CustomerID = 1003;

-- Rewrite
SELECT OrderID, OrderedAt, TotalAmount
FROM Orders
WHERE CustomerID = 1003;
MetricSELECT *Targeted columnsExplanation
Projected columns123Narrow projection lowers memory pressure
Key lookups30 (with covering index)Fewer columns mean covering index is feasible
Avg network payload9.5 KB2.1 KBLess data crossing the wire

The rewrite plus covering index shrinks result size and eliminates extra lookups, aligning perfectly with Variation B above.

Variation B – Making predicates SARGable

-- Original (non-SARGable)
SELECT OrderID
FROM Orders
WHERE DATE(OrderedAt) = CURRENT_DATE;

-- Rewrite
SELECT OrderID
FROM Orders
WHERE OrderedAt >= CURRENT_DATE
AND OrderedAt < CURRENT_DATE + INTERVAL '1 day';
Plan MetricOriginalRewritten
Access pathSeq ScanIndex Seek on OrderedAt
Rows scanned5,000,00018,240
AvgTimeMs148.65.4

Moving the function off the column allows the optimizer to use the time-based index. If the database supports it, a functional index on DATE(OrderedAt) would also work, but the predicate rewrite keeps the SQL portable.

Variation C – Replacing correlated subquery with JOIN

-- Original
SELECT o.OrderID
FROM Orders o
WHERE TotalAmount > (
SELECT AVG(TotalAmount)
FROM Orders
WHERE CustomerID = o.CustomerID
);

-- Rewrite
WITH CustomerAverages AS (
SELECT CustomerID, AVG(TotalAmount) AS AvgAmount
FROM Orders
GROUP BY CustomerID
)
SELECT o.OrderID
FROM Orders AS o
JOIN CustomerAverages AS ca
ON ca.CustomerID = o.CustomerID
WHERE o.TotalAmount > ca.AvgAmount;
MetricCorrelated SubqueryCTE + Join
Executions of inner query5,000,0001
Logical reads92 million1.5 million
AvgTimeMs84263

Variation D – Splitting troublesome OR predicates

-- Original
SELECT OrderID
FROM Orders
WHERE Status = 'PENDING' OR Region = 'EMEA';

-- Rewrite
SELECT OrderID
FROM Orders
WHERE Status = 'PENDING'
UNION ALL
SELECT OrderID
FROM Orders
WHERE Region = 'EMEA' AND Status <> 'PENDING';
Plan MetricOriginalRewritten
Access pathSeq ScanTwo index seeks
AvgTimeMs11618
Duplicate handlingImplicitExplicit (second query filters duplicates)

Splitting the predicate lets the optimizer exploit the Status and Region indexes independently.

Variation E – Window function versus self-join

-- Ancient pattern using self-join
SELECT a.OrderID
FROM Orders a
LEFT JOIN Orders b
ON a.CustomerID = b.CustomerID
AND a.OrderedAt < b.OrderedAt
WHERE b.OrderID IS NULL;

-- Modern window function
SELECT OrderID
FROM (
SELECT OrderID,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderedAt DESC) AS rn
FROM Orders
) AS ranked
WHERE rn = 1;
MetricSelf-joinWindow function
Intermediate rows25 million5 million
Temp space used4.2 GB480 MB
AvgTimeMs41062

7.3. Statistics, Plans, and Maintenance Windows

Indexes are only as good as the statistics that feed the optimizer. When those stats drift, good indexes look invisible.

Variation A – Updating statistics

ActionP95 latency (ms)Estimated rows accuracy
Before ANALYZE Orders;118Estimates off by 8×
After ANALYZE Orders;42Estimates within 1.2×

Fresh statistics reduce cardinality guesswork, preventing plan regressions.

Variation B – Rebuild vs Reorganize

IndexFragmentationOperationDurationLog GrowthWhen to choose
idx_orders_customer_cover38%ALTER INDEX REBUILD3m 12s1.4 GBUse when fragmentation > 30%
idx_orders_status_orderedat12%ALTER INDEX REORGANIZE42s180 MBUse when 5–30%

Rebuild rewrites the index from scratch (heavy but thorough). Reorganize defragments in-place (lighter but slower to converge).

Variation C – Plan cache inspection

QueryHashCached PlansAvg CPU msRecompile Reason
H1118Stable
H2485Parameter sniffing
H32202Statistics changed

If a query has wildly different parameter values, consider parameter-sensitive plan techniques: OPTIMIZE FOR hints, plan guides, or rewriting to use recompile.


7.4. Workload Governance and Observability

Connection pooling vs ad-hoc bursts

ScenarioActive sessionsCPU saturationWait type
Without pooling12088%THREADPOOL
With 32-pool limit3455%PAGEIOLATCH

Fewer, longer-lived connections reduce context switching and improve plan reuse.

Baseline monitoring checklist

Time RangeKey MetricThresholdAction
Every 5 minTop wait eventsLCK_ > 5%Inspect blocking tree
HourlyBuffer cache hit ratio< 90%Investigate index-only scans or cache size
NightlyAutovacuum / autostats lag> 2 hrsTrigger manual maintenance
WeeklySlowQueryLog deltas> 10% growthRe-run tuning playbook

7.5. Putting It Together

Tuning MovePrimary BenefitSecondary CostWhen to Roll Back
Add covering indexSlashes logical readsHigher write amplificationWrite-heavy bursts slow noticeably
Rewrite correlated subqueryReduces CPU and temp I/OMore complex SQLQuery plan still scans millions of rows
Update statisticsFixes cardinality estimatesTemporary CPU spikeProduction window too tight
Limit pool sizeStabilizes resource usageSlightly higher queue waitQueue wait > SLA threshold

Remember: Performance tuning is applied science. Measure first, change one thing at a time, and keep a rollback plan for each experiment.