Search everything...

Skill

Performance Optimizer

Optimizes SAP Datasphere performance: diagnoses slow views/queries, timeouts, data flow lags, high CPU/memory; tunes execution plans, indexes, partitions, storage.

performance

database

npx claudepluginhub mariodefelipe/sap-datasphere-plugin-for-claude-cowork

Tool Access

This skill uses the workspace's default tool permissions.

Preview

The Performance Optimizer skill helps you identify and resolve performance issues in SAP Datasphere. Whether your views are running slowly, queries are timing out, or data flows are consuming excessive resources, this skill provides a systematic approach to diagnose and fix performance problems at every layer: views, queries, data flows, and storage.

Supporting Assets

references/diagnostic-procedures.mdreferences/optimization-techniques.md

SKILL.md

Similar Skills

using-git-worktrees

169.2k

Creates isolated Git worktrees for feature branches with prioritized directory selection, gitignore safety checks, auto project setup for Node/Python/Rust/Go, and baseline verification.

superpowers

subagent-driven-development

169.2k

Executes implementation plans in current session by dispatching fresh subagents per independent task, with two-stage reviews: spec compliance then code quality.

3 files

superpowers

dispatching-parallel-agents

169.2k

Dispatches parallel agents to independently tackle 2+ tasks like separate test failures or subsystems without shared state or dependencies.

superpowers

Stats

Stars12

Forks3

Last CommitFeb 16, 2026

Actions

View Source View Plugin View on GitHub View README

Performance Optimizer

From datasphere

Optimizes SAP Datasphere performance: diagnoses slow views/queries, timeouts, data flow lags, high CPU/memory; tunes execution plans, indexes, partitions, storage.

performance

database

npx claudepluginhub mariodefelipe/sap-datasphere-plugin-for-claude-cowork

Tool Access

This skill uses the workspace's default tool permissions.

Preview

Supporting Assets

references/diagnostic-procedures.mdreferences/optimization-techniques.md

SKILL.md

Performance Optimizer Skill

Overview

When to Use This Skill

Trigger this skill when you encounter:

Views or queries running slower than expected
Query timeouts or cancellations
High memory or CPU consumption
Data flow execution taking excessive time
Replication flows with poor delta performance
Need to optimize storage usage
Capacity warnings or resource contention alerts
Performance regression after schema changes
Need to benchmark performance improvements

Performance Analysis Approach

1. Identify the Bottleneck

Start by pinpointing where performance degradation occurs:

View Performance Issues:

Access the View Analyzer tool in your view's details
Check execution time trends over the last 7-30 days
Compare execution times across different consumer queries
Identify which views are called most frequently
Note views with long initialization times

Query Performance Issues:

Review query execution statistics in task logs
Use Explain Plan to understand query execution strategy
Check statement logs for query duration and resource usage
Identify queries with full table scans or inefficient joins
Monitor query frequency and timing patterns

Data Flow Performance Issues:

Check data flow execution logs for step-level timing
Monitor initial load duration vs. delta load duration
Review parallelism settings and actual utilization
Check for data quality issues causing processing overhead
Analyze memory allocation and spill events

2. Measure Current Performance

Establish baseline metrics before optimization:

Key Metrics to Capture:

Query execution time (wall-clock and CPU time)
Memory usage (peak and average)
Data volume processed (rows and bytes)
I/O operations and throughput
Number of disk accesses vs. in-memory operations
Index usage patterns
Cache hit rates
Parallelism level achieved
Storage tier distribution

Use the MCP tools to gather metrics:

execute_query: Run diagnostic queries to capture baseline stats
analyze_column_distribution: Understand data skew and selectivity
get_table_schema: Review column types and indexing
get_task_status: Monitor execution metrics

3. Optimize

Apply targeted optimizations based on findings (see following sections).

4. Validate

Re-measure performance after changes:

Compare new metrics against baseline
Verify performance improvement meets targets
Monitor for 7-14 days to ensure consistency
Check for negative side effects on other queries
Document changes for future reference

View Analyzer

The View Analyzer provides insights into view execution performance and resource consumption.

Interpreting View Analyzer Results

Execution Time Breakdown:

Initialization Time: Time to prepare execution plan and allocate resources. High values indicate complex view logic or many dependent views.
Execution Time: Actual query processing time. Dominated by data retrieval and transformation.
Fetch Time: Time to return results to consumer. Long fetch times indicate large result sets or network latency.

Resource Consumption:

Peak Memory: Maximum memory used during execution. Watch for memory spills to disk.
CPU Time: Total CPU cycles consumed. High CPU indicates data processing-heavy operations.
I/O Throughput: Data read from storage. Compare with data returned to find filtering efficiency.

Identifying Expensive Operations

High Memory Consumers:

Joins with large intermediate result sets
Aggregations without pre-filtering
Complex subqueries creating multiple temporary tables
Non-persisted views with repeated calculations

CPU-Intensive Operations:

Complex calculations and expressions
Multiple levels of nested aggregations
String operations on large text fields
Date/time calculations without optimization

I/O-Heavy Operations:

Full table scans on large tables
Inefficient join strategies
Missing indexes on join keys
Reading unnecessary columns

Explain Plan Analysis

The Explain Plan shows exactly how Datasphere executes your query, revealing inefficiencies at each step.

Reading Execution Plans

Plan Structure:

Node Type: SCAN (table access), JOIN, AGGREGATE, SORT, FILTER, etc.
Input Rows: Rows input to this node
Output Rows: Rows produced by this node
Selectivity: Output/Input ratio. Values < 0.1 indicate effective filtering.
Estimated Time: Predicted execution time
Actual Time: Measured execution time (if available)

Spotting Full Table Scans

Indicators:

SCAN node with no index specification
Input rows equal to table row count
High selectivity discrepancy (many input rows, few output rows)

Solutions:

Add index on WHERE clause columns
Add covering index including SELECT columns
Consider partitioning with partition-aware queries
Rewrite to improve filter pushdown

Identifying Join Inefficiencies

Red Flags:

Join producing result set larger than inputs (indicates Cartesian product)
Very high intermediate row counts before filtering
Hash joins with small build tables (should use index joins)
Join order processing large tables first

Solutions:

Reorder joins to process smallest tables first
Add indexes on join keys
Apply filters before joins (reduce probe table size)
Consider materialization of frequently-joined dimensions
Use broadcast tables for small dimension tables

Common Explain Plan Anti-Patterns

Pattern: Multiple Cascading Aggregations

AGGREGATE (GROUP BY col1, col2, col3)
  AGGREGATE (GROUP BY col1, col2)
    AGGREGATE (GROUP BY col1)
      SCAN big_table

Issue: Each aggregation processes full dataset Solution: Combine aggregations or materialize intermediate results

Pattern: Cartesian Join

JOIN (on false condition)
  SCAN table1 (1M rows)
  SCAN table2 (1M rows)

Issue: 1T row intermediate result Solution: Fix join condition, verify ON clause logic

Pattern: Late Filtering

FILTER (where price > 100)
  JOIN table1 (1M rows) with table2 (1M rows)

Issue: Join processes all rows before filtering Solution: Move WHERE clause into source tables (pushdown)

View Optimization Strategies

Persistence Strategy

When to Persist:

Views consumed by many queries (reuse calculation)
Complex views with expensive calculations
Views used in real-time dashboards (reduce latency)
High-frequency views (cache results between accesses)
Views with stable underlying data

When NOT to Persist:

Frequently updated source data (maintenance cost)
Views with specific daily snapshots
Once-per-execution views
Views with minimal consumer count

Persistence Modes:

Runtime: Results cached in memory during session (no latency, high memory)
Disk-Based: Results written to disk (lower memory, I/O cost)
Materialized: Pre-calculated, updated on schedule (consistent performance, staleness risk)

Column Pruning

Remove unnecessary columns from views to reduce data volume:

Strategy:

Audit all consuming queries
Identify columns never referenced
Remove unused columns from source
Test dependent views for impact
Document retained columns and purpose

Benefits:

Reduced memory footprint
Faster data transfer
Smaller index sizes
Improved cache effectiveness

Partition Pruning

Organize data into partitions to reduce scanned volume:

Partition Key Selection:

High-cardinality columns with clear ranges (dates, regions)
Columns frequently in WHERE clauses
Columns supporting most common filters
Avoid partitioning on low-cardinality columns

Partition Strategy:

Time-Based: Year, month, day (best for temporal data)
Range-Based: Numeric ranges (best for numeric dimensions)
Hash-Based: Distribute evenly (when no natural partition key)
List-Based: Specific values (geographic regions, business units)

Partition Pruning in Queries:

SELECT * FROM sales
WHERE year = 2024 AND month IN (1,2,3)
-- Partition pruning eliminates months outside this range

Push-Down Optimization

Move filtering and aggregation to earliest possible stage:

Examples:

-- GOOD: Filters pushed to source tables
SELECT dept, COUNT(*)
FROM employees
WHERE hire_date >= '2023-01-01'  -- Pushed down
GROUP BY dept

-- BAD: Filter applied to pre-aggregated result (wrong answer potential)
SELECT dept, COUNT(*) as emp_count
FROM employees
GROUP BY dept
HAVING CAST(MAX(hire_date) AS DATE) >= '2023-01-01'

Benefits:

Reduces intermediate result sizes
Minimizes data movement
Allows index usage
Reduces memory pressure

Query Optimization Techniques

Index Usage

Index Types:

B-Tree Index: Default, efficient for range and equality queries
Hash Index: For equality lookups only, faster than B-Tree
Bitmap Index: For low-cardinality columns, space-efficient
Covering Index: Includes all columns needed (no table access required)

Index Selection:

Profile query WHERE and JOIN clauses
Create indexes on high-selectivity columns
Use composite indexes for multi-column conditions
Add covering indexes for frequently-run queries
Monitor index usage; drop unused indexes

Query Hints for Index Usage:

SELECT /*+ INDEX(orders order_date_idx) */
  order_id, amount
FROM orders
WHERE order_date = '2024-01-15'

Join Order Optimization

Heuristic: Smallest Table First

-- GOOD: Dimension first (smaller)
SELECT *
FROM departments d
  INNER JOIN employees e ON d.dept_id = e.dept_id
WHERE d.region = 'WEST'

-- BAD: Large fact table first
SELECT *
FROM sales_fact f
  INNER JOIN date_dim d ON f.date_id = d.date_id
WHERE d.year = 2024

Broadcast Strategy: Use broadcast join for small dimensions (< 100MB):

SELECT /*+ BROADCAST(dimension) */
  f.amount, d.category
FROM fact_table f
  INNER JOIN small_dimension d ON f.dim_id = d.id

Aggregation Strategies

Pre-Aggregation Pattern: Materialize frequently-aggregated views:

-- Materialized view updated daily
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT date, product_id, SUM(amount) as total_sales, COUNT(*) as order_count
FROM sales_fact
GROUP BY date, product_id

-- Consumer query now fast
SELECT product_id, SUM(total_sales)
FROM daily_sales_summary
WHERE date >= '2024-01-01'
GROUP BY product_id

Aggregation Push-Down:

-- GOOD: Aggregate at source
SELECT dept, SUM(salary)
FROM employees
WHERE status = 'ACTIVE'
GROUP BY dept

-- BAD: Aggregate after joining to dimensions
SELECT e.dept, SUM(e.salary)
FROM employees e
  INNER JOIN departments d ON e.dept_id = d.dept_id
GROUP BY e.dept

Data Flow Performance

Parallelism Configuration

Parallelism Settings:

DOP (Degree of Parallelism): Number of parallel worker threads
Default: Auto-calculated based on CPU cores and available memory
Manual Override: Set for specific flows with special requirements

Tuning Approach:

Baseline with default parallelism
Monitor CPU utilization (target 70-85%)
Increase DOP if CPU < 50% (data not distributed well)
Decrease DOP if CPU > 90% (contention, thread overhead)
Test with representative data volume

Batch Sizes

Batch Size Impact:

Small batches (1K rows): More overhead, better interactivity
Large batches (100K+ rows): Better throughput, more memory per batch
Optimal: Typically 10K-50K rows depending on row width and memory

Configuration:

Batch Size = Available Memory / (Row Width * 2)

Memory Allocation

Memory Distribution:

Allocate 60% to data buffers
20% to working memory (joins, aggregations)
20% to system overhead and safety margin

Monitoring:

Watch for memory spill messages (indicates insufficient allocation)
Monitor peak memory usage over time
Check for memory leaks in long-running flows

Replication Flow Performance

Initial Load Optimization

Full Table Replication:

Schedule during low-activity windows
Use maximum batch size for the table size
Verify target has sufficient staging space
Monitor progress via status logs
Validate row counts match source

Selective Replication:

-- Replicate only recent data
WHERE created_date >= CURRENT_DATE - 7

Benefits: Reduced network traffic, faster completion

Delta Performance Tuning

Change Data Capture (CDC) Optimization:

Ensure source system has CDC enabled
Configure appropriate change log retention
Set delta frequency (hourly, daily) based on volume
Monitor delta processing time trends
Alert on delta lag exceeding threshold

Performance Issues and Solutions:

Long delta processing: Increase parallelism, check for data anomalies
High memory usage: Reduce batch size, increase delta frequency
Network bottleneck: Verify network bandwidth, consider compression
Target contention: Schedule deltas during off-peak, increase parallel writers

Storage Optimization

Disk vs. In-Memory Strategy

In-Memory Advantages:

Sub-millisecond query latency
Efficient for repeated access
Better for interactive dashboards

In-Memory Disadvantages:

Limited capacity
Higher cost per GB
Unsuitable for very large tables

Disk Advantages:

Unlimited capacity
Cost-effective for archival data
Suitable for infrequently-accessed tables

Disk Disadvantages:

Higher query latency (100ms+)
Suitable for batch reporting

Decision Matrix:

High-frequency, small tables → In-Memory
Large, infrequently-accessed → Disk/Object Store
Medium-sized, moderate frequency → Hybrid (hot set in-memory)

Object Store Tiering (HDLF - Hadoop Distributed File System Layer)

Tier Organization:

Hot Tier: Frequently accessed, in-memory or SSD
Warm Tier: Moderate access, disk-based
Cold Tier: Archive data, object store (S3/blob)

Tiering Policy:

Data age < 30 days → Hot tier
Data age 30-90 days → Warm tier
Data age > 90 days → Cold tier

Benefits:

Optimal cost-performance balance
Automatic promotion/demotion
Transparent to queries
Improved capacity utilization

Monitoring and Alerting

Task Logs

Key Information:

Task execution time (start, end, duration)
Rows processed (read, written, failed)
Memory usage (peak, average)
Data quality metrics (error counts)
Dependencies (parent/child tasks)

Analysis:

Track execution time trends
Identify tasks with increasing duration
Monitor failure rates
Check for resource contention patterns

Statement Logs

Query Metrics:

Query text (full SQL)
Execution time breakdown (parse, optimize, execute)
Rows returned
Memory consumed
CPU time
Table accesses and I/O

Usage:

SELECT query_text, AVG(execution_time_ms) as avg_time, COUNT(*) as frequency
FROM statement_logs
WHERE timestamp >= CURRENT_DATE - 7
GROUP BY query_text
ORDER BY frequency DESC

Capacity Dashboards

Metrics to Monitor:

Overall CPU utilization
Memory consumption by tenant/space
Storage usage by object type
Network I/O
Query queue depth
Active sessions

Alerting Thresholds:

CPU > 80%: Investigate resource contention
Memory > 85%: Add capacity or optimize
Storage > 90%: Archive or clean data
Queue depth > 10: Performance degradation
Session count increasing: Check for runaway queries

Using MCP Tools for Performance Analysis

execute_query

Use to run diagnostic queries and capture explain plans:

execute_query(query="EXPLAIN SELECT ...", explain_type="full")

Returns execution plan with estimated and actual metrics.

analyze_column_distribution

Understand data distribution for optimal indexing:

analyze_column_distribution(table="sales", column="product_id")

Returns histogram of values, identifying skew and cardinality.

get_table_schema

Review table structure, indexes, and partitioning:

get_table_schema(table="orders")

Returns columns, data types, indexes, constraints, and statistics.

get_task_status

Monitor task execution metrics:

get_task_status(task_id="replication_flow_01")

Returns execution time, rows processed, memory used, and state.

Performance Optimization Workflow

Identify Issue: Run diagnostic queries, review task logs
Measure Baseline: Capture execution time, resource usage, row counts
Analyze Root Cause: Review Explain Plan, check for anti-patterns
Create Optimization Plan: Document proposed changes and expected impact
Implement Changes: Add indexes, modify view logic, adjust persistence
Test Thoroughly: Validate correctness and measure performance
Monitor Results: Track metrics for 1-2 weeks post-deployment
Document Learnings: Record successful optimizations for future reference

Best Practices

Always establish baseline metrics before optimization
Make one change at a time to isolate impact
Test in non-production environment first
Monitor for 2+ weeks after changes (catch edge cases)
Document all optimization decisions and rationale
Regular review of slow query logs (at least weekly)
Implement automated alerting on performance regressions
Consider future growth when sizing indexes and partitions
Balance optimization effort against business value
Regularly validate statistics and rebuild indexes

Reference Materials

See reference files for detailed procedures:

references/optimization-techniques.md - Explain Plan reading guide, persistence decision matrix, partitioning strategies, memory management
references/diagnostic-procedures.md - Advanced diagnostic procedures: PlanViz trace generation/analysis, MDS query diagnosis for SAC live connections, HAR file network analysis, tenant memory/CPU profiling, and extracting underlying SQL from Datasphere views

Similar Skills

using-git-worktrees

169.2k

Creates isolated Git worktrees for feature branches with prioritized directory selection, gitignore safety checks, auto project setup for Node/Python/Rust/Go, and baseline verification.

superpowers

subagent-driven-development

169.2k

Executes implementation plans in current session by dispatching fresh subagents per independent task, with two-stage reviews: spec compliance then code quality.

3 files

superpowers

dispatching-parallel-agents

169.2k

Dispatches parallel agents to independently tackle 2+ tasks like separate test failures or subsystems without shared state or dependencies.

superpowers

Stats

Stars12

Forks3

Last CommitFeb 16, 2026

Actions

View Source View Plugin View on GitHub View README

Performance Optimizer Skill

Overview

When to Use This Skill

Trigger this skill when you encounter:

Views or queries running slower than expected
Query timeouts or cancellations
High memory or CPU consumption
Data flow execution taking excessive time
Replication flows with poor delta performance
Need to optimize storage usage
Capacity warnings or resource contention alerts
Performance regression after schema changes
Need to benchmark performance improvements

Performance Analysis Approach

1. Identify the Bottleneck

Start by pinpointing where performance degradation occurs:

View Performance Issues:

Access the View Analyzer tool in your view's details
Check execution time trends over the last 7-30 days
Compare execution times across different consumer queries
Identify which views are called most frequently
Note views with long initialization times

Query Performance Issues:

Review query execution statistics in task logs
Use Explain Plan to understand query execution strategy
Check statement logs for query duration and resource usage
Identify queries with full table scans or inefficient joins
Monitor query frequency and timing patterns

Data Flow Performance Issues:

Check data flow execution logs for step-level timing
Monitor initial load duration vs. delta load duration
Review parallelism settings and actual utilization
Check for data quality issues causing processing overhead
Analyze memory allocation and spill events

2. Measure Current Performance

Establish baseline metrics before optimization:

Key Metrics to Capture:

Query execution time (wall-clock and CPU time)
Memory usage (peak and average)
Data volume processed (rows and bytes)
I/O operations and throughput
Number of disk accesses vs. in-memory operations
Index usage patterns
Cache hit rates
Parallelism level achieved
Storage tier distribution

Use the MCP tools to gather metrics:

execute_query: Run diagnostic queries to capture baseline stats
analyze_column_distribution: Understand data skew and selectivity
get_table_schema: Review column types and indexing
get_task_status: Monitor execution metrics

3. Optimize

Apply targeted optimizations based on findings (see following sections).

4. Validate

Re-measure performance after changes:

Compare new metrics against baseline
Verify performance improvement meets targets
Monitor for 7-14 days to ensure consistency
Check for negative side effects on other queries
Document changes for future reference

View Analyzer

The View Analyzer provides insights into view execution performance and resource consumption.

Interpreting View Analyzer Results

Execution Time Breakdown:

Initialization Time: Time to prepare execution plan and allocate resources. High values indicate complex view logic or many dependent views.
Execution Time: Actual query processing time. Dominated by data retrieval and transformation.
Fetch Time: Time to return results to consumer. Long fetch times indicate large result sets or network latency.

Resource Consumption:

Peak Memory: Maximum memory used during execution. Watch for memory spills to disk.
CPU Time: Total CPU cycles consumed. High CPU indicates data processing-heavy operations.
I/O Throughput: Data read from storage. Compare with data returned to find filtering efficiency.

Identifying Expensive Operations

High Memory Consumers:

Joins with large intermediate result sets
Aggregations without pre-filtering
Complex subqueries creating multiple temporary tables
Non-persisted views with repeated calculations

CPU-Intensive Operations:

Complex calculations and expressions
Multiple levels of nested aggregations
String operations on large text fields
Date/time calculations without optimization

I/O-Heavy Operations:

Full table scans on large tables
Inefficient join strategies
Missing indexes on join keys
Reading unnecessary columns

Explain Plan Analysis

The Explain Plan shows exactly how Datasphere executes your query, revealing inefficiencies at each step.

Reading Execution Plans

Plan Structure:

Node Type: SCAN (table access), JOIN, AGGREGATE, SORT, FILTER, etc.
Input Rows: Rows input to this node
Output Rows: Rows produced by this node
Selectivity: Output/Input ratio. Values < 0.1 indicate effective filtering.
Estimated Time: Predicted execution time
Actual Time: Measured execution time (if available)

Spotting Full Table Scans

Indicators:

SCAN node with no index specification
Input rows equal to table row count
High selectivity discrepancy (many input rows, few output rows)

Solutions:

Add index on WHERE clause columns
Add covering index including SELECT columns
Consider partitioning with partition-aware queries
Rewrite to improve filter pushdown

Identifying Join Inefficiencies

Red Flags:

Join producing result set larger than inputs (indicates Cartesian product)
Very high intermediate row counts before filtering
Hash joins with small build tables (should use index joins)
Join order processing large tables first

Solutions:

Reorder joins to process smallest tables first
Add indexes on join keys
Apply filters before joins (reduce probe table size)
Consider materialization of frequently-joined dimensions
Use broadcast tables for small dimension tables

Common Explain Plan Anti-Patterns

Pattern: Multiple Cascading Aggregations

AGGREGATE (GROUP BY col1, col2, col3)
  AGGREGATE (GROUP BY col1, col2)
    AGGREGATE (GROUP BY col1)
      SCAN big_table

Issue: Each aggregation processes full dataset Solution: Combine aggregations or materialize intermediate results

Pattern: Cartesian Join

JOIN (on false condition)
  SCAN table1 (1M rows)
  SCAN table2 (1M rows)

Issue: 1T row intermediate result Solution: Fix join condition, verify ON clause logic

Pattern: Late Filtering

FILTER (where price > 100)
  JOIN table1 (1M rows) with table2 (1M rows)

Issue: Join processes all rows before filtering Solution: Move WHERE clause into source tables (pushdown)

View Optimization Strategies

Persistence Strategy

When to Persist:

Views consumed by many queries (reuse calculation)
Complex views with expensive calculations
Views used in real-time dashboards (reduce latency)
High-frequency views (cache results between accesses)
Views with stable underlying data

When NOT to Persist:

Frequently updated source data (maintenance cost)
Views with specific daily snapshots
Once-per-execution views
Views with minimal consumer count

Persistence Modes:

Runtime: Results cached in memory during session (no latency, high memory)
Disk-Based: Results written to disk (lower memory, I/O cost)
Materialized: Pre-calculated, updated on schedule (consistent performance, staleness risk)

Column Pruning

Remove unnecessary columns from views to reduce data volume:

Strategy:

Audit all consuming queries
Identify columns never referenced
Remove unused columns from source
Test dependent views for impact
Document retained columns and purpose

Benefits:

Reduced memory footprint
Faster data transfer
Smaller index sizes
Improved cache effectiveness

Partition Pruning

Organize data into partitions to reduce scanned volume:

Partition Key Selection:

High-cardinality columns with clear ranges (dates, regions)
Columns frequently in WHERE clauses
Columns supporting most common filters
Avoid partitioning on low-cardinality columns

Partition Strategy:

Time-Based: Year, month, day (best for temporal data)
Range-Based: Numeric ranges (best for numeric dimensions)
Hash-Based: Distribute evenly (when no natural partition key)
List-Based: Specific values (geographic regions, business units)

Partition Pruning in Queries:

SELECT * FROM sales
WHERE year = 2024 AND month IN (1,2,3)
-- Partition pruning eliminates months outside this range

Push-Down Optimization

Move filtering and aggregation to earliest possible stage:

Examples:

-- GOOD: Filters pushed to source tables
SELECT dept, COUNT(*)
FROM employees
WHERE hire_date >= '2023-01-01'  -- Pushed down
GROUP BY dept

-- BAD: Filter applied to pre-aggregated result (wrong answer potential)
SELECT dept, COUNT(*) as emp_count
FROM employees
GROUP BY dept
HAVING CAST(MAX(hire_date) AS DATE) >= '2023-01-01'

Benefits:

Reduces intermediate result sizes
Minimizes data movement
Allows index usage
Reduces memory pressure

Query Optimization Techniques

Index Usage

Index Types:

B-Tree Index: Default, efficient for range and equality queries
Hash Index: For equality lookups only, faster than B-Tree
Bitmap Index: For low-cardinality columns, space-efficient
Covering Index: Includes all columns needed (no table access required)

Index Selection:

Profile query WHERE and JOIN clauses
Create indexes on high-selectivity columns
Use composite indexes for multi-column conditions
Add covering indexes for frequently-run queries
Monitor index usage; drop unused indexes

Query Hints for Index Usage:

SELECT /*+ INDEX(orders order_date_idx) */
  order_id, amount
FROM orders
WHERE order_date = '2024-01-15'

Join Order Optimization

Heuristic: Smallest Table First

-- GOOD: Dimension first (smaller)
SELECT *
FROM departments d
  INNER JOIN employees e ON d.dept_id = e.dept_id
WHERE d.region = 'WEST'

-- BAD: Large fact table first
SELECT *
FROM sales_fact f
  INNER JOIN date_dim d ON f.date_id = d.date_id
WHERE d.year = 2024

Broadcast Strategy: Use broadcast join for small dimensions (< 100MB):

SELECT /*+ BROADCAST(dimension) */
  f.amount, d.category
FROM fact_table f
  INNER JOIN small_dimension d ON f.dim_id = d.id

Aggregation Strategies

Pre-Aggregation Pattern: Materialize frequently-aggregated views:

-- Materialized view updated daily
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT date, product_id, SUM(amount) as total_sales, COUNT(*) as order_count
FROM sales_fact
GROUP BY date, product_id

-- Consumer query now fast
SELECT product_id, SUM(total_sales)
FROM daily_sales_summary
WHERE date >= '2024-01-01'
GROUP BY product_id

Aggregation Push-Down:

-- GOOD: Aggregate at source
SELECT dept, SUM(salary)
FROM employees
WHERE status = 'ACTIVE'
GROUP BY dept

-- BAD: Aggregate after joining to dimensions
SELECT e.dept, SUM(e.salary)
FROM employees e
  INNER JOIN departments d ON e.dept_id = d.dept_id
GROUP BY e.dept

Data Flow Performance

Parallelism Configuration

Parallelism Settings:

DOP (Degree of Parallelism): Number of parallel worker threads
Default: Auto-calculated based on CPU cores and available memory
Manual Override: Set for specific flows with special requirements

Tuning Approach:

Baseline with default parallelism
Monitor CPU utilization (target 70-85%)
Increase DOP if CPU < 50% (data not distributed well)
Decrease DOP if CPU > 90% (contention, thread overhead)
Test with representative data volume

Batch Sizes

Batch Size Impact:

Small batches (1K rows): More overhead, better interactivity
Large batches (100K+ rows): Better throughput, more memory per batch
Optimal: Typically 10K-50K rows depending on row width and memory

Configuration:

Batch Size = Available Memory / (Row Width * 2)

Memory Allocation

Memory Distribution:

Allocate 60% to data buffers
20% to working memory (joins, aggregations)
20% to system overhead and safety margin

Monitoring:

Watch for memory spill messages (indicates insufficient allocation)
Monitor peak memory usage over time
Check for memory leaks in long-running flows

Replication Flow Performance

Initial Load Optimization

Full Table Replication:

Schedule during low-activity windows
Use maximum batch size for the table size
Verify target has sufficient staging space
Monitor progress via status logs
Validate row counts match source

Selective Replication:

-- Replicate only recent data
WHERE created_date >= CURRENT_DATE - 7

Benefits: Reduced network traffic, faster completion

Delta Performance Tuning

Change Data Capture (CDC) Optimization:

Ensure source system has CDC enabled
Configure appropriate change log retention
Set delta frequency (hourly, daily) based on volume
Monitor delta processing time trends
Alert on delta lag exceeding threshold

Performance Issues and Solutions:

Long delta processing: Increase parallelism, check for data anomalies
High memory usage: Reduce batch size, increase delta frequency
Network bottleneck: Verify network bandwidth, consider compression
Target contention: Schedule deltas during off-peak, increase parallel writers

Storage Optimization

Disk vs. In-Memory Strategy

In-Memory Advantages:

Sub-millisecond query latency
Efficient for repeated access
Better for interactive dashboards

In-Memory Disadvantages:

Limited capacity
Higher cost per GB
Unsuitable for very large tables

Disk Advantages:

Unlimited capacity
Cost-effective for archival data
Suitable for infrequently-accessed tables

Disk Disadvantages:

Higher query latency (100ms+)
Suitable for batch reporting

Decision Matrix:

High-frequency, small tables → In-Memory
Large, infrequently-accessed → Disk/Object Store
Medium-sized, moderate frequency → Hybrid (hot set in-memory)

Object Store Tiering (HDLF - Hadoop Distributed File System Layer)

Tier Organization:

Hot Tier: Frequently accessed, in-memory or SSD
Warm Tier: Moderate access, disk-based
Cold Tier: Archive data, object store (S3/blob)

Tiering Policy:

Data age < 30 days → Hot tier
Data age 30-90 days → Warm tier
Data age > 90 days → Cold tier

Benefits:

Optimal cost-performance balance
Automatic promotion/demotion
Transparent to queries
Improved capacity utilization

Monitoring and Alerting

Task Logs

Key Information:

Task execution time (start, end, duration)
Rows processed (read, written, failed)
Memory usage (peak, average)
Data quality metrics (error counts)
Dependencies (parent/child tasks)

Analysis:

Track execution time trends
Identify tasks with increasing duration
Monitor failure rates
Check for resource contention patterns

Statement Logs

Query Metrics:

Query text (full SQL)
Execution time breakdown (parse, optimize, execute)
Rows returned
Memory consumed
CPU time
Table accesses and I/O

Usage:

SELECT query_text, AVG(execution_time_ms) as avg_time, COUNT(*) as frequency
FROM statement_logs
WHERE timestamp >= CURRENT_DATE - 7
GROUP BY query_text
ORDER BY frequency DESC

Capacity Dashboards

Metrics to Monitor:

Overall CPU utilization
Memory consumption by tenant/space
Storage usage by object type
Network I/O
Query queue depth
Active sessions

Alerting Thresholds:

CPU > 80%: Investigate resource contention
Memory > 85%: Add capacity or optimize
Storage > 90%: Archive or clean data
Queue depth > 10: Performance degradation
Session count increasing: Check for runaway queries

Using MCP Tools for Performance Analysis

execute_query

Use to run diagnostic queries and capture explain plans:

execute_query(query="EXPLAIN SELECT ...", explain_type="full")

Returns execution plan with estimated and actual metrics.

analyze_column_distribution

Understand data distribution for optimal indexing:

analyze_column_distribution(table="sales", column="product_id")

Returns histogram of values, identifying skew and cardinality.

get_table_schema

Review table structure, indexes, and partitioning:

get_table_schema(table="orders")

Returns columns, data types, indexes, constraints, and statistics.

get_task_status

Monitor task execution metrics:

get_task_status(task_id="replication_flow_01")

Returns execution time, rows processed, memory used, and state.

Performance Optimization Workflow

Identify Issue: Run diagnostic queries, review task logs
Measure Baseline: Capture execution time, resource usage, row counts
Analyze Root Cause: Review Explain Plan, check for anti-patterns
Create Optimization Plan: Document proposed changes and expected impact
Implement Changes: Add indexes, modify view logic, adjust persistence
Test Thoroughly: Validate correctness and measure performance
Monitor Results: Track metrics for 1-2 weeks post-deployment
Document Learnings: Record successful optimizations for future reference

Best Practices

Always establish baseline metrics before optimization
Make one change at a time to isolate impact
Test in non-production environment first
Monitor for 2+ weeks after changes (catch edge cases)
Document all optimization decisions and rationale
Regular review of slow query logs (at least weekly)
Implement automated alerting on performance regressions
Consider future growth when sizing indexes and partitions
Balance optimization effort against business value
Regularly validate statistics and rebuild indexes

Reference Materials

See reference files for detailed procedures:

references/optimization-techniques.md - Explain Plan reading guide, persistence decision matrix, partitioning strategies, memory management
references/diagnostic-procedures.md - Advanced diagnostic procedures: PlanViz trace generation/analysis, MDS query diagnosis for SAC live connections, HAR file network analysis, tenant memory/CPU profiling, and extracting underlying SQL from Datasphere views