Production-grade shell tools expert - jq, xargs, parallel, pipelines, utilities
Executes advanced shell operations using jq, xargs, parallel, and pipeline utilities for data processing.
/plugin marketplace add pluginagentmarketplace/custom-plugin-bash-shell
/plugin install custom-plugin-bash-shell@pluginagentmarketplace-bash-shell
Model: sonnet
Expert agent for shell utilities, JSON processing, parallel execution, and pipelines
| Domain | Responsibility | Scope |
|---|---|---|
| JSON | JSON processing | jq, yq |
| Parallel | Parallel execution | xargs, GNU parallel |
| Pipelines | Data pipelines | sort, uniq, cut, paste |
| Utilities | Shell utilities | date, bc, tr, tee |
| Performance | Optimization | time, /usr/bin/time |
input:
  type: object
  properties:
    operation:
      type: string
      enum: [json_process, parallel_exec, pipeline, transform]
    data:
      type: string
      description: Input data or file path
    options:
      type: object
      properties:
        parallel_jobs: { type: integer, default: 4 }
        output_format: { type: string, enum: [json, csv, tsv] }
output:
  type: object
  properties:
    command: { type: string }
    explanation: { type: string }
    output: { type: string }
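A hypothetical request/response pair for the contract above; the file names and values are purely illustrative.
cat <<'EOF' > request.json
{
  "operation": "json_process",
  "data": "items.json",
  "options": { "parallel_jobs": 4, "output_format": "csv" }
}
EOF
# A matching response object might look like:
# { "command": "jq -r '.[] | [.id, .name] | @csv' items.json",
#   "explanation": "Convert the items array to CSV",
#   "output": "1,\"widget\"" }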
# Basic queries
jq '.' file.json # Pretty print
jq '.key' file.json # Get key
jq '.array[0]' file.json # First element
jq '.array[]' file.json # All elements
jq '.nested.key' file.json # Nested key
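A quick worked example of the basic queries, assuming a small sample file (users.json is made up here):
printf '%s\n' '{"users": [{"name": "ada", "role": "admin"}, {"name": "bob", "role": "dev"}]}' > users.json
jq '.users[0]' users.json            # First user object
jq '.users[].name' users.json        # "ada" then "bob"
jq -r '.users[].name' users.json     # -r strips the quotes: ada, bob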
# Filtering
jq '.[] | select(.status == "active")' # Filter by value
jq '.[] | select(.count > 10)' # Numeric filter
jq '.[] | select(.name | contains("test"))' # String contains
# Transformation
jq '.[] | {id, name}' # Select fields
jq '.[] | .name' # Extract single field
jq '[.[] | .value]' # Collect into array
jq 'map(.price * .quantity)' # Transform values
jq 'add' # Sum array
jq 'length' # Count elements
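Putting filtering and transformation together on a sample array (orders.json is invented for illustration):
echo '[{"sku":"a","price":2,"quantity":3},{"sku":"b","price":5,"quantity":1}]' > orders.json
jq 'map(.price * .quantity)' orders.json               # -> [6,5]
jq 'map(.price * .quantity) | add' orders.json         # -> 11
jq '[.[] | select(.price > 3)] | length' orders.json   # -> 1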
# Object manipulation
jq '.key = "new_value"' # Set value
jq '.new_key = .old_key' # Copy key
jq 'del(.unwanted)' # Delete key
jq '. + {"extra": "value"}' # Add key
# Advanced patterns
# Group by and count
jq 'group_by(.category) | map({category: .[0].category, count: length})'
# Merge two JSON objects (recursive merge with *)
jq -s '.[0] * .[1]' file1.json file2.json
# CSV output
jq -r '.[] | [.id, .name, .status] | @csv'
# From variables
jq -n --arg name "$NAME" '{"name": $name}'
jq --argjson count "$COUNT" '.count = $count'
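To make the group_by pattern concrete, here is a sketch with inline sample data:
echo '[{"category":"a","v":1},{"category":"b","v":2},{"category":"a","v":3}]' |
  jq -c 'group_by(.category) | map({category: .[0].category, count: length})'
# -> [{"category":"a","count":2},{"category":"b","count":1}]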
# Production patterns
# API response processing
curl -s https://api.example.com/items |
jq -r '.data[] | select(.active) | "\(.id)\t\(.name)"'
# Config file modification
jq '.settings.timeout = 30' config.json > config.new.json
mv config.new.json config.json
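A slightly more defensive variant of the config edit above, using a temp file and a validity check before replacing the original (a sketch, not a drop-in requirement):
tmp=$(mktemp) &&
  jq '.settings.timeout = 30' config.json > "$tmp" &&
  jq empty "$tmp" &&                  # Re-validate the result before overwriting
  mv "$tmp" config.json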
# Basic xargs
echo "a b c" | xargs echo # Split to args
find . -name "*.txt" | xargs cat # Process files
# Safe with special chars
find . -name "*.txt" -print0 | xargs -0 cat
# Limit arguments
cat files.txt | xargs -n 1 process # One at a time
cat files.txt | xargs -n 10 process # 10 at a time
# Parallel execution
cat files.txt | xargs -P 4 -n 1 process # 4 parallel
find . -name "*.jpg" | xargs -P 8 -I {} convert {} -resize 50% {}.small.jpg   # 8 parallel; output next to source
# Placeholder
cat urls.txt | xargs -I {} curl -o {}.html {}
find . -name "*.bak" | xargs -I {} mv {} /backup/
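Combining the safety and parallelism flags above; process_file is a hypothetical command standing in for your own script:
find . -name '*.txt' -print0 | xargs -0 -P 4 -I {} process_file {}   # NUL-delimited, 4 workers, spaces in names survive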
# GNU Parallel
parallel process ::: file1 file2 file3
find . -name "*.gz" | parallel gunzip
# With arguments
parallel convert {} -resize 800x600 resized_{} ::: *.jpg
parallel -j 4 curl -O {} :::: urls.txt        # :::: reads arguments from a file (avoids word splitting)
# Progress and logging
parallel --progress --joblog jobs.log process ::: *.txt
# Production patterns
# Batch API calls with rate limiting
cat ids.txt | parallel -j 5 --delay 0.2 \
'curl -s "https://api.example.com/item/{}" | jq -r ".name"'
# Parallel compression
find . -name "*.log" | parallel -j 4 gzip
# Process in chunks
cat large_list.txt | parallel -j 8 -N 100 --pipe process_batch
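For flaky downloads, GNU parallel's --retries and --joblog can be combined; urls.txt and the retry counts here are illustrative:
parallel -j 4 --retries 3 --joblog download.log 'curl -fsS -O {}' :::: urls.txt
awk 'NR > 1 && $7 != 0' download.log          # Exit status is the 7th job-log column; list failed jobs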
# Sorting
sort file.txt # Alphabetical
sort -n file.txt # Numeric
sort -k2,2 file.txt # By field 2
sort -t',' -k3,3n file.csv # CSV field 3, numeric
sort -u file.txt # Unique
sort -r file.txt # Reverse
# Unique operations
uniq file.txt # Remove adjacent dupes
uniq -c file.txt                 # Count adjacent occurrences
uniq -d file.txt # Only duplicates
sort file.txt | uniq -c | sort -rn # Count all, sorted
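A tiny worked example of the sort | uniq -c pattern:
printf '%s\n' apple banana apple cherry apple > fruit.txt
sort fruit.txt | uniq -c | sort -rn
#   3 apple
#   1 cherry
#   1 banana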
# Field extraction
cut -d',' -f1,3 file.csv # Fields 1 and 3
cut -d':' -f1 /etc/passwd # First field
cut -c1-10 file.txt # Characters 1-10
# Column operations
paste file1.txt file2.txt # Side by side
paste -d',' file1.txt file2.txt # With delimiter
column -t -s',' file.csv # Align columns
# Text transformation
tr 'a-z' 'A-Z' < file.txt # Uppercase
tr -d '\r' < dos.txt > unix.txt # Remove CR
tr -s ' ' < file.txt # Squeeze spaces
tr '[:space:]' '\n' < file.txt # Words to lines
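These building blocks combine into a classic word-frequency one-liner (file.txt is whatever text you have on hand):
tr '[:space:]' '\n' < file.txt | tr 'A-Z' 'a-z' | grep -v '^$' | sort | uniq -c | sort -rn | head -10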
# Line operations
head -n 10 file.txt # First 10 lines
tail -n 10 file.txt # Last 10 lines
tail -f log.txt # Follow file
sed -n '5,10p' file.txt # Lines 5-10
# Tee for debugging
command | tee output.txt # Save and display
command | tee >(other_command) | process # Split pipe
# Log analysis
cat access.log |
awk '{print $1}' | # Extract IPs
sort |
uniq -c |
sort -rn |
head -10 # Top 10 IPs
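A similar pipeline counts HTTP status codes; the field number assumes the common/combined log format, so adjust $9 if your format differs:
awk '{print $9}' access.log | sort | uniq -c | sort -rn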
# CSV processing
cat data.csv |
tail -n +2 | # Skip header
cut -d',' -f2,4 | # Select columns
sort -t',' -k2 -n | # Sort by column 2
awk -F',' '{sum+=$2} END {print sum}' # Sum
# JSON to CSV
cat data.json |
jq -r '.items[] | [.id, .name, .price] | @csv' |
sort -t',' -k3 -rn |
head -20
# Multi-file aggregation
cat *.log |
grep 'ERROR' |
awk '{print $3}' | # Extract error type
sort | uniq -c | sort -rn
# Production pipeline: API data processing
curl -s 'https://api.example.com/users' |
jq -r '.[] | select(.active) | [.id, .email, .created_at] | @tsv' |
sort -t$'\t' -k3 -r |
head -100 |
column -t -s$'\t'
# Date operations
date +%Y-%m-%d # ISO date
date +%s # Unix timestamp
date -d "2024-01-01" +%s # Date to timestamp
date -d "@1704067200" # Timestamp to date
date -d "yesterday" +%Y-%m-%d # Relative dates
date -d "3 days ago" +%Y-%m-%d
# Calculations
echo "scale=2; 100/3" | bc # Division with precision
echo "2^10" | bc # Power
awk 'BEGIN {print sqrt(2)}' # Square root
# Random
shuf -n 5 file.txt # Random 5 lines
shuf -i 1-100 -n 1 # Random number 1-100
# Timing
time command # Built-in timing
/usr/bin/time -v command # Detailed stats
# Hash operations
echo -n "text" | md5sum
sha256sum file.txt
# Base64
echo -n "text" | base64
echo "dGV4dA==" | base64 -d
# Production utilities
# Backup with timestamp
backup_file() {
    local file="$1"
    local timestamp
    timestamp=$(date +%Y%m%d_%H%M%S)
    cp "$file" "${file}.${timestamp}.bak"
}
# Retry with exponential backoff
retry() {
    local max_attempts="${1:-5}"
    local delay="${2:-1}"
    local -a cmd=("${@:3}")           # Keep the command as an array so quoted arguments survive
    local i
    for ((i=1; i<=max_attempts; i++)); do
        if "${cmd[@]}"; then
            return 0
        fi
        (( i < max_attempts )) && sleep $((delay * 2 ** (i - 1)))   # No sleep after the final attempt
    done
    return 1
}
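Example usage of the two helpers above (the curl endpoint is a placeholder):
backup_file config.json                               # -> config.json.20240101_120000.bak (timestamp varies)
retry 5 2 curl -fsS https://api.example.com/health    # Up to 5 attempts, waiting 2s, 4s, 8s, 16s between them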
error_patterns:
  - pattern: "jq: error"
    cause: "Invalid JSON or jq syntax"
    fix: "Validate JSON with jq '.', check jq filter syntax"
  - pattern: "xargs: argument line too long"
    cause: "Too many arguments"
    fix: "Use -n flag to limit arguments per command"
  - pattern: "parallel: command not found"
    cause: "GNU parallel not installed"
    fix: "Install with: apt install parallel"
fallback_strategy:
  - level: 1
    action: "Validate input data format"
  - level: 2
    action: "Try simpler filter/command"
  - level: 3
    action: "Process in smaller batches"
performance_rules:
  - rule: "Use LC_ALL=C for faster sorting"
    example: "LC_ALL=C sort file.txt"
  - rule: "Use parallel for CPU-bound tasks"
    example: "parallel -j $(nproc) process ::: *.txt"
  - rule: "Stream processing over loading all data"
    reason: "Memory efficient for large files"
  - rule: "Use jq -c for compact output in pipes"
    reason: "Reduces I/O overhead"
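A quick way to see the first rule's effect on your own data (big.txt is a stand-in for any large text file):
time sort big.txt > /dev/null
time LC_ALL=C sort big.txt > /dev/null    # Byte-order collation is usually noticeably faster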
# Troubleshooting helpers
jq '.' < input.json                  # Validate JSON structure
file input.txt                       # Check file type and encoding
cat -v input.txt                     # Reveal non-printing characters
jq parsing error?
├── Validate: jq '.' < input.json
├── Check: proper escaping of strings
├── Try: simpler filter first
└── Debug: with jq -e for exit codes
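A small guard that wraps the validation step above (validate_json is a name made up here):
validate_json() {
    # jq empty parses the input and prints nothing; a non-zero exit means invalid JSON
    jq empty "$1" 2>/dev/null || { echo "Invalid JSON: $1" >&2; return 1; }
}
validate_json data.json && jq '.items | length' data.json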
xargs argument issues?
├── Use: -print0 and -0 for special chars
├── Use: -I {} for placeholders
├── Check: -n limit if too many args
└── Try: parallel as alternative
| Task | jq | awk | xargs | parallel |
|---|---|---|---|---|
| JSON processing | ✓ | - | - | - |
| Field extraction | - | ✓ | - | - |
| Parallel exec | - | - | ✓ | ✓ |
| Complex parallel | - | - | - | ✓ |
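As a rule of thumb for the table above, reach for jq when the input is JSON and awk when it is delimited text (the files here are illustrative):
jq -r '.[].name' users.json            # Structured JSON input
awk -F'\t' '{print $2}' users.tsv      # Delimited text input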