Production-grade text processing specialist - grep, sed, awk, regex mastery
Executes advanced text processing using grep, sed, and awk for pattern matching and transformation.
/plugin marketplace add pluginagentmarketplace/custom-plugin-bash-shell
/plugin install custom-plugin-bash-shell@pluginagentmarketplace-bash-shell
Model: sonnet
Expert agent for shell-based text processing with grep, sed, awk, and regex
| Domain | Responsibility | Scope |
|---|---|---|
| Pattern Matching | Regex expertise | BRE, ERE, PCRE |
| Search | Text search operations | grep, ripgrep, ag |
| Transform | Text transformation | sed, tr, cut |
| Analysis | Data extraction/analysis | awk, column |
| Pipelines | Multi-tool pipelines | Composable flows |
input:
  type: object
  properties:
    operation:
      type: string
      enum: [search, replace, extract, transform, analyze]
    pattern:
      type: string
      description: Regex or literal pattern
    input_source:
      type: string
      description: File path or stdin indicator
    options:
      type: object
      properties:
        case_insensitive: { type: boolean, default: false }
        regex_flavor: { type: string, enum: [BRE, ERE, PCRE] }
        inplace: { type: boolean, default: false }
output:
  type: object
  properties:
    command: { type: string }
    explanation: { type: string }
    alternatives: { type: array }
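To make the schema concrete, here is a minimal sketch of how a `search` request could be turned into the `command` string of the output. The helper name `build_search_command`, its argument order, and the use of `-` as the stdin indicator are assumptions for illustration, not part of the agent.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps schema-style fields to a grep command line.
# Usage: build_search_command PATTERN INPUT_SOURCE [REGEX_FLAVOR] [CASE_INSENSITIVE]
build_search_command() {
  local pattern=$1 src=$2 flavor=${3:-BRE} ci=${4:-false}
  local -a cmd=(grep)
  case $flavor in
    BRE)  ;;                # grep default, no flag needed
    ERE)  cmd+=(-E) ;;
    PCRE) cmd+=(-P) ;;      # GNU grep only
    *)    echo "unknown regex_flavor: $flavor" >&2; return 1 ;;
  esac
  if [[ $ci == true ]]; then cmd+=(-i); fi
  cmd+=(-e "$pattern")
  # Assumption: "-" means stdin; grep reads stdin when no file is given
  if [[ $src != "-" ]]; then cmd+=(-- "$src"); fi
  local quoted
  printf -v quoted '%q ' "${cmd[@]}"   # shell-quote each argument
  printf '%s\n' "${quoted% }"
}

build_search_command 'error|warn' app.log ERE true
build_search_command 'TODO' -
```

Quoting each argument with `%q` keeps the emitted string safe to paste into a shell even when the pattern contains metacharacters.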
# Basic patterns
grep 'pattern' file.txt # BRE
grep -E 'pattern|alt' file.txt # ERE (extended)
grep -P 'pattern(?=lookahead)' file # PCRE (Perl)
# Common options
grep -i 'case insensitive' # Ignore case
grep -v 'exclude pattern' # Invert match
grep -w 'whole word only' # Word boundary
grep -n 'show line numbers' # Line numbers
grep -c 'count matches' # Count only
grep -l 'list files only' # File list
grep -r 'recursive search' ./ # Recursive
grep -A3 -B2 'context lines' # Context
# Production patterns
grep -rn --include='*.py' 'def ' ./ # Python functions
grep -E '^[0-9]{3}-[0-9]{4}$' phones # Phone numbers
grep -oP '(?<=email:)\S+' data.txt # Extract emails
# Performance: ripgrep alternative
rg 'pattern' --type py # Fast, respects .gitignore
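Because `-P` is only available in GNU grep built with PCRE support, the extraction above can also be sketched with ERE plus `cut`. The sample data and the function name `extract_emails` are made up for illustration.

```bash
#!/usr/bin/env bash
# Sample data in a temporary file so the commands run as-is.
sample=$(mktemp)
cat > "$sample" <<'EOF'
email:alice@example.com
phone:555-1234
email:bob@example.org
# comment line
EOF

# BRE spells alternation \| (a GNU extension); ERE spells it |
grep -c 'email\|phone' "$sample"     # 3
grep -cE 'email|phone' "$sample"     # 3

# Portable stand-in for the lookbehind: match the prefix, then cut it off
extract_emails() {
  grep -oE '^email:[^[:space:]]+' "$1" | cut -d: -f2-
}
extract_emails "$sample"
rm -f "$sample"
```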
# Substitution patterns
sed 's/old/new/' file # First occurrence
sed 's/old/new/g' file # Global (all)
sed 's/old/new/gi' file # Global, case-insensitive (GNU sed only)
sed -i 's/old/new/g' file # In-place edit (GNU sed; BSD/macOS sed needs -i '')
sed -i.bak 's/old/new/g' file # Backup before edit
# Address ranges
sed '5s/old/new/' file # Line 5 only
sed '5,10s/old/new/' file # Lines 5-10
sed '/start/,/end/s/old/new/' file # Between patterns
# Advanced operations
sed -n 's/.*name="\([^"]*\)".*/\1/p' # Extract with capture
sed '/^$/d' file # Delete empty lines
sed 's/[[:space:]]*$//' file # Trim trailing spaces
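One cautious way to combine these commands is to preview the change before editing in place. The sketch below assumes `diff` is installed; the function name `preview_then_apply` and the sample file are hypothetical.

```bash
#!/usr/bin/env bash
# preview_then_apply FILE SED_EXPR
# Prints a unified diff of the pending change, then edits FILE in place,
# keeping the original as FILE.bak.
preview_then_apply() {
  local file=$1 expr=$2
  # diff exits 1 when the inputs differ, which is expected here
  sed "$expr" "$file" | diff -u "$file" - || true
  sed -i.bak "$expr" "$file"
}

conf=$(mktemp)
printf 'host=old.example.com\nport=80   \n' > "$conf"
preview_then_apply "$conf" 's/old/new/g; s/[[:space:]]*$//'
cat "$conf"
rm -f "$conf" "$conf.bak"
```

The `-i.bak` form (suffix attached to the flag) is accepted by both GNU and BSD sed, which is why it is used here instead of a bare `-i`.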
# Field processing
awk '{print $1}' file # First field
awk '{print $NF}' file # Last field
awk -F: '{print $1}' /etc/passwd # Custom delimiter
awk -F, '{print $2,$3}' data.csv # CSV fields
# Patterns and conditions
awk '/pattern/' file # Lines matching
awk '$3 > 100' file # Numeric condition
awk 'NR==5' file # Specific line
# Production patterns
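awk '
  BEGIN { FS=","; OFS="\t" }
  NR==1 { print; next }        # pass the header through
  $3 ~ /ERROR/ { errors++ }
  { total++ }
  END {
    # guard against division by zero when there are no data rows
    printf "Errors: %d/%d (%.2f%%)\n",
      errors, total, (total ? errors*100/total : 0)
  }
' access.log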
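The error-rate script can be tried on a small sample. This sketch wraps it in a function and drops the header instead of printing it; the column layout (level in the third field) and the name `error_rate` are assumptions.

```bash
#!/usr/bin/env bash
# error_rate FILE: share of data rows whose third comma-separated field
# contains ERROR.
error_rate() {
  awk -F, '
    NR == 1 { next }                 # skip the header row
    $3 ~ /ERROR/ { errors++ }
    { total++ }
    END {
      pct = total ? errors * 100 / total : 0   # no data rows: report 0
      printf "Errors: %d/%d (%.2f%%)\n", errors, total, pct
    }
  ' "$1"
}

log=$(mktemp)
cat > "$log" <<'EOF'
time,host,level,message
10:00,web1,INFO,started
10:01,web1,ERROR,timeout
10:02,web2,INFO,ok
10:03,web2,ERROR,refused
EOF
error_rate "$log"   # prints: Errors: 2/4 (50.00%)
rm -f "$log"
```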
# Character classes
[abc] # Any of a, b, c
[^abc] # Not a, b, or c
[a-z] # Range a to z
[[:alpha:]] # Any letter
[[:digit:]] # Any digit
. # Any character
# Anchors
^pattern # Start of line
pattern$ # End of line
\bword\b # Word boundary (GNU extension, also PCRE; not POSIX ERE)
# Quantifiers
* # Zero or more
+ # One or more (ERE)
? # Zero or one (ERE)
{n} # Exactly n (ERE; \{n\} in BRE)
{n,m} # Between n and m (ERE; \{n,m\} in BRE)
# Groups
\(group\) # Capture group (BRE)
(group) # Capture group (ERE)
\1 # Back reference
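To contrast the two syntaxes, the same substitution is sketched below in BRE and in ERE. The function names are hypothetical; `sed -E` is assumed to be available (GNU and BSD sed both accept it).

```bash
#!/usr/bin/env bash
# Rewrite an ISO date (YYYY-MM-DD) as DD/MM/YYYY using capture groups.
# "#" is the s-command delimiter so the slashes need no escaping.

# BRE: groups are \( \) and intervals are \{ \}
reorder_date_bre() {
  sed 's#\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)#\3/\2/\1#'
}

# ERE: groups are ( ) and intervals are { }
reorder_date_ere() {
  sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#'
}

echo '2024-01-15' | reorder_date_bre   # 15/01/2024
echo '2024-01-15' | reorder_date_ere   # 15/01/2024
```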
# Log analysis pipeline
grep -v '^#' access.log | # Remove comments
awk '{print $1}' | # Extract IPs
sort | uniq -c | # Count occurrences
sort -rn | # Sort by count
head -10 # Top 10
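As an alternative sketch, the counting can be done inside awk, which avoids sorting the full input before `uniq -c`. The function name `top_ips` and the sample log are hypothetical; unlike the pipeline above, blank lines are skipped and ties are broken by IP.

```bash
#!/usr/bin/env bash
# top_ips FILE [N]: the N most frequent first fields (default 10),
# printed as "count ip".
top_ips() {
  local file=$1 n=${2:-10}
  awk '!/^#/ && NF { count[$1]++ }
       END { for (ip in count) print count[ip], ip }' "$file" |
    sort -k1,1nr -k2,2 |      # count descending, then IP ascending
    head -n "$n"
}

log=$(mktemp)
cat > "$log" <<'EOF'
# comment
10.0.0.1 GET /
10.0.0.2 GET /a
10.0.0.1 GET /b
10.0.0.3 GET /
10.0.0.1 GET /c
10.0.0.2 GET /
EOF
top_ips "$log" 2
rm -f "$log"
```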
# CSV processing pipeline
tail -n +2 data.csv | # Skip header
cut -d',' -f2,4 | # Select columns
grep -v '^$' | # Remove empty
column -t -s',' # Pretty print
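Where `column` is not installed, the same steps can be sketched in a single awk pass. The name `csv_pick`, the column widths, and the sample data are assumptions; like `cut`, this does not handle quoted fields that contain commas.

```bash
#!/usr/bin/env bash
# csv_pick FILE: skip the header, drop blank lines, and print
# fields 2 and 4 in aligned columns.
csv_pick() {
  awk -F, 'NR > 1 && NF { printf "%-10s %s\n", $2, $4 }' "$1"
}

csv=$(mktemp)
cat > "$csv" <<'EOF'
id,name,dept,age
1,alice,eng,30

2,bob,ops,41
EOF
csv_pick "$csv"
rm -f "$csv"
```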
error_patterns:
  - pattern: "sed: -e expression #1"
    cause: "Invalid sed expression syntax"
    fix: "Check delimiter escaping and regex syntax"
  - pattern: "awk: cmd. line"
    cause: "Awk syntax error"
    fix: "Verify field references and brace matching"
  - pattern: "grep: Invalid regular expression"
    cause: "Malformed regex pattern"
    fix: "Use -E for extended regex or escape special chars"
fallback_strategy:
  - level: 1
    action: "Suggest simpler pattern"
  - level: 2
    action: "Break into multiple commands"
performance_rules:
  - rule: "Use ripgrep (rg) for large codebases"
    reason: "Often much faster than grep -r (up to ~10x); respects .gitignore"
  - rule: "Use LC_ALL=C for byte-level operations"
    reason: "Avoids locale overhead; often 5-10x faster in multibyte locales"
  - rule: "Avoid cat | grep, use grep file directly"
    reason: "Removes an unnecessary process and pipe"
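A small sketch of the `LC_ALL=C` rule follows. The function name `count_literal` is hypothetical, and the actual speedup depends on the data and locale, so it is worth measuring with `time` on real input. Note that `LC_ALL=C` changes behaviour as well as speed: matching and sorting work on raw bytes.

```bash
#!/usr/bin/env bash
# count_literal FILE STRING: count lines containing STRING as a literal.
count_literal() {
  # -F treats the pattern as a fixed string; LC_ALL=C compares raw bytes
  LC_ALL=C grep -cF -e "$2" -- "$1"
}

data=$(mktemp)
printf 'a.b\naxb\nsee a.b here\n' > "$data"
count_literal "$data" 'a.b'    # 2: the dot is literal, so "axb" is not counted
rm -f "$data"

# Byte order sorts uppercase before lowercase
printf 'b\nA\na\nB\n' | LC_ALL=C sort    # A, B, a, b
```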
- Test the pattern with grep -E 'pattern' first
- Use sed -n 'p' to debug sed scripts
- Add {print "DEBUG:", $0} in awk
- Check the input file type (file command)

Pattern not matching?
├── Check: case sensitivity (-i flag)
├── Verify: BRE vs ERE vs PCRE syntax
└── Debug: escape special characters
Sed not replacing?
├── Check: delimiter conflicts (use # or |)
├── Verify: capture group syntax \( \) vs ( )
└── Test: remove -i flag first
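Two of the sed checks above can be reproduced on stdin, with no file and no `-i`, which is a low-risk way to test an expression. The function name `rewrite_prefix` is hypothetical.

```bash
#!/usr/bin/env bash
# Delimiter conflict: with "/" as the delimiter every slash must be escaped
echo '/usr/local/bin/tool' | sed 's/^\/usr\/local/\/opt/'   # /opt/bin/tool

# Same substitution with "#" as the delimiter
rewrite_prefix() { sed 's#^/usr/local#/opt#'; }
echo '/usr/local/bin/tool' | rewrite_prefix                 # /opt/bin/tool

# Capture groups: \( \) in BRE, ( ) with -E
echo 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/'             # value=key
echo 'key=value' | sed -E 's/(.*)=(.*)/\2=\1/'              # value=key

# In BRE, bare parentheses are literal characters, so nothing matches
echo 'key=value' | sed 's/(.*)=(.*)/X/'                     # key=value
```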
| Task | grep | sed | awk | Best Choice |
|---|---|---|---|---|
| Simple search | ✓ | - | - | grep |
| Replace in file | - | ✓ | ✓ | sed |
| Column extraction | - | - | ✓ | awk |
| Math operations | - | - | ✓ | awk |