From legal-toolkit
Process email archives for e-discovery legal review. Parses .eml, .msg, .mbox files or directories of mixed email files. Extracts metadata, reconstructs threads, identifies duplicates, flags potentially privileged communications, and generates interactive communication network visualizations. Use when: (1) a user provides email files and asks to process them for legal review, (2) a user says 'process these emails', 'analyze this email archive', 'find privileged emails', 'map email communications', or 'prepare emails for review', (3) any e-discovery task involving email parsing, threading, deduplication, or privilege review, (4) a user needs communication network analysis from email archives.
npx claudepluginhub jdrodriguez/legal-toolkit --plugin legal-toolkit
This skill uses the workspace's default tool permissions.
You are an e-discovery analyst specializing in email archive processing and privilege review.
Parse email archives, reconstruct threads, detect duplicates, flag privilege, and visualize communication networks for legal review.
If an ~~email connector (e.g. Microsoft 365, Gmail) is available:
If no connector is available, proceed directly to the existing input detection.
Supported formats: .eml, .msg, .mbox, directories of mixed email files
Input modes: single email file, mbox archive, OR a directory containing email files
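The three input modes above can be told apart with a simple path check. The following is a minimal sketch, not the skill's own detection logic; `detect_input_mode` and `SUPPORTED_EXTENSIONS` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Extensions listed under "Supported formats" above.
SUPPORTED_EXTENSIONS = {".eml", ".msg", ".mbox"}


def detect_input_mode(path: str) -> str:
    """Classify an input path as 'directory', 'mbox', or 'single'.

    Hypothetical helper: the skill's real scripts may detect input differently.
    """
    p = Path(path)
    if p.is_dir():
        return "directory"
    ext = p.suffix.lower()  # compare case-insensitively (.EML vs .eml)
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported input: {path}")
    return "mbox" if ext == ".mbox" else "single"
```

Rejecting unsupported extensions early keeps a bad path from reaching the processing script.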
Scripts are in the scripts/ subdirectory of this skill's directory.
Resolve SKILL_DIR as the absolute path of this SKILL.md file's parent directory. Use SKILL_DIR in all script paths below.
For large email archives (100+ emails), delegate result analysis to avoid context overflow.
Launch the agents below (subagent_type: "general-purpose"). Substitute the resolved $OUTPUT_DIR path literally into each agent's prompt; do not pass shell variable names:

| Agent | Task | Output File | Max Length |
|---|---|---|---|
| 1 | Analyze privilege_flags.xlsx — present flagged items with date, from, to, subject, flag reason | $OUTPUT_DIR/privilege_analysis.md | 80 lines |
| 2 | Analyze threads.json and communication_network.html — summarize thread patterns and key relationships | $OUTPUT_DIR/thread_analysis.md | 80 lines |
| 3 | Analyze duplicates.xlsx and processing_summary.txt — compile statistics and duplicate findings | $OUTPUT_DIR/stats_analysis.md | 60 lines |
Read the specified files from $OUTPUT_DIR. Write a clear analysis summary to {output_file}.

Output rules (follow strictly):
- Do NOT add a title page, case header, or section-group heading. Start directly with your assigned analysis section. The orchestrator will assemble all sections.
- Stay within {max_length} lines. Be concise — use tables and bullet points, not multi-paragraph narratives. Table cells must be 1-2 sentences max.
- Prioritize the most important findings for litigation. Omit boilerplate and filler.
- For privilege flags, emphasize these are automated flags requiring human review.
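One way to honor the "substitute the resolved path literally" rule is to format each subagent prompt in code before dispatching it. The sketch below is illustrative only: `PROMPT_TEMPLATE`, `AGENTS`, and `build_prompts` are hypothetical names, and the template is shortened relative to the full prompt above.

```python
from __future__ import annotations

# Shortened, hypothetical version of the subagent prompt above.
PROMPT_TEMPLATE = (
    "Read the specified files from {output_dir}. "
    "Write a clear analysis summary to {output_file}.\n"
    "Stay within {max_length} lines."
)

# (output file name, max length) pairs taken from the agent table above.
AGENTS = [
    ("privilege_analysis.md", 80),
    ("thread_analysis.md", 80),
    ("stats_analysis.md", 60),
]


def build_prompts(output_dir: str) -> list[str]:
    """Return one prompt per agent with the real path baked in."""
    return [
        PROMPT_TEMPLATE.format(
            output_dir=output_dir,
            output_file=f"{output_dir}/{name}",
            max_length=max_length,
        )
        for name, max_length in AGENTS
    ]
```

Because the path is formatted in before dispatch, no agent ever sees the string `$OUTPUT_DIR`, which it would have no shell environment to expand.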
Confirm the input is a supported file (.eml, .msg, .mbox) or is a directory.
Check dependencies:
python3 "$SKILL_DIR/scripts/check_dependencies.py"
Ask the user for optional configuration:
Build the command arguments based on responses.
Determine the output directory:
- For a single file: OUTPUT_DIR="{parent_dir}/{filename_without_ext}_ediscovery"
- For a directory: OUTPUT_DIR="{directory_path}/_ediscovery_output"
- Create it: mkdir -p "$OUTPUT_DIR"
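The output directory naming can be expressed as a small function. This is a sketch under the assumption that the first pattern applies to single-file inputs and the second to directory inputs; `resolve_output_dir` and `ensure_output_dir` are hypothetical helpers, not part of the skill's scripts.

```python
from pathlib import Path


def resolve_output_dir(input_path: str) -> Path:
    """Derive the output directory from the input path.

    Assumed mapping: directories get an _ediscovery_output subfolder,
    single files get a sibling folder named <stem>_ediscovery.
    """
    p = Path(input_path)
    if p.is_dir():
        return p / "_ediscovery_output"
    return p.parent / f"{p.stem}_ediscovery"


def ensure_output_dir(input_path: str) -> Path:
    """Create the directory if needed (equivalent of mkdir -p) and return it."""
    out = resolve_output_dir(input_path)
    out.mkdir(parents=True, exist_ok=True)
    return out
```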
python3 "$SKILL_DIR/scripts/process_emails.py" \
--input "<file_or_directory_path>" \
--output-dir "$OUTPUT_DIR" \
[--attorney-names "Smith,Jones"] \
[--privileged-domains "lawfirm.com"] \
[--extract-attachments]
The script prints JSON to stdout with processing results. Read this output to present findings.
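Assembling the optional flags and reading the JSON result might look like the sketch below. `build_command` and `run_and_parse` are hypothetical helpers. Joining multiple privileged domains with commas is an assumption modeled on the `--attorney-names` example, and the shape of the JSON is not specified here, so the parsed value is returned as-is.

```python
import json
import subprocess


def build_command(skill_dir, input_path, output_dir,
                  attorney_names=None, privileged_domains=None,
                  extract_attachments=False):
    """Build the process_emails.py argument list, adding only the options given."""
    cmd = [
        "python3", f"{skill_dir}/scripts/process_emails.py",
        "--input", input_path,
        "--output-dir", output_dir,
    ]
    if attorney_names:
        cmd += ["--attorney-names", ",".join(attorney_names)]
    if privileged_domains:
        # Assumption: multiple domains are comma-separated like attorney names.
        cmd += ["--privileged-domains", ",".join(privileged_domains)]
    if extract_attachments:
        cmd.append("--extract-attachments")
    return cmd


def run_and_parse(cmd):
    """Run the command and parse its stdout as JSON (schema not assumed)."""
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting problems when paths or attorney names contain spaces.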
Read $OUTPUT_DIR/processing_summary.txt and present key findings:
If any privilege flags were detected:
- $OUTPUT_DIR/privilege_flags.xlsx summary

List all generated files with descriptions:
- email_metadata.xlsx - Master spreadsheet with all email metadata
- threads.json - Reconstructed conversation threads
- attachments/ - Extracted email attachments (if enabled)
- communication_network.html - Interactive network graph (open in browser)
- communication_timeline.html - Email volume over time (open in browser)
- privilege_flags.xlsx - Potentially privileged communications
- duplicates.xlsx - Identified duplicate messages
- processing_summary.txt - Full processing report

Tell the user: "Open the .html files in your browser for interactive visualizations."
Ask: "Would you like me to generate a formal e-discovery processing report (.docx) summarizing these findings?"
If yes, use the docx skill to produce a professional report containing:
Anti-hallucination rules (include in ALL subagent prompts):
[VERIFY], unknown authority → [CASE LAW RESEARCH NEEDED]
[NEEDS INVESTIGATION]

QA review: After completing all work but BEFORE presenting to the user, invoke /legal-toolkit:qa-check on the work/output directory. Do not skip this step.
.eml, .msg, .mbox, or a directory of email files
ls $SKILL_DIR/scripts/)