npx claudepluginhub jdrodriguez/legal-toolkit --plugin legal-toolkitWant just this skill?
Then install: npx claudepluginhub u/[userId]/[slug]
Extract named entities from legal documents and map relationships between them using NLP. Processes PDF, DOCX, TXT, and MD files or directories of documents. Uses spaCy for named entity recognition to identify people, organizations, dates, monetary amounts, jurisdictions, and legal references, then builds interactive relationship graphs showing how entities connect across documents. Use when: (1) a user provides legal documents and asks to identify entities or map relationships, (2) a user says 'find all entities', 'map relationships', 'who is mentioned in these documents', 'extract names and dates', or 'analyze entity connections', (3) any legal analysis task requiring entity extraction, relationship mapping, or cross-document entity tracking, (4) a user needs to understand which people, organizations, and dates appear across a set of legal documents.
This skill uses the workspace's default tool permissions.
scripts/check_dependencies.pyscripts/map_entities.pyEntity & Relationship Mapper
Extract named entities from legal documents and map relationships using spaCy NLP and network analysis.
Supported formats: .pdf, .docx, .txt, .md
Input modes: single file OR a directory containing multiple files
NLP model: en_core_web_sm (default). For better accuracy on complex legal text, upgrade to en_core_web_trf by running python3 -m spacy download en_core_web_trf and passing --model en_core_web_trf.
Skill Directory
Scripts are in the scripts/ subdirectory of this skill's directory.
Resolve SKILL_DIR as the absolute path of this SKILL.md file's parent directory. Use SKILL_DIR in all script paths below.
Process
Step 1: Validate Input
- Confirm the user provided a path. If not, ask: "Please provide the path to the document or folder you want to analyze for entities."
- Determine if the path is a file or directory:
- File: verify it exists and has a supported extension (
.pdf,.docx,.txt,.md) - Directory: verify it exists; the script will find all supported files inside it
- File: verify it exists and has a supported extension (
- If a directory, tell the user which files were found before proceeding.
Step 2: Check Dependencies
python3 "$SKILL_DIR/scripts/check_dependencies.py"
- Exit 0: all good. Exit 1: packages were installed (proceed). Exit 2: failed (report to user).
- Note: On first run, the spaCy model
en_core_web_sm(~12MB) will be downloaded automatically.
Step 3: Run Entity Extraction
Determine the output directory:
- Single file:
OUTPUT_DIR="{parent_dir}/{filename_without_ext}_entities" - Directory:
OUTPUT_DIR="{directory_path}/_entity_analysis"
mkdir -p "$OUTPUT_DIR"
python3 "$SKILL_DIR/scripts/map_entities.py" \
--input "<file_or_directory_path>" \
--output-dir "$OUTPUT_DIR" \
[--model en_core_web_sm] \
[--min-mentions 2]
The script prints JSON to stdout with entity extraction results. Read this output to present findings.
Step 4: Present Entity Summary
Read $OUTPUT_DIR/entity_summary.txt and present key findings:
- Total entities found with breakdown by type (PERSON, ORG, DATE, MONEY, GPE, LAW, etc.)
- Most frequent entities - top 10 across all types
- Key people - most mentioned PERSON entities
- Key organizations - most mentioned ORG entities
- Date range - earliest and latest DATE entities detected
- Financial mentions - summary of MONEY entities
Step 5: Highlight Key Relationships
From the relationship graph data:
- Most connected entities - entities with the highest centrality scores
- Entity clusters - groups of entities that frequently co-occur
- Cross-document entities - entities that appear in multiple documents (if multi-file)
Step 6: Direct User to Outputs
List all generated files with descriptions:
entity_database.xlsx- Complete entity database with all mentionsrelationship_graph.html- Interactive network graph (open in browser)cross_reference_matrix.xlsx- Which entities appear in which documentstimeline_dates.xlsx- All date entities with surrounding contextfinancial_mentions.xlsx- All monetary amounts with contextentity_summary.txt- Human-readable summaryentities.json- Structured entity data for programmatic use
Tell the user: "Open relationship_graph.html in your browser to explore the interactive entity network. Nodes are colored by entity type."
Step 7: Offer AI Analysis
Ask: "Would you like me to analyze these entity relationships and their potential legal significance?"
If yes:
- Read
entities.jsonfor the full entity data - Provide analysis of:
- Key parties and their apparent roles
- Significant dates and their potential relevance
- Financial amounts and patterns
- Organizational relationships
- Cross-document patterns suggesting connections between matters
- Offer to generate a formal entity analysis report (.docx) using the
docxskill
Error Handling
- Path not found: Ask user to verify the path
- Unsupported format: Supported types are
.pdf,.docx,.txt,.md - Empty directory: No supported files found -- ask user to check the folder
- Empty extraction: File may be scanned/image-only. Delegate OCR to a subagent: launch an Agent (
subagent_type: "general-purpose") with prompt: "Run/legal-toolkit:extract-texton{file_path}and write the extracted text to{parent_dir}/{filename}_ocr.txt." Re-run entity extraction on the OCR output. - spaCy model not found: Run
python3 -m spacy download en_core_web_sm - Low entity count: Document may not contain recognizable entities; suggest reviewing raw text
- Script not found: Verify the skill is installed (
ls $SKILL_DIR/scripts/)
Similar Skills
You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation.