Slash Command

/init-project

Initialize Docling extraction project structure with directories, config, and scripts

npx claudepluginhub orbruno/docling-ccplugin

Invocation

How this command is triggered — by the user, by Claude, or both

Slash command

/docling-toolkit:init-project [project-name] [--path /custom/path]

Model invocable

No pre-commands

Tool Access

This command is limited to the following tools:

Read, Write, Bash, AskUserQuestion

Context Preview

The summary Claude sees in its command listing — used to decide when to auto-load this command

# Initialize Docling Project

Create a complete project structure for Docling document extraction work.

## Task

Initialize a new project directory for document extraction using Docling. The user has requested: "$ARGUMENTS"

## Steps

1. **Parse arguments**:
   - Extract project name from arguments (first positional argument or default to "docling-extraction")
   - Check for `--path` flag to specify custom directory path
   - Validate project name (alphanumeric, hyphens, underscores only)

2. **Determine project location**:
   - If `--path` provided, use that directory
   - Otherwise, crea...

Command Content

230 lines · ~1.4k tokens

Stats

Language: Python

Stars: 1

Forks: 1

Maintenance: Good

Last Commit: Feb 4, 2026

Initialize Docling Project

Create a complete project structure for Docling document extraction work.

Task

Initialize a new project directory for document extraction using Docling. The user has requested: "$ARGUMENTS"

Steps

Parse arguments:
- Extract project name from arguments (first positional argument or default to "docling-extraction")
- Check for --path flag to specify custom directory path
- Validate project name (alphanumeric, hyphens, underscores only)

Determine project location:
- If --path provided, use that directory
- Otherwise, create in current directory: ./<project-name>/
- Check if directory already exists (warn user if it does)
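The two steps above can be sketched in Python as follows. This is only an illustration of the parsing rules, not the plugin's implementation: the helper name `parse_init_args` is hypothetical, and treating the `--path` value as the project directory itself (rather than as a parent directory) is an assumption drawn from the wording "use that directory".

```python
import re
import shlex
from pathlib import Path

DEFAULT_NAME = "docling-extraction"
# Alphanumeric characters, hyphens, and underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def parse_init_args(raw: str, cwd: Path) -> tuple[str, Path]:
    """Return (project_name, project_path) for a raw argument string.

    Hypothetical helper: the name and signature are illustrative only.
    """
    tokens = shlex.split(raw)
    name = DEFAULT_NAME
    name_seen = False
    custom_path = None
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "--path":
            if i + 1 >= len(tokens):
                raise ValueError("--path requires a directory argument")
            custom_path = Path(tokens[i + 1]).expanduser()
            i += 2
        elif not name_seen:
            name = token
            name_seen = True
            i += 1
        else:
            raise ValueError(f"unexpected argument: {token}")
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid project name: {name!r}")
    # Assumption: --path names the project directory itself.
    project_path = custom_path if custom_path is not None else cwd / name
    return name, project_path
```

A caller would still check `project_path.exists()` afterwards and warn the user before writing into an existing directory.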

Create directory structure: Use the Bash tool to create the folders:

mkdir -p <project-path>/{data/raw,data/processed,extracts,scripts,logs,config}

Directory structure:

<project-name>/
├── README.md
├── config/
│   └── docling-config.yaml
├── data/
│   ├── raw/              # Original PDFs/HTML
│   └── processed/        # Cleaned/organized
├── extracts/             # Docling JSONL output
├── scripts/              # Processing scripts
├── logs/                 # Processing logs
└── .env.example          # Environment variables template
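The `mkdir -p` one-liner above relies on Bash brace expansion, so it may not behave the same in other shells. As a shell-independent sketch, the same layout could be created with `pathlib`; the function name `create_layout` is hypothetical, and adding a `.gitkeep` file to every directory follows the Notes section of this command rather than anything the `mkdir` call does.

```python
from pathlib import Path

# Subdirectories taken from the mkdir command in this step.
SUBDIRS = (
    "data/raw",
    "data/processed",
    "extracts",
    "scripts",
    "logs",
    "config",
)


def create_layout(project_path: Path) -> list[Path]:
    """Create the project subdirectories and return them in order."""
    created = []
    for relative in SUBDIRS:
        directory = project_path / relative
        # parents=True mirrors `mkdir -p`; exist_ok avoids errors on reruns.
        directory.mkdir(parents=True, exist_ok=True)
        # An empty .gitkeep keeps otherwise empty directories under version control.
        (directory / ".gitkeep").touch(exist_ok=True)
        created.append(directory)
    return created
```

Files such as `README.md`, `config/docling-config.yaml`, and `.env.example` are written in the later steps, so this sketch only covers the directories.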

Create README.md: Generate comprehensive README with:
- Project description
- Directory structure explanation
- Workflow steps:
  1. Place documents in data/raw/
  2. Run processing script
  3. Validate extracts
  4. Use extracts with downstream tools (BAML, etc.)
- Example commands
- Integration tips (BAML toolkit, vector databases)

Create config/docling-config.yaml: Generate configuration file with:

# Docling Configuration
chunker_type: hybrid  # or: hierarchical
export_mode: doc_chunks
use_granite_model: false  # Set true for scanned PDFs

# Metadata fields to extract
metadata_fields:
  - page_number
  - section_title
  - doc_items
  - origin

# Output configuration
output_format: jsonl
output_directory: ./extracts

# Processing options
batch_size: 10  # Process N documents at a time
parallel_workers: 4  # Number of parallel workers
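A processing script that reads this file would probably want to validate it before starting a batch. The sketch below mirrors the keys and defaults from the template in a dataclass; the names `ExtractionConfig` and `config_from_mapping` are hypothetical, and the keys are this project's own convention from the template above, not something asserted to be Docling's API. Turning the YAML text into a dictionary is left to a YAML parser (for example PyYAML's `yaml.safe_load`), which is not part of the standard library.

```python
from dataclasses import dataclass, field, fields

ALLOWED_CHUNKERS = {"hybrid", "hierarchical"}


def _default_metadata_fields() -> list[str]:
    return ["page_number", "section_title", "doc_items", "origin"]


@dataclass
class ExtractionConfig:
    """Defaults copied from the docling-config.yaml template."""

    chunker_type: str = "hybrid"
    export_mode: str = "doc_chunks"
    use_granite_model: bool = False
    metadata_fields: list[str] = field(default_factory=_default_metadata_fields)
    output_format: str = "jsonl"
    output_directory: str = "./extracts"
    batch_size: int = 10
    parallel_workers: int = 4

    def __post_init__(self) -> None:
        if self.chunker_type not in ALLOWED_CHUNKERS:
            raise ValueError(f"unsupported chunker_type: {self.chunker_type!r}")
        if self.batch_size < 1 or self.parallel_workers < 1:
            raise ValueError("batch_size and parallel_workers must be >= 1")


def config_from_mapping(raw: dict) -> ExtractionConfig:
    """Build a config from an already-parsed mapping, rejecting unknown keys."""
    known = {f.name for f in fields(ExtractionConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return ExtractionConfig(**raw)
```

Rejecting unknown keys is a design choice in this sketch: it surfaces typos such as `chunker_typ` early instead of silently falling back to the default.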

Create .env.example: Generate environment template:

# Docling Cache Directory (optional)
# DOCLING_CACHE_DIR=$HOME/.cache/docling

# Logging Level
# DOCLING_LOG_LEVEL=INFO

# API Keys (if using BAML or other tools)
# GOOGLE_API_KEY=your-key-here
# ANTHROPIC_API_KEY=your-key-here
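Every variable in this template is commented out, so copying it to `.env` has no effect until a line is uncommented. To illustrate how such a file is typically read, here is a minimal standard-library sketch; `parse_env_text` is a hypothetical helper, it does not expand references such as `$HOME`, and a real project might prefer an established loader instead.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # Not a KEY=VALUE line; skip it.
        # Drop optional surrounding quotes; no variable expansion is done.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# The unmodified template yields no variables at all.
template = "# Logging Level\n# DOCLING_LOG_LEVEL=INFO\n"
print(parse_env_text(template))
```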

Generate placeholder scripts:
- Create scripts/.gitkeep to preserve directory
- Add a note in the README suggesting /docling-toolkit:scaffold-processor to generate scripts

Create .gitignore (optional):

# Extracted data
extracts/*.jsonl
logs/*.log

# Environment
.env

# Python
__pycache__/
*.py[cod]
.venv/

# OS
.DS_Store

Display success message:
- Show created directory structure (use tree or ls -R)
- Explain next steps:
  1. cd <project-name>
  2. Place documents in data/raw/
  3. Run /docling-toolkit:scaffold-processor to create processing script
  4. Process documents
  5. Validate extracts with /docling-toolkit:validate-extracts

Offer to create initial scripts (optional):
- Ask user: "Would you like me to generate a processing script now?"
- If yes, automatically run /docling-toolkit:scaffold-processor process_documents
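The success-message step suggests `tree` or `ls -R` for showing the created structure. Since `tree` is not installed on every system, a small fallback can render a similar view; the sketch below is an assumption about how one might do that, with `render_tree` as a hypothetical helper name.

```python
from pathlib import Path


def render_tree(root: Path, prefix: str = "") -> list[str]:
    """Return tree-style lines for everything under root (directories first)."""
    lines: list[str] = []
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name))
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        connector = "└── " if is_last else "├── "
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{prefix}{connector}{entry.name}{suffix}")
        if entry.is_dir():
            # Keep the vertical guide only while siblings remain below.
            child_prefix = prefix + ("    " if is_last else "│   ")
            lines.extend(render_tree(entry, child_prefix))
    return lines
```

Calling `print("\n".join(render_tree(Path("my-project"))))` would list the layout, including hidden files such as `.gitkeep` and `.env.example`.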

README Template

# {Project Name}

Document extraction project using Docling.

## Directory Structure

- `data/raw/` - Place original PDFs and HTML files here
- `data/processed/` - Cleaned or organized documents
- `extracts/` - Docling output (JSONL format)
- `scripts/` - Processing scripts
- `logs/` - Processing logs
- `config/` - Configuration files

## Workflow

### 1. Add Documents

Place your PDF or HTML documents in `data/raw/`:

```bash
cp /path/to/documents/*.pdf data/raw/
```

### 2. Generate Processing Script

```bash
/docling-toolkit:scaffold-processor process_documents
```

### 3. Process Documents

```bash
uv run python scripts/process_documents.py \
  --input-dir data/raw \
  --output-file extracts/output.jsonl
```

### 4. Validate Extracts

```bash
/docling-toolkit:validate-extracts extracts/output.jsonl
```

### 5. Use Extracts Downstream

#### With BAML Toolkit

```bash
/baml-toolkit:batch-gemini GenerateProfile \
  extracts/output.jsonl \
  --output profiles.json
```

#### With Custom Processing

```python
import json

with open("extracts/output.jsonl", encoding="utf-8") as f:
    for line in f:
        if not line.strip():
            continue  # Skip blank lines, which json.loads would reject
        extract = json.loads(line)
        # Process extract
```

## Configuration

Edit `config/docling-config.yaml` to customize:

- Chunking strategy (hybrid vs hierarchical)
- Granite model for scanned PDFs
- Metadata fields
- Output format

## Tips

- For scanned PDFs: use the `--granite` flag
- For large batches: process in parallel (see scripts)
- For debugging: check the `logs/` directory

## Notes

- Create all directories even if empty (use `.gitkeep` files)
- Make README comprehensive but focused
- Configuration should have sensible defaults
- Structure should match Orlando's project organization preferences (documented in his context)

## Success Criteria

- All directories created
- README with complete workflow
- Configuration file with defaults
- User understands next steps
- Project is ready for document processing
