By xberg-io
Extract text, tables, metadata, and images from 91+ document formats (PDF, Office, images, HTML, email, archives) with optional OCR for scanned documents, then chunk, embed, and keyword-enrich the output for RAG pipelines or LLM context windows.
Use when extracting from many files at once with shared config, bounded parallelism, per-file overrides, and error recovery. Covers the `batch` command, `--file-configs`, `--max-concurrent`, and output layout.
Use when splitting extracted text into chunks for LLM context windows or RAG ingestion. Covers chunk size, overlap, markdown/yaml/semantic chunkers, tokenizer-based sizing, and the standalone `chunk` command.
Use when extracting keywords (YAKE/RAKE) from documents — and, secondarily, when detecting document language or generating embeddings for RAG and search. Covers the keyword config (and its feature gating), `--detect-language`, and the standalone `embed` command with real flags.
Use when extracting tabular data from PDFs, spreadsheets, or images. Covers layout-aware table detection, table model selection, output formats (markdown / JSON cells), and known limits.
Use when extracting text from scanned PDFs, photographed pages, or images that have no embedded text layer. Covers OCR backends, language packs, force-OCR, and performance tuning.
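To make "chunk size" and "overlap" in the chunking skill above concrete, here is a minimal Python sketch of character-based sliding-window chunking. It is not Kreuzberg's implementation and does not call its CLI. The names `chunk_text`, `max_chars`, and `overlap` are hypothetical and chosen only for illustration. The real markdown, yaml, semantic, and tokenizer-based chunkers split on structure or tokens rather than raw characters.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[str]:
    """Split text into character windows of at most max_chars.

    Consecutive windows share `overlap` characters, so a sentence cut at a
    boundary still appears whole in at least one neighbouring chunk
    (as long as it is shorter than the overlap).
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")

    step = max_chars - overlap  # distance each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With `max_chars=4` and `overlap=1`, the string `"abcdefghij"` yields three chunks, each starting with the last character of the previous one. A larger overlap keeps more context across boundaries at the cost of more chunks to embed.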
Document-intelligence plugins for coding agents. Install any of the six into Claude Code, Codex CLI, Cursor, Gemini CLI, Factory Droid, GitHub Copilot CLI, or opencode.
| Plugin | Value Proposition | Status |
|---|---|---|
| kreuzberg | Local document extraction from 91+ formats (PDF, Office, images with OCR, HTML, email, archives, academic) | Stable — v0.2.2 |
| kreuzcrawl | Web crawling and scraping with HTML→Markdown and headless-Chrome fallback | Stable — v0.2.2 |
| xberg-enterprise | Managed extraction via api.xberg.io | Skills-only — MCP server in a later release |
| html-to-markdown | Fast, lossless HTML→Markdown with structured metadata and tables | Stable — v0.2.2 |
| liter-llm | Universal LLM API client for 143 providers (chat, streaming, tools, embeddings) | Stable — v0.2.2 |
| tree-sitter-language-pack | Parse and extract code intelligence from 300+ languages | Stable — v0.2.2 |
Claude Code, once approved by the marketplace:
/plugin install kreuzberg@claude-community
/plugin install kreuzcrawl@claude-community
/plugin install xberg-enterprise@claude-community
/plugin install html-to-markdown@claude-community
/plugin install liter-llm@claude-community
/plugin install tree-sitter-language-pack@claude-community
Self-host (works today):
/plugin marketplace add xberg-io/plugins
/plugin install kreuzberg@kreuzberg
/plugin install kreuzcrawl@kreuzberg
/plugin install xberg-enterprise@kreuzberg
/plugin install html-to-markdown@kreuzberg
/plugin install liter-llm@kreuzberg
/plugin install tree-sitter-language-pack@kreuzberg
Pending review for official Claude marketplace.
The Codex CLI marketplace is not yet open for third-party submissions. Use the self-hosted install:
/plugins add https://github.com/xberg-io/plugins
Then search for the plugin you want — e.g. kreuzberg, kreuzcrawl, html-to-markdown, liter-llm, tree-sitter-language-pack, or xberg-enterprise — and select "Install Plugin".
Cursor, self-host install only:
Settings → Plugins → Add from URL → https://github.com/xberg-io/plugins. Select the plugin(s) you want.
Gemini CLI, self-host install:
gemini extensions install https://github.com/xberg-io/plugins
Factory Droid, self-host install:
droid plugin marketplace add https://github.com/xberg-io/plugins
droid plugin install kreuzberg@kreuzberg
droid plugin install kreuzcrawl@kreuzberg
droid plugin install xberg-enterprise@kreuzberg
droid plugin install html-to-markdown@kreuzberg
droid plugin install liter-llm@kreuzberg
droid plugin install tree-sitter-language-pack@kreuzberg
Pending review for official Factory Droid marketplace.
GitHub Copilot CLI, self-host install:
copilot plugin marketplace add https://github.com/xberg-io/plugins
copilot plugin install kreuzberg@kreuzberg
copilot plugin install kreuzcrawl@kreuzberg
copilot plugin install xberg-enterprise@kreuzberg
copilot plugin install html-to-markdown@kreuzberg
copilot plugin install liter-llm@kreuzberg
copilot plugin install tree-sitter-language-pack@kreuzberg
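The Factory Droid and GitHub Copilot CLI installs above follow the same pattern: one `marketplace add`, then one `install` per plugin. If you only want a subset, a small helper can print the exact lines to paste. This is a sketch, not part of any plugin: `install_commands` is a hypothetical name, and it only builds strings from the commands shown above without running them.

```python
PLUGINS = [
    "kreuzberg",
    "kreuzcrawl",
    "xberg-enterprise",
    "html-to-markdown",
    "liter-llm",
    "tree-sitter-language-pack",
]
REPO_URL = "https://github.com/xberg-io/plugins"


def install_commands(cli, plugins=PLUGINS, marketplace="kreuzberg"):
    """Build the self-host install lines for a CLI such as "droid" or "copilot".

    Only the command shapes listed in this README are reproduced; nothing
    is executed.
    """
    commands = [f"{cli} plugin marketplace add {REPO_URL}"]
    commands += [f"{cli} plugin install {name}@{marketplace}" for name in plugins]
    return commands
```

For example, `install_commands("copilot", ["kreuzberg", "liter-llm"])` returns the marketplace line followed by two install lines.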
Add the published packages to opencode.json:
{
  "$schema": "https://opencode.ai/config.json",
  "plugin": [
    "@kreuzberg/opencode-kreuzberg",
    "@kreuzberg/opencode-kreuzcrawl",
    "@kreuzberg/opencode-html-to-markdown",
    "@kreuzberg/opencode-tree-sitter-language-pack"
  ]
}
liter-llm and xberg-enterprise are not yet published as opencode packages.
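If you already have an opencode.json with other plugins, the entries above need to be merged into the existing `plugin` array rather than pasted over it. A minimal Python sketch of that merge follows. It assumes the file is plain JSON without comments, and `add_plugins` and `update_config_file` are hypothetical helper names, not part of opencode or these plugins.

```python
import json
from pathlib import Path

# Package names copied from the opencode.json example above.
KREUZBERG_PLUGINS = [
    "@kreuzberg/opencode-kreuzberg",
    "@kreuzberg/opencode-kreuzcrawl",
    "@kreuzberg/opencode-html-to-markdown",
    "@kreuzberg/opencode-tree-sitter-language-pack",
]


def add_plugins(config: dict, plugins: list[str]) -> dict:
    """Return a copy of config whose "plugin" list also contains plugins.

    Existing entries keep their order, and nothing is added twice.
    """
    merged = dict(config)
    merged.setdefault("$schema", "https://opencode.ai/config.json")
    existing = list(merged.get("plugin", []))
    for name in plugins:
        if name not in existing:
            existing.append(name)
    merged["plugin"] = existing
    return merged


def update_config_file(path: Path, plugins: list[str]) -> None:
    """Read path if it exists, merge the plugins in, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_plugins(config, plugins), indent=2) + "\n")
```

Calling `update_config_file(Path("opencode.json"), KREUZBERG_PLUGINS)` twice leaves the file unchanged the second time, because the merge skips names that are already present.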
Each plugin shells out to a real CLI. Install whichever you use:
npx claudepluginhub xberg-io/plugins --plugin kreuzberg
html-to-markdown: Fast, lossless HTML→Markdown conversion with structured metadata, tables, and document-structure extraction.
liter-llm: Universal LLM API client for 143 providers, covering chat, streaming, tools, embeddings, search, and OCR, plus an OpenAI-compatible proxy and an MCP server.
tree-sitter-language-pack: Parse and extract code intelligence from 300+ programming languages with tree-sitter, including structure, imports, symbols, and syntax-aware chunking.
xberg-enterprise: Managed Kreuzberg document intelligence on api.xberg.io, with async extraction with OCR, URL crawling, presigned uploads, document versioning and diffing, signed webhooks, and usage tracking.