By dnvriend
A CLI tool that loads images, sends each one with a prompt to Ollama to invoke the deepseek-ocr model, and instructs the model to return the image's content as Markdown
npx claudepluginhub dnvriend/ollama-deepseek-ocr-tool --plugin ollama-deepseek-ocr-tool
A CLI tool for batch OCR processing of document images using DeepSeek-OCR via Ollama.
Convert sequences of textbook pages, lecture slides, or scanned documents into a single, coherent markdown file suitable for note-taking applications like Obsidian.
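The flow described above (load an image, send it with a prompt to Ollama, get Markdown back) can be sketched roughly as follows. This is a minimal illustration and not the tool's actual code: the prompt text and the function names `build_ocr_request` and `ocr_page` are hypothetical, and the sketch assumes Ollama is listening on its default local port and that the request goes to its `/api/generate` endpoint.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint (assumes `ollama serve` runs with default settings).
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "deepseek-ocr"
# Hypothetical prompt: the tool's real prompt is not shown in this README.
PROMPT = "Convert this document image to markdown."


def build_ocr_request(image_bytes: bytes, prompt: str = PROMPT) -> dict:
    """Build a JSON-serializable request body for one page image."""
    return {
        "model": MODEL,
        "prompt": prompt,
        # Ollama takes images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete response instead of a stream of chunks.
        "stream": False,
    }


def ocr_page(image_bytes: bytes) -> str:
    """Send one image to a running Ollama server and return the generated text."""
    body = json.dumps(build_ocr_request(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # With streaming disabled, the generated text is in the "response" field.
        return json.loads(response.read())["response"]
```

Building the payload separately from sending it keeps the request shape easy to inspect without a running server.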
Key Features:
# 1. Install Ollama
brew install ollama
# 2. Start Ollama service
ollama serve
# 3. Pull DeepSeek-OCR model (~6GB download)
ollama pull deepseek-ocr
cd ollama-deepseek-ocr-tool
uv sync
uv tool install .
# Basic: Process all PNG files in current directory
ollama-deepseek-ocr-tool "*.png" output.md
# Process textbook chapter from iPhone photos
ollama-deepseek-ocr-tool "IMG_*.png" chapter-3-notes.md
# Process lecture slides from subdirectory
ollama-deepseek-ocr-tool "lectures/week-5/*.jpg" week-5-summary.md
# Process numbered scans in order
ollama-deepseek-ocr-tool "scan-00*.png" document.md
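# Page order matters when many images are merged into one file. The README does
# not say how the tool orders matched files, so the sketch below is only one
# plausible approach: expand the glob pattern, then sort "naturally" so that
# page 2 comes before page 10. The helper names are hypothetical.

```python
import glob
import re


def natural_key(path: str) -> list:
    """Split a path into text and integer parts so digit runs sort numerically."""
    return [
        int(token) if token.isdigit() else token.lower()
        for token in re.split(r"(\d+)", path)
    ]


def expand_pattern(pattern: str) -> list[str]:
    """Return files matching a glob pattern, in natural (human) order."""
    return sorted(glob.glob(pattern), key=natural_key)
```

# A plain lexicographic sort would place scan-10.png before scan-2.png;
# zero-padded names such as scan-001.png avoid the problem under either ordering.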
# INFO level - High-level operations
ollama-deepseek-ocr-tool "*.png" output.md -v
# DEBUG level - Detailed processing info (file sizes, word counts)
ollama-deepseek-ocr-tool "*.png" output.md -vv
# TRACE level - Full HTTP request/response logs
ollama-deepseek-ocr-tool "*.png" output.md -vvv
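# The repeated -v flag maps naturally onto logging levels. The sketch below shows
# one way this could be wired up, under these assumptions: the default level
# without -v is WARNING, and TRACE is a custom level below DEBUG (Python's
# logging module has no built-in TRACE). The function names are hypothetical.

```python
import argparse
import logging

# Custom level below DEBUG (10); an assumption, since the stdlib has no TRACE.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")


def parse_verbosity(argv: list[str]) -> int:
    """Count how many times -v appears (-v = 1, -vv = 2, -vvv = 3)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-v", action="count", default=0, dest="verbose")
    args, _unknown = parser.parse_known_args(argv)
    return args.verbose


def level_for(verbosity: int) -> int:
    """Map a -v count to a logging level, clamping anything above three."""
    levels = [logging.WARNING, logging.INFO, logging.DEBUG, TRACE]
    return levels[min(verbosity, len(levels) - 1)]
```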
# Show full help with examples and troubleshooting
ollama-deepseek-ocr-tool --help
<!-- Source: IMG_4170.png -->
[extracted text from page 1]
---
<!-- Source: IMG_4171.png -->
[extracted text from page 2]
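The output layout above (a source comment per page, with pages separated by a horizontal rule) could be assembled along these lines. This is a sketch under assumptions: the exact blank-line spacing and the function name `merge_pages` are illustrative and not taken from the tool's source.

```python
def merge_pages(pages: list[tuple[str, str]]) -> str:
    """Join (filename, extracted_text) pairs into one Markdown document."""
    sections = [
        # Each page starts with an HTML comment naming its source image.
        f"<!-- Source: {name} -->\n{text.strip()}"
        for name, text in pages
    ]
    # Pages are separated by a Markdown horizontal rule.
    return "\n\n---\n\n".join(sections) + "\n"
```

Because the source marker is an HTML comment, it stays invisible in rendered Markdown while still letting a reader trace each section back to its image.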
# Install dependencies
make install
# Run linting
make lint
# Format code
make format
# Type check
make typecheck
# Security checks
make security
# Full pipeline
make pipeline
See ARCHITECTURE.md for detailed documentation.
MIT
Built with assistance from AI coding tools and reviewed by humans.