Local text-to-speech using Kokoro TTS
Converts text to speech locally using Kokoro-82M model. Use when you need offline TTS without API keys, especially for generating audiobooks or long documents.
/plugin marketplace add dnvriend/kokoro-tts-tool
/plugin install kokoro-tts-tool@kokoro-tts-tool
This skill inherits all available tools. When active, it can use any tool Claude has access to.
This skill provides access to the kokoro-tts-tool CLI for local text-to-speech synthesis using the Kokoro-82M model. It runs entirely on-device with ONNX Runtime and is optimized for Apple Silicon.
Use this skill when:
Do NOT use this skill for:
Local text-to-speech CLI using Kokoro-82M (82 million parameters).
# Clone and install
git clone https://github.com/dnvriend/kokoro-tts-tool.git
cd kokoro-tts-tool
uv tool install .
# Initialize (downloads ~350MB models)
kokoro-tts-tool init
# Synthesize text to speakers
kokoro-tts-tool synthesize "Hello world"
# Save to file
kokoro-tts-tool synthesize "Hello" --output speech.wav
# Stream a document
kokoro-tts-tool infinite --input book.md
Downloads the Kokoro ONNX model (~300MB) and voice embeddings (~50MB).
Usage:
kokoro-tts-tool init [OPTIONS]
Options:
--force, -f: Re-download models even if they exist
Examples:
# Download models (skips if already present)
kokoro-tts-tool init
# Force re-download
kokoro-tts-tool init --force
Synthesizes text using the Kokoro TTS model. Audio can be played through speakers or saved to file.
Usage:
kokoro-tts-tool synthesize [TEXT] [OPTIONS]
Arguments:
TEXT: Text to synthesize (optional if using --stdin)
Options:
--stdin, -s: Read text from stdin
--voice, -v VALUE: Voice ID (default: af_heart)
--output, -o PATH: Save to WAV file
--speed FLOAT: Speech speed 0.5-2.0 (default: 1.0)
--silence INT: Trailing silence in ms (default: 200)
Examples:
# Play text with default voice
kokoro-tts-tool synthesize "Hello world"
# Use different voice
kokoro-tts-tool synthesize "Hello" --voice am_adam
# Save to file
kokoro-tts-tool synthesize "Hello" --output speech.wav
# Read from stdin
echo "Hello world" | kokoro-tts-tool synthesize --stdin
# Adjust speed
kokoro-tts-tool synthesize "Hello" --speed 1.5
# Multiple options
cat article.txt | kokoro-tts-tool synthesize --stdin \
--voice bf_emma \
--output article.wav \
--speed 0.9
Output: Audio played through speakers (default) or saved as WAV file (24kHz, mono, 16-bit).
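The stdin and file-output options above can be combined to convert a text file to a WAV file of the same name. The following is a minimal sketch, not part of the tool: `tts_file` is a hypothetical helper, and `KOKORO_BIN` is an assumed override variable used here only so the command name can be swapped out.

```shell
# Hypothetical helper: render one text file to a WAV file beside it.
# KOKORO_BIN is an assumed override hook, not a feature of kokoro-tts-tool.
tts_file() (
    src="$1"
    voice="${2:-af_heart}"      # default voice as documented above
    out="${src%.*}.wav"         # notes.txt -> notes.wav
    "${KOKORO_BIN:-kokoro-tts-tool}" synthesize --stdin \
        --voice "$voice" --output "$out" < "$src" || exit 1
    printf '%s\n' "$out"        # report the file that was written
)

# Usage (not run here):
#   tts_file article.txt bf_emma
```

The function prints the output path only when the synthesis command succeeds, so it can be chained with other commands in a script.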
Reads markdown or plain text, splits intelligently into chunks, and streams to speakers or renders to file.
Usage:
kokoro-tts-tool infinite [OPTIONS]
Options:
--input, -i PATH: Input text/markdown file
--stdin, -s: Read text from stdin
--output, -o PATH: Save to WAV file (fast offline mode)
--voice VALUE: Voice ID (default: af_heart)
--speed FLOAT: Speech speed 0.5-2.0 (default: 1.0)
--chunk-size INT: Target words per chunk 50-1000 (default: 200)
--pause INT: Pause between chunks in ms 0-2000 (default: 150)
--no-markdown: Treat input as plain text
Examples:
# Stream to speakers
kokoro-tts-tool infinite --input book.md
# Render to WAV (fast, ~2-3min for 1hr audio)
kokoro-tts-tool infinite --input book.md --output audiobook.wav
# Pipe from stdin
cat chapter.md | kokoro-tts-tool infinite --stdin
# With custom voice and speed
kokoro-tts-tool infinite --input notes.md \
--voice am_adam \
--speed 1.2
# Render audiobook with narrator voice
kokoro-tts-tool infinite --input book.md \
--output book.wav \
--voice bm_george \
--speed 0.95
# Default-size chunks with longer pauses for studying
kokoro-tts-tool infinite --input study.md \
--chunk-size 200 \
--pause 600
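The documented ranges for --speed, --chunk-size, and --pause can be checked in a wrapper script before a long render starts. This is a sketch under that assumption only: the function names are hypothetical, and the tool itself may validate these values differently.

```shell
# Hypothetical pre-flight checks for the option ranges documented above.

in_int_range() {
    # usage: in_int_range VALUE MIN MAX
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-digit input
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

in_float_range() {
    # usage: in_float_range VALUE MIN MAX (awk does the floating-point comparison)
    awk -v v="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}

check_infinite_opts() {
    # usage: check_infinite_opts SPEED CHUNK_SIZE PAUSE
    in_float_range "$1" 0.5 2.0 || { echo "speed out of range: $1" >&2; return 1; }
    in_int_range "$2" 50 1000   || { echo "chunk-size out of range: $2" >&2; return 1; }
    in_int_range "$3" 0 2000    || { echo "pause out of range: $3" >&2; return 1; }
}

# Usage (not run here):
#   check_infinite_opts 0.95 200 150 &&
#       kokoro-tts-tool infinite --input book.md --speed 0.95 --chunk-size 200 --pause 150
```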
Output:
Lists voice information including ID, name, gender, accent, quality grade, and description.
Usage:
kokoro-tts-tool list-voices [OPTIONS]
Options:
--language, -l VALUE: Filter by language (English, Japanese, etc.)
--gender, -g VALUE: Filter by gender (Male, Female)
--json: Output as JSON for scripting
Examples:
# List all voices
kokoro-tts-tool list-voices
# Filter by language
kokoro-tts-tool list-voices --language English
# Filter by gender
kokoro-tts-tool list-voices --gender Female
# Combined filters
kokoro-tts-tool list-voices --language English --gender Male
# JSON output for scripting
kokoro-tts-tool list-voices --json
Voice ID Format:
[language][gender]_[name]
Quality Grades:
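The voice ID format [language][gender]_[name] can be split into its parts in a script. The sketch below assumes single-character language and gender codes, which is inferred from the IDs used in this document (af_heart, am_adam, bf_emma, bm_george) rather than stated by the tool. `parse_voice_id` is a hypothetical helper.

```shell
# Hypothetical helper: split a voice ID into "LANGUAGE_CODE GENDER_CODE NAME".
# Assumes one lowercase letter each for language and gender (an inference
# from IDs such as af_heart, not a documented guarantee).
parse_voice_id() (
    id="$1"
    case "$id" in
        [a-z][a-z]_?*) ;;          # two letters, underscore, non-empty name
        *) exit 1 ;;               # anything else is rejected
    esac
    lang_code=$(printf '%s' "$id" | cut -c1)
    gender_code=$(printf '%s' "$id" | cut -c2)
    name=${id#??_}                 # strip the two-letter prefix and underscore
    printf '%s %s %s\n' "$lang_code" "$gender_code" "$name"
)

# Usage:
#   parse_voice_id af_heart
```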
Shows information about the Kokoro TTS installation.
Usage:
kokoro-tts-tool info
Examples:
kokoro-tts-tool info
Output:
Generate shell completion scripts for bash, zsh, or fish.
Usage:
kokoro-tts-tool completion SHELL
Arguments:
SHELL: Shell type (bash, zsh, fish)
Examples:
# Bash (add to ~/.bashrc)
eval "$(kokoro-tts-tool completion bash)"
# Zsh (add to ~/.zshrc)
eval "$(kokoro-tts-tool completion zsh)"
# Fish
kokoro-tts-tool completion fish > ~/.config/fish/completions/kokoro-tts-tool.fish
</details>
<details>
<summary><strong>⚙️ Advanced Features (Click to expand)</strong></summary>
Control logging detail with progressive verbosity levels. All logs output to stderr.
Logging Levels:
| Flag | Level | Output | Use Case |
|---|---|---|---|
| (none) | WARNING | Errors and warnings only | Production, quiet mode |
| -v | INFO | + High-level operations | Normal debugging |
| -vv | DEBUG | + Detailed info, full tracebacks | Development |
| -vvv | TRACE | + Library internals | Deep debugging |
Examples:
# INFO level
kokoro-tts-tool -v synthesize "Hello"
# DEBUG level
kokoro-tts-tool -vv infinite --input book.md
# TRACE level
kokoro-tts-tool -vvv synthesize "Hello"
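Because all logs go to stderr, they can be redirected to a file without touching anything the command writes to stdout. A minimal sketch follows. `run_with_log` is a hypothetical wrapper, and `KOKORO_BIN` is an assumed override variable, not something the tool reads.

```shell
# Hypothetical wrapper: run a kokoro-tts-tool command at DEBUG verbosity (-vv)
# and keep its stderr log in a file.
run_with_log() (
    log="$1"
    shift
    # The verbosity flag goes before the subcommand, as in the examples above.
    "${KOKORO_BIN:-kokoro-tts-tool}" -vv "$@" 2> "$log"
)

# Usage (not run here):
#   run_with_log render.log infinite --input book.md --output book.wav
```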
Compose commands with Unix pipes for workflows.
Examples:
# Get voice IDs as JSON and filter
kokoro-tts-tool list-voices --json | jq '.[].id'
# Read from another command
cat document.md | kokoro-tts-tool infinite --stdin
# Chain with file processing
find . -name "*.md" -exec cat {} \; | kokoro-tts-tool infinite --stdin
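The find example above reads files in whatever order find returns them, which may not match chapter order. One possible variant, sketched here, sorts the paths first and separates files with a blank line. `concat_sorted` is a hypothetical helper, and it assumes file names contain no newline characters.

```shell
# Hypothetical helper: print all .md files under a directory in sorted path
# order, with a blank line after each file so paragraphs do not run together.
# Assumes file names contain no newlines.
concat_sorted() (
    dir="$1"
    find "$dir" -type f -name '*.md' | LC_ALL=C sort | while IFS= read -r f; do
        cat "$f"
        printf '\n'
    done
)

# Usage (not run here):
#   concat_sorted ./book | kokoro-tts-tool infinite --stdin
```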
</details>
<details>
<summary><strong>🔧 Troubleshooting (Click to expand)</strong></summary>
Issue: Command not found
# Verify installation
kokoro-tts-tool --version
# Reinstall if needed
cd kokoro-tts-tool
uv tool install . --reinstall
Issue: Models not downloaded
# Initialize models
kokoro-tts-tool init
# Force re-download
kokoro-tts-tool init --force
Issue: Audio not playing
Save to a file with --output test.wav to confirm synthesis works, and rerun with -vv for debug logs.
Issue: Voice not found
# List available voices
kokoro-tts-tool list-voices
# Check voice ID format
kokoro-tts-tool list-voices --json | jq '.[].id'
# General help
kokoro-tts-tool --help
# Command-specific help
kokoro-tts-tool synthesize --help
kokoro-tts-tool infinite --help
</details>
0: Success
1: Error (validation, runtime, or unexpected)
Default Output:
File Output (--output):
JSON Output (--json on list-voices):
Run kokoro-tts-tool init before synthesis
Use --output for consistent results
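The exit codes and the init-before-synthesis advice can be combined in a small wrapper. This is a sketch rather than a prescribed workflow: `speak_or_report` is a hypothetical function, and `KOKORO_BIN` is an assumed override variable.

```shell
# Hypothetical wrapper: make sure models are present, then synthesize and
# pass the tool's exit code through (0 on success, non-zero on error).
speak_or_report() (
    tool="${KOKORO_BIN:-kokoro-tts-tool}"
    # init skips the download when models already exist.
    "$tool" init || { echo "init failed" >&2; exit 1; }
    "$tool" synthesize "$@" || {
        status=$?
        echo "synthesis failed with exit code $status" >&2
        exit "$status"
    }
)

# Usage (not run here):
#   speak_or_report "Hello world" --output hello.wav
```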