CLI tool for OpenAI text-to-speech synthesis
Converts text to natural-sounding speech using OpenAI's TTS API. Use when you need to synthesize audio from text, list available voices/models, or verify API configuration.
/plugin marketplace add dnvriend/openai-tts-tool
/plugin install openai-tts-tool@openai-tts-tool
This skill inherits all available tools. When active, it can use any tool Claude has access to.
A comprehensive CLI utility for text-to-speech synthesis using OpenAI's advanced TTS models. Supports multiple voices, languages, and output formats with professional-grade audio quality.
Use this skill when:
Do NOT use this skill for:
A modern Python CLI tool for accessing OpenAI's Text-to-Speech API with comprehensive features including multiple voice models, configurable output formats, and extensive customization options.
# Clone repository
git clone https://github.com/dnvriend/openai-tts-tool.git
cd openai-tts-tool
# Install with mise
mise use -g python@3.14
uv sync
uv tool install .
# Or run locally
uv run openai-tts-tool --help
mise for Python version management (recommended)
uv package manager for modern Python dependency management
# Basic text-to-speech conversion
openai-tts-tool synthesize "Hello, world!"
# Check API configuration
openai-tts-tool info
# List available voices
openai-tts-tool list-voices
The primary command for converting text input into high-quality audio files using OpenAI's TTS models.
Usage:
openai-tts-tool synthesize TEXT [OPTIONS]
Arguments:
TEXT: The text content to convert to speech (required). Can be a single word, sentence, or full paragraph.
Options:
--voice VOICE / -V VOICE: Select voice model (default: alloy)
--model MODEL / -m MODEL: Choose TTS model (default: tts-1)
  tts-1: Standard quality, lower latency
  tts-1-hd: Higher quality, slightly higher cost
--output FILE / -o FILE: Output audio file path (default: output.mp3)
--speed SPEED / -s SPEED: Speech playback speed (default: 1.0)
Examples:
# Basic usage with default settings
openai-tts-tool synthesize "Welcome to our presentation"
# Use specific voice and output file
openai-tts-tool synthesize "Chapter 1: Introduction" \
--voice nova \
--output chapter1.mp3
# High-quality model with slower speech
openai-tts-tool synthesize "Important safety information" \
--model tts-1-hd \
--speed 0.8 \
--voice onyx
# Multiple sentences for narration
openai-tts-tool synthesize "In today's lecture, we will explore the fascinating world of artificial intelligence and its impact on modern society." \
--voice shimmer \
--output narration.mp3
# Batch processing (loop in shell)
for text in "Hello" "Goodbye" "Thank you"; do
openai-tts-tool synthesize "$text" --output "${text}.mp3"
done
Output: Generates an audio file in the specified format containing the synthesized speech. The file quality and characteristics depend on the selected model and voice options.
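The batch loop above names each file after the raw input text, which gets awkward once the text contains spaces or punctuation. A minimal sketch of one way to derive a safe file stem first; `slugify` is a hypothetical helper defined here, not something the tool provides:

```shell
# Hypothetical helper (not part of openai-tts-tool): turn arbitrary text into
# a safe file stem. Lowercases the text, collapses every run of characters
# outside a-z0-9 into a single hyphen, then trims leading/trailing hyphens.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed -e 's/^-*//' -e 's/-*$//'
}

# Possible use in the batch loop (left commented so the helper can be
# sourced without the CLI installed):
# for text in "Hello" "Goodbye" "Thank you"; do
#   openai-tts-tool synthesize "$text" --output "$(slugify "$text").mp3"
# done
```

With this helper, "Thank you" would be written to `thank-you.mp3` rather than a filename containing a space.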
List all available OpenAI TTS voice models with their characteristics and language support.
Usage:
openai-tts-tool list-voices [OPTIONS]
Options:
--format FORMAT / -f FORMAT: Output display format
  table: Human-readable table format (default)
  json: Machine-readable JSON format for scripting
Examples:
# Show voices in table format
openai-tts-tool list-voices
# Export voice information to JSON
openai-tts-tool list-voices --format json > voices.json
# Filter for English voices using jq
openai-tts-tool list-voices --format json | \
jq '.[] | select(.language | startswith("en"))'
# Get voice descriptions only
openai-tts-tool list-voices --format json | \
jq -r '.[] | "\(.name): \(.description)"'
Output: Returns a list of available voices including:
Show all available OpenAI TTS models with their specifications and capabilities.
Usage:
openai-tts-tool list-models [OPTIONS]
Options:
--format FORMAT / -f FORMAT: Output display format
  table: Human-readable table format (default)
  json: Machine-readable JSON format for scripting
Examples:
# Show models in table format
openai-tts-tool list-models
# Export model information to JSON
openai-tts-tool list-models --format json > models.json
# Compare model capabilities
openai-tts-tool list-models --format json | \
jq '.[] | {name: .name, quality: .quality, latency: .latency}'
# Get HD model information only
openai-tts-tool list-models --format json | \
jq '.[] | select(.name | contains("hd"))'
Output: Returns available TTS models including:
Display current configuration, API key status, and system information.
Usage:
openai-tts-tool info [OPTIONS]
Options:
--format FORMAT / -f FORMAT: Output display format
  table: Human-readable table format (default)
  json: Machine-readable JSON format for scripting
Examples:
# Show basic system information
openai-tts-tool info
# Check API key status for automation
openai-tts-tool info --format json | jq '.api_key.status'
# Verify configuration before batch processing
if openai-tts-tool info --format json | jq -e '.api_key.valid'; then
echo "API key valid, starting batch processing..."
else
echo "API key invalid, please check configuration"
fi
# Get version information
openai-tts-tool info --format json | jq '.version'
Output: Returns system configuration including:
Generate shell completion scripts to enable tab completion for bash, zsh, and fish shells.
Usage:
openai-tts-tool completion SHELL
Arguments:
SHELL: Target shell type (required)
  bash: Generate bash completion script
  zsh: Generate zsh completion script
  fish: Generate fish completion script
Examples:
# Generate bash completion and eval immediately
eval "$(openai-tts-tool completion bash)"
# Save completion to permanent file
openai-tts-tool completion zsh > ~/.oh-my-zsh/completions/_openai-tts-tool
# Install fish completion
openai-tts-tool completion fish > ~/.config/fish/completions/openai-tts-tool.fish
# Add to bashrc for permanent installation
echo 'eval "$(openai-tts-tool completion bash)"' >> ~/.bashrc
# Test completion generation
openai-tts-tool completion bash | head -10
Output: Returns a shell-specific completion script that provides:
The tool supports progressive verbosity levels for debugging and monitoring:
# Default (WARNING level) - quiet operation
openai-tts-tool synthesize "Hello"
# INFO level (-v) - high-level operations
openai-tts-tool -v synthesize "Hello"
# DEBUG level (-vv) - detailed debugging with line numbers
openai-tts-tool -vv synthesize "Hello"
# TRACE level (-vvv) - includes OpenAI and HTTP library internals
openai-tts-tool -vvv synthesize "Hello"
Configure the tool using environment variables:
# Set OpenAI API key
export OPENAI_API_KEY="sk-..."
# Set default output directory
export OPENAI_TTS_OUTPUT_DIR="./audio"
# Set default voice preference
export OPENAI_TTS_VOICE="nova"
# Enable verbose logging by default
export OPENAI_TTS_VERBOSE=2
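Because the API key comes from the environment, a script can fail fast when the variable is missing instead of waiting for the first synthesis call to error out. A small sketch, assuming only that `OPENAI_API_KEY` is the variable the tool reads (as shown above); `require_api_key` is a hypothetical helper name. It checks only that the key is present, not that it is valid.

```shell
# Hypothetical pre-flight helper: succeed only when OPENAI_API_KEY is set
# and non-empty. Presence check only; it does not contact the API.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set; export it before running openai-tts-tool" >&2
    return 1
  fi
  return 0
}

# Possible use (commented so the helper can be sourced on its own):
# require_api_key && openai-tts-tool info
```

Validating the key itself still needs a call such as `openai-tts-tool info`.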
Generate audio in multiple formats based on file extension:
# MP3 format (default, compressed)
openai-tts-tool synthesize "Hello" --output speech.mp3
# WAV format (uncompressed, high quality)
openai-tts-tool synthesize "Hello" --output speech.wav
# OGG format (open source, compressed)
openai-tts-tool synthesize "Hello" --output speech.ogg
# FLAC format (lossless compression)
openai-tts-tool synthesize "Hello" --output speech.flac
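Since the format is inferred from the file extension, a typo in the extension is easy to miss in a script. A minimal sketch of a guard that accepts only the four extensions shown above; `is_supported_output` is a hypothetical helper, and whether the tool accepts other extensions is not stated here, so the list is an assumption drawn from these examples.

```shell
# Hypothetical guard: succeed only for the extensions used in the examples
# above (mp3, wav, ogg, flac). The match is case-sensitive and requires a
# dot, so a bare name like "mp3" is rejected.
is_supported_output() {
  case "$1" in
    *.mp3|*.wav|*.ogg|*.flac) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible use (commented so the helper can be sourced on its own):
# out="speech.flac"
# is_supported_output "$out" && openai-tts-tool synthesize "Hello" --output "$out"
```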
Different voices are optimized for different use cases:
# alloy - balanced, neutral tone (default)
openai-tts-tool synthesize "Factual information" --voice alloy
# echo - conversational, friendly
openai-tts-tool synthesize "Welcome message" --voice echo
# fable - storytelling, expressive
openai-tts-tool synthesize "Once upon a time" --voice fable
# onyx - professional, deep voice
openai-tts-tool synthesize "Business presentation" --voice onyx
# nova - bright, energetic
openai-tts-tool synthesize "Exciting announcement" --voice nova
# shimmer - gentle, soothing
openai-tts-tool synthesize "Meditation guide" --voice shimmer
</details>
<details>
<summary><strong>🔧 Troubleshooting (Click to expand)</strong></summary>
Issue: Invalid OpenAI API key
# Symptom
Error: Invalid OpenAI API key
# Solution
export OPENAI_API_KEY="sk-your-valid-api-key-here"
openai-tts-tool info # Verify the key works
Issue: Audio file not created
# Symptom
Command completes but no output file
# Solution with verbose output
openai-tts-tool -vv synthesize "test" --output test.mp3
# Check directory permissions
ls -la "$(pwd)"
touch test_write.tmp && rm test_write.tmp
Issue: Network connectivity problems
# Symptom
Connection timeout or network errors
# Solution with trace logging
openai-tts-tool -vvv synthesize "test"
# Test basic connectivity
curl -I https://api.openai.com/v1/models
Issue: Voice not recognized
# Symptom
Error: Voice 'xyz' not supported
# Solution - list available voices
openai-tts-tool list-voices
# Use a valid voice
openai-tts-tool synthesize "test" --voice alloy
Issue: Audio quality poor
# Symptom
Audio sounds robotic or low quality
# Solution - use HD model
openai-tts-tool synthesize "test" --model tts-1-hd
# Adjust speed for natural speech
openai-tts-tool synthesize "test" --speed 0.9
# Show main help
openai-tts-tool --help
# Show command-specific help
openai-tts-tool synthesize --help
openai-tts-tool list-voices --help
# Check version and configuration
openai-tts-tool info --format json
# Test API connectivity
openai-tts-tool info --verbose
# Use tts-1 for faster processing (lower quality)
openai-tts-tool synthesize "quick test" --model tts-1
# Use tts-1-hd for better quality (slower)
openai-tts-tool synthesize "final version" --model tts-1-hd
# Batch process multiple texts efficiently
for file in *.txt; do
openai-tts-tool synthesize "$(cat "$file")" --output "${file%.txt}.mp3"
done
</details>
0: Success - operation completed successfully
1: Client Error - invalid arguments, missing API key, file not found
2: Server Error - OpenAI API server issues, rate limiting
3: Network Error - connectivity problems, timeouts
4: Configuration Error - invalid setup, permissions issues
Audio Formats:
Data Formats:
API Key Security: Never commit API keys to version control. Use environment variables or secure credential storage.
Voice Selection: Test different voices with your content type - some voices work better for specific content (e.g., onyx for professional content, fable for storytelling).
Quality vs Speed: Use tts-1 for testing/prototyping and tts-1-hd for final production audio.
File Organization: Use descriptive filenames and organize output files by project or content type.
Batch Processing: For multiple files, use shell scripting to process efficiently and handle errors.
Speed Optimization: Adjust speed between 0.8-1.2 for most natural speech; extreme speeds may sound artificial.
Text Preparation: Clean input text of special characters and formatting for best synthesis results.
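The exit codes listed earlier make the "handle errors" advice for batch processing concrete: codes 2 (server) and 3 (network) are plausibly transient and worth retrying, while 1 and 4 usually need a fix first. The following is a sketch under that assumption, not behavior the tool implements. `describe_exit`, `with_retry`, and the `RETRY_DELAY` variable are hypothetical names introduced for illustration.

```shell
# Map the documented exit codes to short labels for log messages.
describe_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "client error" ;;
    2) echo "server error" ;;
    3) echo "network error" ;;
    4) echo "configuration error" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Run a command, retrying only on exit codes 2 and 3.
# Usage: with_retry MAX_ATTEMPTS COMMAND [ARGS...]
# Waits RETRY_DELAY seconds (default 2) between attempts; fixed delay, no backoff.
with_retry() {
  _max=$1; shift
  _attempt=1
  while :; do
    if "$@"; then
      return 0
    else
      _status=$?
    fi
    case "$_status" in
      2|3) ;;                       # treated as transient: retry
      *) return "$_status" ;;       # anything else: give up immediately
    esac
    if [ "$_attempt" -ge "$_max" ]; then
      return "$_status"
    fi
    sleep "${RETRY_DELAY:-2}"
    _attempt=$((_attempt + 1))
  done
}

# Possible use in a batch loop (commented so the helpers can be sourced alone):
# for file in *.txt; do
#   with_retry 3 openai-tts-tool synthesize "$(cat "$file")" --output "${file%.txt}.mp3" \
#     || echo "$file failed: $(describe_exit $?)" >&2
# done
```

Retrying only on codes 2 and 3 keeps a bad argument or a missing key from being retried pointlessly.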