AWS Polly TTS CLI for text-to-speech synthesis
Convert text to lifelike speech using AWS Polly's neural, generative, and long-form engines. Use when you need to synthesize audio, explore voices, or track TTS costs.
/plugin marketplace add dnvriend/aws-polly-tts-tool
/plugin install aws-polly-tts-tool@aws-polly-tts-tool
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Professional AWS Polly text-to-speech CLI and library with agent-friendly design, enabling conversion of text to lifelike speech using Amazon Polly's deep learning technology. Supports 60+ voices in 30+ languages across four quality tiers with comprehensive cost tracking.
Use this skill when:
Do NOT use this skill for:
Professional AWS Polly TTS CLI and Python library, designed with a CLI-first philosophy for both command-line and programmatic use.
# Clone repository
git clone https://github.com/dnvriend/aws-polly-tts-tool.git
cd aws-polly-tts-tool
# Install with uv (Python 3.12)
uv tool install . --python 3.12
# Verify installation
aws-polly-tts-tool --version
IAM permissions: polly:DescribeVoices, polly:SynthesizeSpeech, ce:GetCostAndUsage
# Play text with default voice
aws-polly-tts-tool synthesize "Hello world"
# Save to file
aws-polly-tts-tool synthesize "Hello world" --output speech.mp3
# List available voices
aws-polly-tts-tool list-voices
# Show pricing
aws-polly-tts-tool pricing
Main TTS command with full feature support including multiple engines, voices, and output formats.
Usage:
aws-polly-tts-tool synthesize "TEXT" [OPTIONS]
Arguments:
TEXT: Text to synthesize (required, or use --stdin)
--stdin / -s: Read text from stdin (enables piping)
--voice TEXT: Voice ID (default: Joanna)
--output PATH / -o PATH: Save audio to file instead of playing
--format TEXT / -f TEXT: Output format (mp3, ogg_vorbis, pcm) - default: mp3
--engine TEXT / -e TEXT: Voice engine (standard, neural, generative, long-form) - default: neural
--ssml: Treat input as SSML markup
--show-cost: Display character count and cost estimate
--region TEXT / -r TEXT: AWS region override
-V/-VV/-VVV: Verbosity (INFO/DEBUG/TRACE with AWS SDK details)
Examples:
# Basic synthesis with default voice (Joanna, neural)
aws-polly-tts-tool synthesize "Hello world"
# Use different voice and engine
aws-polly-tts-tool synthesize "Hello" --voice Matthew --engine generative
# Save to file with specific format
aws-polly-tts-tool synthesize "Hello world" --output speech.mp3 --format mp3
# Read from stdin
echo "Hello world" | aws-polly-tts-tool synthesize --stdin
# Read from file
cat article.txt | aws-polly-tts-tool synthesize --stdin --output article.mp3
# Use SSML for advanced control
aws-polly-tts-tool synthesize '<speak>Hello <break time="500ms"/> world</speak>' --ssml
# Show cost estimate
aws-polly-tts-tool synthesize "Hello world" --show-cost
# Multiple options combined with debugging
cat article.txt | aws-polly-tts-tool synthesize --stdin \
--voice Joanna \
--engine neural \
--output article.mp3 \
--show-cost \
-VV
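When the CLI is driven from a script rather than a shell pipeline, the same flags can be assembled programmatically. The following is a minimal sketch, not part of the tool itself: `build_synthesize_command` and `synthesize_text` are hypothetical helper names, and the sketch assumes `aws-polly-tts-tool` is on the PATH. It uses only the flags documented above.

```python
import subprocess
from typing import List, Optional


def build_synthesize_command(
    voice: str = "Joanna",
    engine: str = "neural",
    output: Optional[str] = None,
    show_cost: bool = False,
) -> List[str]:
    """Assemble an argv list that makes the CLI read its text from stdin."""
    cmd = ["aws-polly-tts-tool", "synthesize", "--stdin",
           "--voice", voice, "--engine", engine]
    if output is not None:
        cmd += ["--output", output]
    if show_cost:
        cmd.append("--show-cost")
    return cmd


def synthesize_text(text: str, **options) -> subprocess.CompletedProcess:
    """Run the CLI and pipe `text` to its stdin (hypothetical helper)."""
    return subprocess.run(
        build_synthesize_command(**options),
        input=text, text=True, check=True,
    )


# Only builds and prints the command; nothing is sent to AWS here.
print(build_synthesize_command(output="article.mp3", show_cost=True))
```

Passing the text on stdin rather than as an argument avoids shell quoting problems with long or multi-line input.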
Output: With --show-cost, the character count and cost estimate are displayed.
List and filter AWS Polly voices by engine, language, and gender.
Usage:
aws-polly-tts-tool list-voices [OPTIONS]
Options:
--engine TEXT / -e TEXT: Filter by engine (standard, neural, generative, long-form)
--language TEXT / -l TEXT: Filter by language code (e.g., en-US, es-ES, fr-FR)
--gender TEXT / -g TEXT: Filter by gender (Female, Male)
--region TEXT / -r TEXT: AWS region override
-V/-VV/-VVV: Verbosity levels
Examples:
# List all voices
aws-polly-tts-tool list-voices
# Filter by engine
aws-polly-tts-tool list-voices --engine neural
# Filter by language
aws-polly-tts-tool list-voices --language en-US
# Combine filters
aws-polly-tts-tool list-voices --engine neural --language en --gender Female
# Use with grep for searching
aws-polly-tts-tool list-voices | grep British
aws-polly-tts-tool list-voices --engine generative | grep Spanish
Output: Table with Voice, Gender, Language, Engines (supported), and Description columns. The voice list is fetched dynamically from the Polly API, so it is always up to date.
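The filters above can be pictured as a simple pass over voice records. This is an illustrative sketch, not the tool's actual implementation: `filter_voices` is a hypothetical function, the sample records only mimic the shape of entries in the Polly DescribeVoices response, and the prefix match on the language code is an assumption based on the `--language en` example.

```python
from typing import Dict, List, Optional


def filter_voices(
    voices: List[Dict],
    engine: Optional[str] = None,
    language: Optional[str] = None,
    gender: Optional[str] = None,
) -> List[Dict]:
    """Keep only the voices that match every filter that was supplied."""
    selected = []
    for voice in voices:
        if engine and engine not in voice.get("SupportedEngines", []):
            continue
        # Prefix match, so "en" matches "en-US" and "en-GB" (an assumption).
        code = voice.get("LanguageCode", "").lower()
        if language and not code.startswith(language.lower()):
            continue
        if gender and voice.get("Gender", "").lower() != gender.lower():
            continue
        selected.append(voice)
    return selected


# Illustrative records only; real data comes from the Polly API.
sample = [
    {"Id": "Joanna", "Gender": "Female", "LanguageCode": "en-US",
     "SupportedEngines": ["neural", "standard"]},
    {"Id": "Matthew", "Gender": "Male", "LanguageCode": "en-US",
     "SupportedEngines": ["neural", "standard"]},
    {"Id": "Lucia", "Gender": "Female", "LanguageCode": "es-ES",
     "SupportedEngines": ["neural", "standard"]},
]

matches = filter_voices(sample, engine="neural", language="en", gender="Female")
print([v["Id"] for v in matches])  # prints ['Joanna']
```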
Show all available voice engines with technology, pricing, and best use cases.
Usage:
aws-polly-tts-tool list-engines
Examples:
# Show all engines with details
aws-polly-tts-tool list-engines
Output: Table showing:
Query AWS Cost Explorer for actual Polly usage costs with engine breakdown.
Usage:
aws-polly-tts-tool billing [OPTIONS]
Options:
--days INT / -d INT: Number of days to query (default: 30)
--start-date TEXT: Custom start date (YYYY-MM-DD)
--end-date TEXT: Custom end date (YYYY-MM-DD)
--region TEXT / -r TEXT: AWS region for Cost Explorer
-V/-VV/-VVV: Verbosity levels
Examples:
# Last 30 days of Polly costs
aws-polly-tts-tool billing
# Last 7 days
aws-polly-tts-tool billing --days 7
# Custom date range
aws-polly-tts-tool billing --start-date 2025-01-01 --end-date 2025-01-31
# With verbose output
aws-polly-tts-tool billing --days 7 -V
Output: Total cost and breakdown by engine (Standard, Neural, Generative, Long-form) in USD.
Note: Requires IAM permission ce:GetCostAndUsage
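For readers who want to see what such a Cost Explorer query might look like, here is a hedged sketch. It is not taken from the tool's source: `build_polly_cost_query` and `fetch_polly_costs` are hypothetical names, the `"Amazon Polly"` service value and the grouping by usage type are assumptions, and the actual call requires boto3, valid credentials, and the `ce:GetCostAndUsage` permission.

```python
from datetime import date, timedelta
from typing import Optional


def build_polly_cost_query(days: int = 30, today: Optional[date] = None) -> dict:
    """Build keyword arguments for Cost Explorer's get_cost_and_usage call."""
    end = today or date.today()
    start = end - timedelta(days=days)
    return {
        # Cost Explorer treats Start as inclusive and End as exclusive.
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        # Assumed service name for Polly in the SERVICE dimension.
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Polly"]}},
        "GroupBy": [{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    }


def fetch_polly_costs(days: int = 30) -> dict:
    """Send the query; needs boto3, AWS credentials and ce:GetCostAndUsage."""
    import boto3  # imported lazily so the query builder works without boto3

    client = boto3.client("ce")
    return client.get_cost_and_usage(**build_polly_cost_query(days))


query = build_polly_cost_query(days=7, today=date(2025, 1, 31))
print(query["TimePeriod"])  # prints {'Start': '2025-01-24', 'End': '2025-01-31'}
```

Keeping the query construction separate from the network call makes the date arithmetic easy to check without touching AWS.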
Display static pricing information for all Polly engines with cost examples.
Usage:
aws-polly-tts-tool pricing
Examples:
# Show pricing table and examples
aws-polly-tts-tool pricing
Output: Comprehensive pricing with:
Display AWS credentials status and tool configuration.
Usage:
aws-polly-tts-tool info
Examples:
# Verify AWS authentication and show config
aws-polly-tts-tool info
Output:
Generate shell completion scripts for bash, zsh, or fish.
Usage:
aws-polly-tts-tool completion [bash|zsh|fish]
Arguments:
SHELL: Shell type (bash, zsh, or fish) - required
Examples:
# Generate bash completion
aws-polly-tts-tool completion bash
# Install for bash (add to ~/.bashrc)
eval "$(aws-polly-tts-tool completion bash)"
# Install for zsh (add to ~/.zshrc)
eval "$(aws-polly-tts-tool completion zsh)"
# Install for fish
aws-polly-tts-tool completion fish > ~/.config/fish/completions/aws-polly-tts-tool.fish
# File-based installation (recommended)
aws-polly-tts-tool completion bash > ~/.aws-polly-tts-tool-complete.bash
echo 'source ~/.aws-polly-tts-tool-complete.bash' >> ~/.bashrc
Output: Shell-specific completion script. After installation, restart the shell or source the config file.
</details>
<details>
<summary><strong>⚙️ Advanced Features (Click to expand)</strong></summary>
Full SSML (Speech Synthesis Markup Language) support for advanced speech control.
Features:
Examples:
# Basic pause
aws-polly-tts-tool synthesize '<speak>Hello <break time="500ms"/> world</speak>' --ssml
# Prosody control (speed, pitch, volume)
aws-polly-tts-tool synthesize '<speak><prosody rate="slow" pitch="low">Deep voice</prosody></speak>' --ssml
# Emphasis
aws-polly-tts-tool synthesize '<speak>I <emphasis level="strong">really</emphasis> like this</speak>' --ssml
# Newscaster style (Matthew, Joanna only)
aws-polly-tts-tool synthesize '<speak><amazon:domain name="news">Breaking news today</amazon:domain></speak>' --ssml --voice Matthew
# Multiple prosody attributes
aws-polly-tts-tool synthesize '<speak><prosody rate="fast" pitch="high" volume="loud">Excited announcement!</prosody></speak>' --ssml
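When SSML is generated from arbitrary text, characters such as `&` and `<` must be escaped or the markup becomes invalid. The sketch below shows one way to build the strings used in the examples above; `ssml_with_pause` and `ssml_prosody` are hypothetical helper names, not part of this tool's API.

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_with_pause(before: str, after: str, pause_ms: int = 500) -> str:
    """Join two text fragments with a <break> of the given length."""
    return (
        f"<speak>{escape(before)} "
        f'<break time="{int(pause_ms)}ms"/> '
        f"{escape(after)}</speak>"
    )


def ssml_prosody(text: str, **attributes: str) -> str:
    """Wrap text in a <prosody> element with the given attributes."""
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in attributes.items()
    )
    return f"<speak><prosody{attrs}>{escape(text)}</prosody></speak>"


print(ssml_with_pause("Hello", "world"))
# prints <speak>Hello <break time="500ms"/> world</speak>
print(ssml_prosody("Tom & Jerry", rate="slow", pitch="low"))
# prints <speak><prosody rate="slow" pitch="low">Tom &amp; Jerry</prosody></speak>
```

The resulting string can then be passed to `synthesize` together with `--ssml`.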
SSML Resources:
Progressive logging detail for debugging without code changes.
Levels:
-V (INFO): High-level operations (voice selection, file operations)
-VV (DEBUG): Detailed steps (validation, API calls, character counts)
-VVV (TRACE): Full AWS SDK internals (credentials, HTTP requests, boto3 events)
Examples:
# Default: No verbose output
aws-polly-tts-tool synthesize "Hello world" --output test.mp3
# INFO level (-V)
aws-polly-tts-tool synthesize "Hello world" -V --output test.mp3
# [INFO] Using voice: Joanna (neural engine)
# [INFO] Synthesizing audio to file: test.mp3
# DEBUG level (-VV)
aws-polly-tts-tool synthesize "Hello world" -VV --output test.mp3
# [DEBUG] Validating engine: neural
# [DEBUG] Validating output format: mp3
# [DEBUG] Initializing AWS Polly client
# [INFO] Using voice: Joanna (neural engine)
# [DEBUG] Synthesized 11 characters
# TRACE level (-VVV) - Full AWS SDK details
aws-polly-tts-tool synthesize "Hello world" -VVV --output test.mp3
# [DEBUG] Looking for credentials via: env
# [INFO] Found credentials in shared credentials file: ~/.aws/credentials
# [DEBUG] Starting new HTTPS connection (1): polly.eu-central-1.amazonaws.com:443
# [DEBUG] https://polly.eu-central-1.amazonaws.com:443 "POST /v1/speech HTTP/1.1" 200
Note: All logs go to stderr, keeping stdout clean for data/piping.
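A common way to implement this kind of progressive verbosity is to map the number of `-V` flags to a logging level and to enable the AWS SDK loggers only at the highest setting. The sketch below is an assumption about how such a scheme could be built with the standard `logging` module, not a description of the tool's internals; the function names and the quiet default of WARNING are hypothetical.

```python
import logging
import sys


def verbosity_to_level(count: int) -> int:
    """Map the number of -V flags to a standard logging level."""
    if count <= 0:
        return logging.WARNING  # assumed quiet default
    if count == 1:
        return logging.INFO
    return logging.DEBUG


def configure_logging(count: int) -> None:
    """Send all log records to stderr so stdout stays free for data."""
    logging.basicConfig(
        stream=sys.stderr,
        level=verbosity_to_level(count),
        format="[%(levelname)s] %(message)s",
        force=True,  # replace any handlers configured earlier
    )
    # Only the highest verbosity turns on the AWS SDK loggers.
    sdk_level = logging.DEBUG if count >= 3 else logging.WARNING
    for name in ("boto3", "botocore", "urllib3"):
        logging.getLogger(name).setLevel(sdk_level)


configure_logging(1)
logging.getLogger("example").info("Using voice: Joanna (neural engine)")
```

Because the handler writes to stderr, a command such as `synthesize ... -V > out.txt` would still capture only the data written to stdout.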
Import and use as a Python library for programmatic access.
Basic Usage:
from pathlib import Path

from aws_polly_tts_tool import (
    get_polly_client,
    synthesize_audio,
    save_speech,
    VoiceManager,
    calculate_cost,
)

# Initialize client
client = get_polly_client(region="us-east-1")

# Synthesize audio (returns the audio bytes and the character count)
audio_bytes, char_count = synthesize_audio(
    client=client,
    text="Hello world",
    voice_id="Joanna",
    output_format="mp3",
    engine="neural"
)

# Save to file
save_speech(
    client=client,
    text="Hello world",
    voice_id="Joanna",
    output_path=Path("output.mp3"),
    engine="neural"
)

# List voices
voice_manager = VoiceManager(client)
voices = voice_manager.list_voices(engine="neural", language="en")

# Calculate cost
cost = calculate_cost(character_count=5000, engine="neural")
print(f"Estimated cost: ${cost:.4f}")
Public API:
get_polly_client(region=None) - Initialize boto3 Polly client
synthesize_audio(client, text, voice_id, output_format, engine, text_type) - Synthesize audio
save_speech(client, text, voice_id, output_path, ...) - Save to file
play_speech(client, text, voice_id, ...) - Play through speakers
VoiceManager(client) - Voice discovery and management
calculate_cost(char_count, engine) - Cost estimation
Standard Engine ($4/1M chars)
Neural Engine ($16/1M chars)
Generative Engine ($30/1M chars)
Long-form Engine ($100/1M chars)
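The per-engine rates above translate directly into a cost estimate: divide the character count by one million and multiply by the engine's rate. The following sketch works through that arithmetic; `estimate_cost` is a hypothetical stand-in, not the library's own `calculate_cost`, and the rates are simply the figures listed above.

```python
# Rates in USD per one million characters, as listed above.
PRICE_PER_MILLION_CHARS = {
    "standard": 4.00,
    "neural": 16.00,
    "generative": 30.00,
    "long-form": 100.00,
}


def estimate_cost(char_count: int, engine: str = "neural") -> float:
    """Estimate the synthesis cost in USD for a number of characters."""
    if engine not in PRICE_PER_MILLION_CHARS:
        raise ValueError(f"Unknown engine: {engine}")
    return char_count / 1_000_000 * PRICE_PER_MILLION_CHARS[engine]


# Cost of a 5,000-character text on each engine.
for engine in PRICE_PER_MILLION_CHARS:
    print(f"{engine}: ${estimate_cost(5000, engine):.4f}")
# standard: $0.0200
# neural: $0.0800
# generative: $0.1500
# long-form: $0.5000
```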
Decision Matrix:
Immediate Estimates:
# Use --show-cost for instant character count and cost
aws-polly-tts-tool synthesize "Text" --show-cost
Actual Billing:
# Query real AWS costs with Cost Explorer
aws-polly-tts-tool billing --days 30
Cost Optimization Tips:
Use the billing command regularly
Cost Examples:
Issue: No AWS credentials found
# Symptom
Error: Unable to locate credentials
Solution:
# Configure AWS credentials
aws configure
# Or set environment variables
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-east-1"
# Verify with
aws-polly-tts-tool info
Issue: Audio playback fails on Python 3.13+
# Symptom
Error: No module named 'audioop'
Solution:
Option 1: Use Python 3.12 (recommended)
mise use python@3.12
uv tool install . --python 3.12
Option 2: Save to file instead (works on all Python versions)
aws-polly-tts-tool synthesize "Hello" --output speech.mp3
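The `audioop` module was removed from the standard library in Python 3.13, which is why playback fails there. A script can check for the module up front and fall back to saving a file. This is a small sketch under that assumption; `playback_supported` is a hypothetical helper, not part of the tool.

```python
import importlib.util
import sys


def playback_supported() -> bool:
    """Return True if the audioop module can be imported in this interpreter."""
    return importlib.util.find_spec("audioop") is not None


version = f"{sys.version_info.major}.{sys.version_info.minor}"
if playback_supported():
    print(f"Python {version}: audioop found, playback should work")
else:
    print(f"Python {version}: audioop missing, use --output to save a file")
```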
Issue: Voice not found
# Symptom
Error: Voice 'invalid' not found
Solution:
# List available voices
aws-polly-tts-tool list-voices
# Filter by engine
aws-polly-tts-tool list-voices --engine neural
# Case-sensitive voice names
aws-polly-tts-tool synthesize "Hello" --voice Joanna # Correct
Issue: Engine not supported by voice
# Symptom
Error: Voice doesn't support this engine
Solution:
# Check which engines a voice supports
aws-polly-tts-tool list-voices | grep "VoiceName"
# Not all voices support all engines
# Example: Standard voices don't support neural engine
Issue: Cost Explorer access denied
# Symptom
Error: AccessDeniedException when calling GetCostAndUsage
Solution:
Add IAM permission ce:GetCostAndUsage:
{
  "Effect": "Allow",
  "Action": ["ce:GetCostAndUsage"],
  "Resource": "*"
}
Issue: Text too long for engine
# Symptom
Error: Text exceeds character limit
Solution:
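One common workaround for per-request length limits is to split the text into smaller chunks, synthesize each chunk separately, and join the resulting audio. The sketch below shows only the splitting step. It is not taken from this tool: `split_text` is a hypothetical helper, and the default of 3000 characters is an assumption that should be checked against the current AWS Polly limits for the engine in use.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 3000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", max_chars=9))  # prints ['One. Two.', 'Three.']
```

Each chunk could then be written to its own file with `synthesize --stdin --output`, keeping every request under the limit.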
# General help
aws-polly-tts-tool --help
# Command-specific help
aws-polly-tts-tool synthesize --help
aws-polly-tts-tool list-voices --help
# Show version
aws-polly-tts-tool --version
# Verify configuration
aws-polly-tts-tool info
Use progressive verbosity to diagnose issues:
# Basic debug info
aws-polly-tts-tool synthesize "Hello" -V
# Detailed debug info
aws-polly-tts-tool synthesize "Hello" -VV
# Full AWS SDK trace
aws-polly-tts-tool synthesize "Hello" -VVV
</details>
Use the billing command to track actual spending
Use list-voices to check engine compatibility before synthesis
Use --output to save important audio for offline use
Use -V/-VV/-VVV when debugging issues