Plugin

speak

Name: speak
Author: blacktop

Hear AI agent plans, resolved issues, and summaries announced aloud automatically via TTS after completion. Assign unique voices per project with fallbacks to Google, OpenAI, ElevenLabs, or system providers for hands-free auditory feedback during development workflows.

Tags: OpenAI, automation, developer-tools

npx claudepluginhub blacktop/mcp-tts --plugin speak

Component Overview

Skills

Component Details

Skills (1)

speak

/skill

Automatically announces plans, issues, and summaries out loud using TTS. Use this skill PROACTIVELY after completing major tasks like finalizing a plan, resolving an issue, or generating a summary. Each project gets a unique voice so users can identify which project is speaking from another room. Providers fall back in order (google, openai, elevenlabs, say) when a rate limit is hit.
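The fallback order described above can be pictured with a small shell sketch. This is only an illustration under stated assumptions: `try_provider`, `speak_with_fallback`, and the `FAILING_PROVIDERS` variable are hypothetical names, and the real skill calls MCP tools rather than shell functions.

```shell
# Hypothetical stand-in for one provider call. A provider listed in
# FAILING_PROVIDERS (an invented variable) simulates a rate limit.
try_provider() {
  tp_name="$1"
  tp_text="$2"
  case " ${FAILING_PROVIDERS:-} " in
    *" $tp_name "*) return 1 ;;
  esac
  printf '%s: %s\n' "$tp_name" "$tp_text"
}

# Try each provider in the documented order and stop at the first success.
speak_with_fallback() {
  sf_text="$1"
  for sf_provider in google openai elevenlabs say; do
    if try_provider "$sf_provider" "$sf_text"; then
      return 0
    fi
  done
  echo "all providers failed" >&2
  return 1
}
```

With `FAILING_PROVIDERS="google openai"`, calling `speak_with_fallback "Plan finalized"` would fall through to the `elevenlabs` stand-in.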

README

mcp-tts

MCP Server for TTS (Text-to-Speech)

What? 🤔

Adds Text-to-Speech to things like Claude Desktop and Cursor IDE.

It registers four TTS tools:

say_tts
elevenlabs_tts
google_tts
openai_tts

`say_tts`

Uses the macOS `say` binary to speak the text with built-in system voices.

`elevenlabs_tts`

Uses the ElevenLabs text-to-speech API to speak the text with premium AI voices.

`google_tts`

Uses Google's Gemini TTS models to speak the text with 30 high-quality voices. Available voices include:

Achernar, Achird, Algenib, Algieba, Alnilam, Aoede, Autonoe, Callirrhoe, Charon, Despina, Enceladus, Erinome, Fenrir, Gacrux, Iapetus, Kore, Laomedeia, Leda, Orus, Puck, Pulcherrima, Rasalgethi, Sadachbia, Sadaltager, Schedar, Sulafat, Umbriel, Vindemiatrix, Zephyr, Zubenelgenubi

`openai_tts`

Uses OpenAI's Text-to-Speech API to speak the text with 11 natural-sounding voices:

alloy (Warm, conversational, modern)
ash (Confident, assertive, slightly textured)
ballad (Gentle, melodious, slightly lyrical)
coral (Cheerful, fresh, upbeat)
echo (Neutral, calm, balanced)
fable (Storyteller-like, expressive)
nova (Clear, precise, slightly formal)
onyx (Deep, authoritative, resonant)
sage (Soothing, empathetic, reassuring)
shimmer (Bright, animated, playful)
verse (Versatile, expressive)

Supports three quality models:

gpt-4o-mini-tts - Default, optimized quality and speed
tts-1 - Standard quality, faster generation
tts-1-hd - High definition audio, premium quality

Additional features:

Speed control from 0.25x to 4.0x (default: 1.0x)
Custom voice instructions (e.g., "Speak in a cheerful and positive tone") via parameter or OPENAI_TTS_INSTRUCTIONS environment variable
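As a rough illustration of the documented speed range, a wrapper script could check a value before handing it to the tool. The function name `valid_speed` is hypothetical and not part of mcp-tts; the sketch only encodes the 0.25 to 4.0 bounds stated above.

```shell
# Succeeds (exit status 0) only for plain decimal numbers in [0.25, 4.0].
valid_speed() {
  awk -v s="$1" 'BEGIN {
    ok = (s ~ /^[0-9]*\.?[0-9]+$/) && (s + 0 >= 0.25) && (s + 0 <= 4.0)
    exit !ok
  }'
}
```

For example, `valid_speed 1.5 && echo "speed ok"` prints the message, while `valid_speed 4.5` returns a non-zero status.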

Configuration

Sequential vs Concurrent TTS

By default, the TTS server runs speech operations sequentially: only one TTS request can play audio at a time. This prevents multiple agents from speaking simultaneously and creating an unintelligible cacophony. Subsequent requests wait in a queue until the current speech completes.

Multi-Instance Protection: The mutex works both within a single MCP server process and across multiple Claude Desktop instances. When running multiple Claude Desktop terminals, they coordinate via a system-wide file lock to prevent overlapping speech.
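To make the cross-process coordination concrete, here is a minimal sketch of a system-wide lock in shell. It is not the server's actual implementation: the lock path and the `with_speech_lock` helper are assumptions, and an atomic `mkdir` is swapped in for whatever file-locking call the server really uses.

```shell
# Hypothetical lock location; the real server's lock file is not documented here.
LOCK_DIR="${TMPDIR:-/tmp}/tts-sketch.lock"

# Run a command while holding the lock, so that only one caller speaks at a time.
with_speech_lock() {
  # mkdir is atomic: exactly one process succeeds in creating the directory.
  until mkdir "$LOCK_DIR" 2>/dev/null; do
    sleep 1   # another process is speaking; wait and retry
  done
  wl_status=0
  "$@" || wl_status=$?
  rmdir "$LOCK_DIR"   # release the lock even if the command failed
  return "$wl_status"
}
```

On macOS, two terminals each running `with_speech_lock say "build finished"` would then speak one after the other instead of overlapping.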

To allow concurrent TTS operations (multiple speeches playing simultaneously):

Environment Variable:

export MCP_TTS_ALLOW_CONCURRENT=true

Command Line Flag:

mcp-tts --sequential-tts=false

Note: Concurrent TTS may result in overlapping audio that's difficult to understand. Use this option only when you explicitly want multiple TTS operations to run simultaneously.

Suppressing "Speaking:" Output

By default, TTS tools return a message like "Speaking: [text]" when speech completes. This can interfere with LLM responses. To suppress this output:

Environment Variable:

export MCP_TTS_SUPPRESS_SPEAKING_OUTPUT=true

Command Line Flag:

mcp-tts --suppress-speaking-output

When enabled, tools return "Speech completed" instead of echoing the spoken text.
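The two return messages can be summarized in a short sketch. The `result_message` function is hypothetical and only mirrors the behavior described above; it does not show how the server reads the setting internally.

```shell
# Print the message a tool would return for the given spoken text.
result_message() {
  if [ "${MCP_TTS_SUPPRESS_SPEAKING_OUTPUT:-}" = "true" ]; then
    echo "Speech completed"
  else
    echo "Speaking: $1"
  fi
}
```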

Saving Audio to Disk

Save TTS audio output to files instead of (or in addition to) playing them:

Environment Variables:

export MCP_TTS_OUTPUT_DIR=/path/to/audio    # Save audio files to this directory
export MCP_TTS_NO_PLAY=true                  # Skip playback, only save (optional)

Command Line Flags:

mcp-tts --output-dir /path/to/audio          # Save and play
mcp-tts --output-dir /path/to/audio --no-play  # Save only, no playback

Files are saved with unique names: tts_{timestamp}_{hash}.{ext}
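Given that naming pattern, saved clips can be picked out of the output directory with a simple glob. The helper below is a sketch with a hypothetical name (`list_tts_files`); it assumes nothing about the timestamp or hash format beyond the `tts_` prefix and the two underscores.

```shell
# Print the names of files in a directory that match tts_{timestamp}_{hash}.{ext}.
list_tts_files() {
  lt_dir="$1"
  for lt_file in "$lt_dir"/tts_*_*.*; do
    [ -e "$lt_file" ] || continue   # skip the unexpanded pattern when nothing matches
    basename "$lt_file"
  done
}
```

For example, `list_tts_files "$MCP_TTS_OUTPUT_DIR"` lists only the generated audio files and ignores anything else stored in that directory.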


Stats

Version: 0.1.37

Stars: 50

Forks: 13

Maintenance: Excellent

Last Commit: Mar 7, 2026

Added: Jan 28, 2026


Available In

mcp-tts54
