🚨 EXECUTION NOTICE FOR CLAUDE
When you invoke this command via SlashCommand, the system returns THESE INSTRUCTIONS below.
YOU are the executor. This is NOT an autonomous subprocess.
- ✅ The phases below are YOUR execution checklist
- ✅ YOU must run each phase immediately using tools (Bash, Read, Write, Edit, TodoWrite)
- ✅ Complete ALL phases before considering this command done
- ❌ DON'T wait for "the command to complete" - YOU complete it by executing the phases
- ❌ DON'T treat this as status output - it IS your instruction set
Immediately after SlashCommand returns, start executing Phase 0, then Phase 1, etc.
See @CLAUDE.md section "SlashCommand Execution - YOU Are The Executor" for detailed explanation.
Available Skills
This command has access to the following skills from the elevenlabs plugin:
- api-authentication: API authentication patterns, SDK installation scripts, environment variable management, and connection testing for ElevenLabs. Use when setting up ElevenLabs authentication, installing ElevenLabs SDK, configuring API keys, testing ElevenLabs connection, or when user mentions ElevenLabs authentication, xi-api-key, ELEVENLABS_API_KEY, or ElevenLabs setup.
- mcp-integration
- production-deployment: Production deployment patterns for ElevenLabs API including rate limiting, error handling, monitoring, and testing. Use when deploying to production, implementing rate limiting, setting up monitoring, handling errors, testing concurrency, or when user mentions production deployment, rate limits, error handling, monitoring, ElevenLabs production.
- stt-integration: ElevenLabs Speech-to-Text transcription workflows with Scribe v1 supporting 99 languages, speaker diarization, and Vercel AI SDK integration. Use when implementing audio transcription, building STT features, integrating speech-to-text, setting up Vercel AI SDK with ElevenLabs, or when user mentions transcription, STT, Scribe v1, audio-to-text, speaker diarization, or multi-language transcription.
- tts-integration
- vercel-ai-patterns
- voice-processing: Voice cloning workflows, voice library management, audio format conversion, and voice settings. Use when cloning voices, managing voice libraries, processing audio for voice creation, configuring voice settings, or when user mentions voice cloning, instant cloning, professional cloning, voice library, audio processing, voice settings, or ElevenLabs voices.
To use a skill:
!{skill skill-name}
Use skills when you need:
- Domain-specific templates and examples
- Validation scripts and automation
- Best practices and patterns
- Configuration generators
Skills provide pre-built resources to accelerate your work.
Security Requirements
CRITICAL: All generated files must follow security rules:
@docs/security/SECURITY-RULES.md
Key requirements:
- Never hardcode API keys or secrets
- Use placeholders: your_service_key_here
- Protect .env files with .gitignore
- Create .env.example with placeholders only
- Document key acquisition for users
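A minimal sketch of environment-only key loading (the helper name getElevenLabsApiKey is illustrative, not part of the SDK):

```typescript
// Hypothetical helper: read the key from the environment and fail loudly
// if it is missing, so it is never hardcoded in source files.
export function getElevenLabsApiKey(): string {
  const key = process.env.ELEVENLABS_API_KEY;
  if (!key) {
    throw new Error(
      "ELEVENLABS_API_KEY is not set. Copy .env.example to .env and add your key."
    );
  }
  return key;
}
```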
Arguments: $ARGUMENTS
Goal: Add comprehensive TTS capabilities to the project with support for multiple ElevenLabs voice models, streaming audio, voice selection, and audio playback controls.
Core Principles:
- Detect framework and adapt implementation (Next.js, React, Python, Node.js)
- Support all 4 voice models (Eleven v3, Flash v2.5, Turbo v2.5, Multilingual v2)
- Implement both standard and streaming TTS
- Create reusable components/functions
- Include voice selection interface
Phase 1: Discovery
Goal: Understand project structure and existing setup
Actions:
- Check if ElevenLabs SDK is already installed:
- TypeScript: !{bash npm list @elevenlabs/elevenlabs-js 2>/dev/null}
- Python: !{bash pip show elevenlabs 2>/dev/null}
- Detect framework (see the sketch after this phase's actions):
- Next.js: @package.json (check for "next")
- Python: @requirements.txt or @pyproject.toml
- React: @package.json (check for "react")
- Check if authentication is configured (@.env or @.env.local)
- Parse $ARGUMENTS for specific options (model preference, streaming, etc.)
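One way to implement the detection step is a plain Node/TypeScript sketch (the function name and return labels are illustrative):

```typescript
// Hypothetical detection sketch: pick a framework target from project files.
import { existsSync, readFileSync } from "node:fs";

type Framework = "nextjs" | "react" | "python" | "node";

function detectFramework(): Framework {
  if (existsSync("requirements.txt") || existsSync("pyproject.toml")) return "python";
  if (!existsSync("package.json")) return "node";
  const pkg = JSON.parse(readFileSync("package.json", "utf8"));
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  if (deps.next) return "nextjs"; // Next.js projects also depend on react, so check first
  if (deps.react) return "react";
  return "node";
}
```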
Phase 2: Requirements Gathering
Goal: Clarify TTS implementation needs
Actions:
- If $ARGUMENTS doesn't specify preferences, use AskUserQuestion to ask:
- Which voice model to prioritize? (v3 Alpha for quality, Flash v2.5 for speed, Turbo v2.5 for balance, Multilingual v2 for stability)
- Do you need streaming audio support? (real-time vs complete audio)
- Should we include voice selection UI? (dropdown/list of available voices)
- Where should TTS functionality be added? (new page, existing component, API route, etc.)
Phase 3: Planning
Goal: Design the TTS implementation approach
Actions:
- Based on detected framework, plan:
- Component structure (React components, Python functions, API routes)
- File locations following project conventions
- Voice model configuration strategy
- Audio playback implementation
- Error handling approach
- Present plan to user for confirmation
Phase 4: Implementation
Goal: Build TTS integration with specialized agent
Actions:
Launch the elevenlabs-tts-integrator agent to implement text-to-speech capabilities.
Provide the agent with a detailed prompt including:
- Context: Detected framework, existing project structure, SDK installation status
- Target: $ARGUMENTS (any specific requirements)
- Requirements:
- Create TTS function/component with support for all 4 models:
- Eleven v3 Alpha (eleven_v3) - highest quality, 70+ languages
- Eleven Flash v2.5 (eleven_flash_v2_5) - ultra-low latency ~75ms, 32 languages
- Eleven Turbo v2.5 (eleven_turbo_v2_5) - balanced speed/quality ~250ms
- Eleven Multilingual v2 (eleven_multilingual_v2) - stable, 29 languages
- Implement standard TTS (complete audio generation)
- Implement streaming TTS (real-time audio streaming) if requested
- Add voice selection interface (fetch from /v1/voices API)
- Create audio playback controls
- Include error handling and loading states
- Follow framework-specific patterns (React hooks, FastAPI routes, etc.)
- Add proper TypeScript types or Python type hints
- Use progressive documentation loading (fetch ElevenLabs TTS docs as needed)
- Expected output:
- TTS component/function created
- Voice selection UI (if requested)
- Audio playback implementation
- Example usage code (see the TypeScript sketch after this phase)
- Configuration for voice model selection
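To ground the agent prompt, here is a minimal TypeScript sketch of the core pieces. It assumes the @elevenlabs/elevenlabs-js SDK; helper names and the output format are illustrative, and method names should be verified against the installed SDK version:

```typescript
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

const client = new ElevenLabsClient({ apiKey: process.env.ELEVENLABS_API_KEY });

// The four model IDs this command targets.
export const TTS_MODELS = {
  v3: "eleven_v3",                        // highest quality, 70+ languages (alpha)
  flash: "eleven_flash_v2_5",             // ultra-low latency (~75 ms), 32 languages
  turbo: "eleven_turbo_v2_5",             // balanced speed/quality (~250 ms)
  multilingual: "eleven_multilingual_v2", // stable, 29 languages
} as const;

// Standard TTS: generate the complete audio for a piece of text.
export async function synthesize(text: string, voiceId: string, modelId: string) {
  return client.textToSpeech.convert(voiceId, {
    text,
    modelId,
    outputFormat: "mp3_44100_128",
  });
}

// Streaming TTS: emit audio chunks as they are generated.
export async function synthesizeStream(text: string, voiceId: string, modelId: string) {
  return client.textToSpeech.stream(voiceId, { text, modelId });
}

// Voice selection: list available voices via the documented /v1/voices endpoint.
export async function listVoices() {
  const res = await fetch("https://api.elevenlabs.io/v1/voices", {
    headers: { "xi-api-key": process.env.ELEVENLABS_API_KEY! },
  });
  if (!res.ok) throw new Error(`Failed to list voices: ${res.status}`);
  const { voices } = await res.json();
  return voices as Array<{ voice_id: string; name: string }>;
}

// Browser playback (sketch): buffer the byte stream into a Blob and play it.
export async function playAudio(stream: ReadableStream<Uint8Array>) {
  const chunks: BlobPart[] = [];
  const reader = stream.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  const audio = new Audio(URL.createObjectURL(new Blob(chunks, { type: "audio/mpeg" })));
  await audio.play();
}
```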
Phase 5: Verification
Goal: Ensure TTS implementation works correctly
Actions:
- Verify files were created in correct locations
- Check for TypeScript/Python errors:
- TypeScript: !{bash npx tsc --noEmit 2>/dev/null || echo "No TypeScript check available"}
- Python: !{bash python -m py_compile *.py 2>/dev/null || echo "No Python files to check"}
- Verify imports and dependencies
- Test that API key is properly referenced from environment
Phase 6: Summary
Goal: Guide user on using TTS features
Actions:
- Display implementation summary:
- Files created: [list of new files]
- Voice models available: [list of 4 models with descriptions]
- Features implemented: [standard TTS, streaming, voice selection, etc.]
- Provide usage instructions:
- How to convert text to speech
- How to select different voice models
- How to use streaming vs standard mode
- How to customize voice settings (stability, similarity boost, style) - see the sketch at the end of this command
- Show code example for detected framework
- Suggest next steps:
- Test with different voice models
- Explore voice cloning: /elevenlabs:add-voice-management
- Add Vercel AI SDK integration: /elevenlabs:add-vercel-ai-sdk
- Configure production features: /elevenlabs:add-production
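For the voice-settings item above, a hedged usage sketch: property names follow the REST voice_settings schema (stability, similarity_boost, style), and the camelCase shown assumes the current JS SDK - verify against the installed version. The voice ID is a placeholder:

```typescript
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

const client = new ElevenLabsClient({ apiKey: process.env.ELEVENLABS_API_KEY });

// "your_voice_id_here" is a placeholder - pick a real ID from the /v1/voices listing.
const audio = await client.textToSpeech.convert("your_voice_id_here", {
  text: "Hello from ElevenLabs!",
  modelId: "eleven_multilingual_v2",
  voiceSettings: {
    stability: 0.5,        // higher = more consistent, lower = more expressive
    similarityBoost: 0.75, // the dashboard's "Clarity + Similarity Enhancement"
    style: 0.0,            // style exaggeration (v2-family models)
  },
});
```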