Turn any website into an audio-enabled experience. Covers TTS reading mode (SpeechSynthesis API), pre-recorded MP3 audio player, and Voice CRO trigger system. Zero dependencies, works on any static or dynamic site. Use when adding read-aloud, audio player, or voice-based conversion features.
Philosophy: Reading is passive. Listening is intimate. Voice builds trust faster than any headline.
Core Principle: Zero dependencies. Progressive enhancement. Respect the user's device and preferences.
| File | Status | When to Read |
|---|---|---|
| tts-engine.md | 🔴 REQUIRED | Adding TTS / read-aloud to any page |
| audio-player.md | ⚪ Optional | Pre-recorded MP3 playback |
| voice-cro.md | ⚪ Optional | Trigger-based voice sales / CRO |
| ui-patterns.md | ⚪ Optional | Player bar & bottom sheet design |
🔴 tts-engine.md = ALWAYS READ when implementing TTS. Others = only if relevant.
"I need audio on my website"
│
├─ Read article content aloud (text-to-speech)
│  └─ Use: TTS Engine → tts-engine.md
│     ├─ Blog / article pages → Content Reader pattern
│     ├─ Documentation → Section Reader pattern
│     └─ E-commerce → Product Description Reader pattern
│
├─ Play pre-recorded audio files (MP3/WAV)
│  └─ Use: Audio Player → audio-player.md
│     ├─ Podcasts / interviews → Playlist pattern
│     ├─ Sales pitch / welcome → Triggered playback
│     └─ Background ambient → Loop pattern
│
├─ Voice-based conversion optimization (CRO)
│  └─ Use: Voice CRO → voice-cro.md
│     ├─ Landing pages → Trigger-based bottom sheet
│     ├─ Service pages → Per-page audio scripts
│     └─ Course pages → Social proof audio
│
└─ Combination (TTS + CRO)
   ├─ Read tts-engine.md + voice-cro.md
   └─ Ensure no conflict (TTS reader vs CRO player)
| Engine | API | Source | Best For |
|---|---|---|---|
| TTS Reader | SpeechSynthesis | Page text content | Blogs, articles, docs |
| Audio Player | HTMLAudioElement | Pre-recorded MP3 | Sales, podcasts, guides |
| Voice CRO | Audio + triggers | MP3 + behavior detection | Landing pages, sales |
Feature detection → Graceful degradation → Never break the page
if (!('speechSynthesis' in window)) return; // TTS
if (!window.Audio) return; // Audio
Rule: Audio features are ENHANCEMENTS. The page must function 100% without them.
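The two checks above can be wrapped in one helper so the rest of the script branches on a single result. This is a minimal sketch: `detectAudioSupport` and `initAudioFeatures` are hypothetical names, not part of the skill's files.

```javascript
// Hypothetical helper: report which audio features a window object supports.
// Takes the window as a parameter so it can also be exercised outside a browser.
function detectAudioSupport(win) {
  return {
    tts: Boolean(win && 'speechSynthesis' in win && 'SpeechSynthesisUtterance' in win),
    audio: Boolean(win && typeof win.Audio === 'function'),
  };
}

// Usage inside an IIFE: bail out quietly when a feature is missing.
(function initAudioFeatures() {
  if (typeof window === 'undefined') return; // not running in a browser
  const support = detectAudioSupport(window);
  if (!support.tts) return; // the page keeps working, just without read-aloud
  // ...inject the read-aloud trigger button here...
})();
```

Returning early instead of throwing keeps the enhancement invisible on unsupported browsers.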
Clone → Strip → Clean → Split → Speak
DON'T read the raw DOM.
DO clone, remove noise, extract clean text.
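A rough sketch of the Clone → Strip → Clean steps follows. The selector list is illustrative only (an assumption, not the skill's actual strip list), and `extractReadableText` and `cleanText` are hypothetical names.

```javascript
// Illustrative selectors only; adjust to the page's real non-content elements.
const ASSUMED_STRIP_SELECTORS = ['script', 'style', 'nav', 'footer', 'button', 'form'];

// Clean: normalize whitespace so the speech engine does not pause oddly.
function cleanText(raw) {
  return raw
    .replace(/[ \t\u00a0]+/g, ' ') // collapse runs of spaces and tabs
    .replace(/\s*\n\s*/g, '\n')    // trim around line breaks, drop blank lines
    .trim();
}

function extractReadableText(root, stripSelectors = ASSUMED_STRIP_SELECTORS) {
  const clone = root.cloneNode(true); // Clone: never mutate the live DOM
  clone
    .querySelectorAll(stripSelectors.join(',')) // Strip: find non-content nodes
    .forEach((node) => node.remove());
  return cleanText(clone.textContent || '');
}
```

In a browser this might be called as `extractReadableText(document.querySelector('article'))`, where the `article` selector is again an assumption about the page.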
Strip list (always remove before speaking):
Browsers limit how long a single utterance can be (roughly 3000-5000 characters, depending on browser and OS), so long text must be split into chunks.
Split Strategy:
├─ Split on sentence boundaries (. ! ? \n)
├─ Max chunk: 2500 chars (safe across all browsers)
├─ Preserve sentence integrity (never split mid-sentence)
└─ Chain chunks via onend callback
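The split strategy above could be sketched as below. `splitIntoChunks` is a hypothetical name, and the hard-split fallback for a single sentence longer than the limit is an assumption added here, since a sentence that large cannot be kept whole.

```javascript
// Split text into chunks of at most maxLen characters, breaking only at
// sentence boundaries (. ! ? or newline) whenever possible.
function splitIntoChunks(text, maxLen = 2500) {
  // Each piece is one sentence plus its trailing punctuation and whitespace.
  const pieces = text.match(/[^.!?\n]+(?:[.!?]+|\n|$)\s*/g) || [];
  const chunks = [];
  let current = '';

  const flush = () => {
    const trimmed = current.trim();
    if (trimmed) chunks.push(trimmed);
    current = '';
  };

  for (const piece of pieces) {
    if (piece.length > maxLen) {
      // Fallback (assumption): one sentence over the limit is hard-split.
      flush();
      for (let i = 0; i < piece.length; i += maxLen) {
        current = piece.slice(i, i + maxLen);
        flush();
      }
      continue;
    }
    // Start a new chunk rather than cutting this sentence in half.
    if (current.length + piece.length > maxLen) flush();
    current += piece;
  }
  flush();
  return chunks;
}
```

For example, `splitIntoChunks('One. Two! Three?', 10)` returns `['One. Two!', 'Three?']`.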
Voice selection priority for the content language:
1. Local service voice (faster, works offline)
2. Network voice (higher quality, needs internet)
3. Any voice matching language prefix
4. null (browser default)
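One way to read that priority order is sketched below. `pickVoice` is a hypothetical helper; it relies only on the `lang` and `localService` properties of `SpeechSynthesisVoice`, and treats underscore language tags such as `vi_VN` as equivalent to `vi-VN`, which is an assumption about what some platforms report.

```javascript
// Pick the best voice for a language tag such as 'vi-VN', or null for default.
function pickVoice(voices, lang) {
  const norm = (s) => String(s).replace('_', '-').toLowerCase();
  const target = norm(lang);
  const prefix = target.split('-')[0];
  const exact = voices.filter((v) => norm(v.lang) === target);
  return (
    exact.find((v) => v.localService) ||                   // 1. local voice
    exact.find((v) => !v.localService) ||                  // 2. network voice
    voices.find((v) => norm(v.lang).startsWith(prefix)) || // 3. prefix match
    null                                                   // 4. browser default
  );
}
```

In a browser this could be used as `utterance.voice = pickVoice(speechSynthesis.getVoices(), 'vi-VN')`. Assigning `null` leaves the browser's default voice in place.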
⚠️ CRITICAL: Chrome silently stops SpeechSynthesis after ~15 seconds of continuous speech. This is the #1 gotcha.
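// Workaround: pause/resume every 10 s so Chrome does not cut the speech off
const synth = window.speechSynthesis;
const keepAlive = setInterval(() => {
  if (synth.speaking && !synth.paused) {
    synth.pause();
    synth.resume();
  }
}, 10000);
// Call clearInterval(keepAlive) once reading stops.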
⚠️ GOTCHA: Calling `synth.cancel()` fires the `onerror` event on any active utterance with error type `'canceled'` or `'interrupted'`.
Solution: Use a guard flag or check error type:
u.onerror = function(e) {
if (e.error === 'canceled' || e.error === 'interrupted') return;
stopReading();
};
┌──────────────────────────────┐
│ IIFE                         │
│                              │
│  ┌─ Feature Detection ────┐  │
│  │ speechSynthesis?       │  │
│  └───────────┬────────────┘  │
│              ▼               │
│  ┌─ Content Extraction ───┐  │
│  │ Clone → Strip → Clean  │  │
│  └───────────┬────────────┘  │
│              ▼               │
│  ┌─ Chunking Engine ──────┐  │
│  │ Split on sentences     │  │
│  │ Max 2500 chars         │  │
│  └───────────┬────────────┘  │
│              ▼               │
│  ┌─ Utterance Builder ────┐  │
│  │ Set voice/rate/pitch   │  │
│  │ Chain via onend        │  │
│  └───────────┬────────────┘  │
│              ▼               │
│  ┌─ Player UI ────────────┐  │
│  │ Bar: play/pause/stop   │  │
│  │ Progress indicator     │  │
│  │ Trigger button         │  │
│  └───────────┬────────────┘  │
│              ▼               │
│  ┌─ Keep-Alive Timer ─────┐  │
│  │ pause/resume @ 10s     │  │
│  └────────────────────────┘  │
└──────────────────────────────┘
Init → Detect → Inject Trigger Button
                    │
              User clicks ▶
                    │
Extract Text → Chunk → Build Utterances
                    │
          synth.speak(chunk[0])
                    │
chunk[0].onend → speak(chunk[1]) → ... → speak(chunk[N])
        │                                     │
Keep-Alive Timer running               chunk[N].onend
        │                                     │
User clicks ⏸ → synth.pause()           stopReading()
User clicks ▶ → synth.resume()          cleanup UI
User clicks ⏹ → synth.cancel()
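The playback flow above might look roughly like this in code. `speakChunks` is a hypothetical function. As an assumption made to keep the sketch testable, the speech engine and the utterance constructor are passed in as `synth` and `Utterance`; in a browser these would be `window.speechSynthesis` and `SpeechSynthesisUtterance`.

```javascript
// Speak chunks one after another, chaining via onend, with the keep-alive
// timer and the cancel/interrupted error guard. Returns playback controls.
function speakChunks(chunks, { synth, Utterance, lang = 'en-US', rate = 1, onDone = () => {} }) {
  let index = 0;
  let stopped = false;
  let finished = false;

  // Keep-alive: nudge the engine every 10 s while it is speaking.
  const keepAlive = setInterval(() => {
    if (synth.speaking && !synth.paused) {
      synth.pause();
      synth.resume();
    }
  }, 10000);

  function finish() {
    if (finished) return; // cancel() may trigger a late onend; run cleanup once
    finished = true;
    clearInterval(keepAlive);
    onDone();
  }

  function stop() {
    if (stopped) return;
    stopped = true;
    synth.cancel();
    finish();
  }

  function speakNext() {
    if (stopped || index >= chunks.length) return finish();
    const u = new Utterance(chunks[index++]);
    u.lang = lang;
    u.rate = rate;
    u.onend = speakNext; // chain to the next chunk
    u.onerror = (e) => {
      if (e.error === 'canceled' || e.error === 'interrupted') return;
      stop(); // a real failure: stop and clean up
    };
    synth.speak(u);
  }

  speakNext();
  return { pause: () => synth.pause(), resume: () => synth.resume(), stop };
}
```

A click handler on the trigger button could call `speakChunks(chunks, { synth: speechSynthesis, Utterance: SpeechSynthesisUtterance, lang: 'en-US', onDone: hidePlayerBar })`, where `hidePlayerBar` stands in for whatever UI cleanup the page needs.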
- Feature detection (`speechSynthesis` in window)
- `onerror` guard (handle cancel/interrupted)
- `beforeunload` cleanup
- `prefers-reduced-motion` respect
- Preload `none` (load on demand)
- Progress from `currentTime`/`duration`

| Pitfall | Symptom | Fix |
|---|---|---|
| Chrome stops after 15s | Audio cuts mid-sentence | Keep-alive timer (pause/resume) |
| `synth.cancel()` fires `onerror` | Settings sheet closes immediately | Guard flag or check error type |
| Voices not loaded | No voice available | Listen for voiceschanged event |
| Chunk too large | Utterance fails silently | Max 2500 chars per chunk |
| Reading CTA text | TTS reads "Book Now" button text | Strip non-content elements |
| Autoplay blocked | Audio won't start on mobile | Require user interaction first |
| Multiple audio conflicts | TTS + CRO play simultaneously | Mutual exclusion check |
| No cleanup on nav | Audio keeps playing | beforeunload → synth.cancel() |
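For the "Multiple audio conflicts" row, one possible shape for a mutual exclusion check is a tiny shared arbiter that each engine asks before it starts playing. `createAudioArbiter` and its method names are hypothetical, not taken from the skill's example files.

```javascript
// At most one engine ('tts', 'cro', ...) may play at a time. Claiming the
// slot stops whichever other engine currently holds it.
function createAudioArbiter() {
  let active = null; // { name, stop } of the engine currently playing

  return {
    claim(name, stop) {
      if (active && active.name !== name) active.stop();
      active = { name, stop };
    },
    release(name) {
      // Only the current holder can release the slot.
      if (active && active.name === name) active = null;
    },
    current() {
      return active ? active.name : null;
    },
  };
}
```

The TTS reader might call `arbiter.claim('tts', () => speechSynthesis.cancel())` before speaking, and the CRO player `arbiter.claim('cro', () => audio.pause())` before playing, so starting either one silences the other.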
Voice selection by language:
├─ Vietnamese: v.lang === 'vi-VN' || v.lang.startsWith('vi')
├─ English: v.lang === 'en-US' || v.lang.startsWith('en')
├─ Japanese: v.lang === 'ja-JP' || v.lang.startsWith('ja')
├─ Korean: v.lang === 'ko-KR' || v.lang.startsWith('ko')
└─ Any: Pass language code as config parameter
Set utterance.lang to match the content language for correct pronunciation.
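Language matching only works once the voice list has loaded, and `getVoices()` can return an empty array on the first call (the "Voices not loaded" pitfall). A minimal sketch of waiting for the `voiceschanged` event follows. `whenVoicesReady` is a hypothetical helper, and it assumes the engine supports `addEventListener`, which may not hold in every browser.

```javascript
// Run callback(voices) as soon as the engine has a non-empty voice list.
function whenVoicesReady(synth, callback) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    callback(voices); // already loaded
    return;
  }
  const onChange = () => {
    synth.removeEventListener('voiceschanged', onChange); // fire only once
    callback(synth.getVoices());
  };
  synth.addEventListener('voiceschanged', onChange);
}
```

Typical use in a browser: `whenVoicesReady(speechSynthesis, (voices) => { /* pick a voice whose lang matches the content */ })`.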
| File | Content |
|---|---|
| tts-engine.md | Complete SpeechSynthesis API reference, chunking strategies, voice selection |
| audio-player.md | HTMLAudioElement patterns, preload strategies, error handling |
| voice-cro.md | Trigger system, bottom sheet patterns, CRO analytics |
| ui-patterns.md | Player bar CSS, bottom sheet CSS, animations, responsive design |
| File | Description |
|---|---|
| examples/blog-reader.js | Complete TTS reader (Substack-style, 350 LOC) |
| examples/voice-cro.js | Complete Voice CRO trigger system (390 LOC) |
Remember: Voice is the most personal interface. A well-placed audio feature can increase engagement 3-5x. But unwanted audio is the fastest way to lose a user. Always require user initiation. Never autoplay.