From antigravity-awesome-skills
Builds a low-latency, Iron Man-inspired F.R.I.D.A.Y. tactical voice assistant using Pipecat, a Gemini LLM, and OpenAI STT/TTS on local hardware. Intended for real-time conversational voice agents.
npx claudepluginhub sickn33/antigravity-awesome-skills

This skill uses the workspace's default tool permissions.
This skill provides a blueprint for building **F.R.I.D.A.Y.** (Female Replacement Intelligent Digital Assistant Youth), a local voice assistant inspired by the tactical AI from the Iron Man films. It uses the **Pipecat** framework to orchestrate a low-latency pipeline:
- STT: OpenAI (whisper-1) or gpt-4o-transcribe
- LLM: Google Gemini
- TTS: OpenAI (nova voice)

You will need the Pipecat framework and its service providers installed:
pip install "pipecat-ai[openai,google,silero]" python-dotenv
Create a .env file with your API keys:
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
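Before starting the agent, it can help to confirm that both keys are actually visible to Python. The following is a minimal sketch and not part of the provided script: the helper names missing_keys and check_environment are hypothetical, and it assumes python-dotenv (installed in the previous step) is what loads the .env file.

```python
import os

try:
    # python-dotenv was installed in the previous step; the import is guarded
    # so the check can still run against the plain process environment.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None

# The two keys named in the .env file above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in required if not env.get(key, "").strip()]


def check_environment():
    """Load .env if python-dotenv is available, then report missing keys."""
    if load_dotenv is not None:
        load_dotenv()  # looks for a .env file and loads it into os.environ
    return missing_keys(os.environ)
```

Calling check_environment() returns an empty list when both keys are set, and otherwise the names of the keys that still need a value.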
Execute the provided Python script to start the interface:
python scripts/friday_agent.py
The agent follows a linear pipeline: Mic -> VAD -> STT -> LLM -> TTS -> Speaker. This allows for granular control over each stage, unlike end-to-end speech-to-speech models.
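The linear shape of the pipeline can be illustrated without any audio hardware. The sketch below is plain Python, not Pipecat's API: run_pipeline and the four stand-in stages are hypothetical, and each stage only imitates what the real service would do. It shows the one property the design relies on, namely that every stage is a separate, replaceable step.

```python
from typing import Callable, Iterable, Optional

# A stage takes one item and returns the transformed item,
# or None to drop it (for example, VAD discarding silence).
Stage = Callable[[object], Optional[object]]


def run_pipeline(stages: Iterable[Stage], item: object) -> Optional[object]:
    """Pass `item` through each stage in order; stop early if a stage drops it."""
    for stage in stages:
        item = stage(item)
        if item is None:
            return None
    return item


# Stand-in stages (assumptions for illustration only).
def vad(frame):
    """Keep only frames flagged as speech."""
    return frame if frame["is_speech"] else None


def stt(frame):
    """Pretend transcription: the 'text' field stands in for recognized speech."""
    return frame["text"]


def llm(transcript):
    """Pretend language model: produce a canned reply."""
    return f"Reply to: {transcript}"


def tts(reply):
    """Pretend synthesis: UTF-8 bytes stand in for audio samples."""
    return reply.encode("utf-8")


STAGES = [vad, stt, llm, tts]
```

With these stubs, run_pipeline(STAGES, {"is_speech": True, "text": "status report"}) yields bytes for the speaker stage, while a frame with is_speech set to False is dropped at the VAD step. Swapping one entry in STAGES changes a single stage without touching the others, which is the kind of control an end-to-end speech-to-speech model does not expose.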
Because Google's Gemini API uses a different message format from the OpenAI-style one that Pipecat aggregators expect, the script includes GoogleSafeContext and GoogleSafeMessage classes to bridge the gap.
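To make the format difference concrete, here is a rough sketch of the kind of translation such a shim performs. It is not the script's GoogleSafeContext implementation: the function to_gemini is hypothetical, and it assumes every message carries plain-text content. OpenAI-style messages use the roles system, user, and assistant with a content string, whereas Gemini expects the roles user and model with a list of parts, and takes the system instruction separately.

```python
def to_gemini(messages):
    """Convert OpenAI-style chat messages to Gemini-style contents.

    Returns a tuple (system_instruction, contents). Assumes each message is a
    dict with a 'role' and a plain-string 'content'.
    """
    system_parts = []
    contents = []
    for message in messages:
        role = message["role"]
        text = message["content"]
        if role == "system":
            # Gemini takes system text outside the conversation turns.
            system_parts.append(text)
            continue
        # OpenAI's 'assistant' role corresponds to Gemini's 'model' role.
        gemini_role = "model" if role == "assistant" else "user"
        contents.append({"role": gemini_role, "parts": [{"text": text}]})
    system_instruction = "\n".join(system_parts) or None
    return system_instruction, contents
```

For example, a system message followed by one user turn and one assistant turn comes back as a single system instruction string plus two content entries with the roles user and model.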
Troubleshooting:

- Check that audio_out_sample_rate matches the sample rate of the generated audio, to avoid high-pitched or slowed audio.
- Check the OUTPUT_DEVICE index. Run a script like test_audio_output.py to find the correct hardware index for your OS.
- Check that the GoogleSafeContext shim is correctly translating OpenAI-style dicts to the Gemini-style schema.

Related skills:

- @voice-agents - General principles of voice AI.
- @agent-tool-builder - Add tools (Search, Lights, etc.) to your Friday agent.
- @llm-architect - Optimizing the LLM layer.