From antigravity-awesome-skills
Build a low-latency, Iron Man-inspired tactical voice assistant (F.R.I.D.A.Y.) using Pipecat, Gemini, and OpenAI.
npx claudepluginhub absjaded/antigravity-awesome-skillsThis skill uses the workspace's default tool permissions.
This skill provides a blueprint for building **F.R.I.D.A.Y.** (Replacement Integrated Digital Assistant Youth), a local voice assistant inspired by the tactical AI from the Iron Man films. It uses the **Pipecat** framework to orchestrate a low-latency pipeline:
Verifies tests pass on completed feature branch, presents options to merge locally, create GitHub PR, keep as-is or discard; executes choice and cleans up worktree.
Guides root cause investigation for bugs, test failures, unexpected behavior, performance issues, and build failures before proposing fixes.
Writes implementation plans from specs for multi-step tasks, mapping files and breaking into TDD bite-sized steps before coding.
This skill provides a blueprint for building F.R.I.D.A.Y. (Replacement Integrated Digital Assistant Youth), a local voice assistant inspired by the tactical AI from the Iron Man films. It uses the Pipecat framework to orchestrate a low-latency pipeline:
whisper-1) or gpt-4o-transcribenova voice)You will need the Pipecat framework and its service providers installed:
pip install pipecat-ai[openai,google,silero] python-dotenv
Create a .env file with your API keys:
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
Execute the provided Python script to start the interface:
python scripts/friday_agent.py
The agent follows a linear pipeline: Mic -> VAD -> STT -> LLM -> TTS -> Speaker. This allows for granular control over each stage, unlike end-to-end speech-to-speech models.
Since Google's Gemini API has a different message format than OpenAI's standard (which Pipecat aggregators expect), the script includes a GoogleSafeContext and GoogleSafeMessage class to bridge the gap.
audio_out_sample_rate matches to avoid high-pitched or slowed audio.OUTPUT_DEVICE index. Run a script like test_audio_output.py to find the correct hardware index for your OS.GoogleSafeContext shim is correctly translating OpenAI-style dicts to Gemini-style schema.@voice-agents - General principles of voice AI.@agent-tool-builder - Add tools (Search, Lights, etc.) to your Friday agent.@llm-architect - Optimizing the LLM layer.