Skill

pipecat-friday-agent

Builds low-latency Iron Man-inspired F.R.I.D.A.Y. tactical voice assistant using Pipecat, Gemini LLM, OpenAI STT/TTS on local hardware. For real-time conversational voice agents.

Python

OpenAI

ai-ml

npx claudepluginhub sickn33/antigravity-awesome-skills

Tool Access

This skill uses the workspace's default tool permissions.

Supporting Assets

scripts/friday_agent.py

SKILL.md

Stats

Stars: 36383

Forks: 5961

Last Commit: Apr 13, 2026

Used By: 3 plugins

Pipecat Friday Agent

Overview

This skill provides a blueprint for building F.R.I.D.A.Y. (Female Replacement Intelligent Digital Assistant Youth), a voice assistant for local audio hardware inspired by the tactical AI from the Iron Man films. It uses the Pipecat framework to orchestrate a low-latency pipeline:

STT: OpenAI Whisper (whisper-1) or gpt-4o-transcribe
LLM: Google Gemini 2.5 Flash (via a compatibility shim)
TTS: OpenAI TTS (nova voice)
Transport: Local Audio (Hardware Mic/Speakers)

When to Use This Skill

Use when you want to build a real-time, conversational voice agent.
Use when working with the Pipecat framework for pipeline-based AI.
Use when you need to integrate multiple providers (Google and OpenAI) into a single voice loop.
Use when building Iron Man-themed or tactical-themed voice applications.

How It Works

Step 1: Install Dependencies

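You will need the Pipecat framework and its service providers installed. Quote the package spec so that shells such as zsh do not try to expand the square brackets:

pip install "pipecat-ai[openai,google,silero]" python-dotenv

Depending on your Pipecat version, the local audio transport may also need PyAudio installed.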

Step 2: Configure Environment

Create a .env file with your API keys:

OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
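
A missing key usually only surfaces once the first STT or LLM request fails, so it can help to check for both keys at startup. The following is a minimal sketch using only the standard library; `missing_keys` and `require_keys` are hypothetical helpers, not part of scripts/friday_agent.py, and the key names are taken from the .env example above.

```python
import os

# Key names taken from the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env):
    """Return the required key names that are absent or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def require_keys(env=None):
    """Raise a readable error before any audio or network setup starts."""
    env = os.environ if env is None else env
    missing = missing_keys(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

If the script loads the .env file with python-dotenv, calling `load_dotenv()` before `require_keys()` would make the file's values visible through `os.environ`.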

Step 3: Run the Agent

Execute the provided Python script to start the interface:

python scripts/friday_agent.py

Core Concepts

Pipeline Architecture

The agent follows a linear pipeline: Mic -> VAD -> STT -> LLM -> TTS -> Speaker. This allows for granular control over each stage, unlike end-to-end speech-to-speech models.
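
To make the linear flow concrete, here is a toy illustration in plain Python. It does not use Pipecat's API: every stage is a stand-in function, and names such as `run_pipeline` and `STAGES` are hypothetical. It only shows the idea that each stage consumes the previous stage's output and that any stage (here the VAD) can drop a frame before it reaches the LLM.

```python
from typing import Callable, Iterable, List

Stage = Callable[[object], object]


def run_pipeline(stages: Iterable[Stage], frame: object) -> object:
    """Pass one frame through each stage in order; stop if a stage returns None."""
    for stage in stages:
        frame = stage(frame)
        if frame is None:  # e.g. the VAD decided this chunk was silence
            return None
    return frame


# Stand-in stages: plain functions, not real Pipecat processors.
def vad(chunk: dict):
    return chunk if chunk["energy"] > 0.1 else None


def stt(chunk: dict) -> str:
    return chunk["spoken_text"]  # a real STT service would transcribe audio


def llm(text: str) -> str:
    return f"Acknowledged: {text}"  # a real LLM call would go here


def tts(text: str) -> dict:
    return {"audio_for": text, "sample_rate": 24000}


STAGES: List[Stage] = [vad, stt, llm, tts]
```

Because each stage is a separate unit, one provider can be swapped or instrumented without touching the others, which is the control the pipeline design is meant to offer.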

Google Compatibility Shim

Because Google's Gemini API uses a different message format from the OpenAI-style messages that Pipecat's aggregators expect, the script includes GoogleSafeContext and GoogleSafeMessage classes to bridge the gap.
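
The kind of translation such a shim performs can be sketched as follows. This is not the actual GoogleSafeContext implementation; `to_gemini_contents` is a hypothetical helper that assumes plain-text messages only. It relies on two properties of the Gemini format: conversation turns use the roles "user" and "model" with text wrapped in a "parts" list, and the system prompt is supplied separately rather than as a message.

```python
def to_gemini_contents(messages):
    """Split OpenAI-style chat messages into (system_instruction, contents).

    Assumes each message is {"role": ..., "content": <str>}; tool calls and
    multimodal parts are out of scope for this sketch.
    """
    system_parts = []
    contents = []
    for message in messages:
        role, text = message["role"], message["content"]
        if role == "system":
            # Gemini takes the system prompt outside the turn list.
            system_parts.append(text)
            continue
        # OpenAI's "assistant" role corresponds to Gemini's "model" role.
        gemini_role = "model" if role == "assistant" else "user"
        contents.append({"role": gemini_role, "parts": [{"text": text}]})
    return "\n".join(system_parts), contents
```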

Best Practices

✅ Use Silero VAD: It is robust for local hardware and prevents background noise from triggering the LLM.
✅ Concise Prompts: Tactical agents should give short, data-dense responses to minimize latency.
✅ Sample Rate Match: OpenAI TTS outputs 24 kHz audio; set audio_out_sample_rate to 24000 so playback is neither sped up (high-pitched) nor slowed down.
❌ No Polite Fillers: Avoid "Hello, how can I help you today?" Instead, use "Systems nominal. Ready for commands."
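
The sample-rate point above comes down to simple arithmetic, sketched here with two hypothetical helper functions. When samples produced at one rate are played back at another, both speed and pitch scale by the ratio of the two rates.

```python
import math


def playback_speed(source_rate_hz, device_rate_hz):
    """Speed factor heard when audio produced at source_rate plays at device_rate."""
    return device_rate_hz / source_rate_hz


def pitch_shift_semitones(source_rate_hz, device_rate_hz):
    """Pitch change in semitones caused by the same mismatch (12 per octave)."""
    return 12 * math.log2(device_rate_hz / source_rate_hz)
```

For example, 24 kHz TTS audio sent to an output configured for 48 kHz plays at twice the speed and one octave higher, while the same audio on a 16 kHz output plays at two thirds of the speed and sounds lower.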

Troubleshooting

Problem: Audio is choppy or delayed.
- Solution: Check your OUTPUT_DEVICE index. Run a script like test_audio_output.py to find the correct hardware index for your OS.
Problem: "Validation error" for message format.
- Solution: Ensure the GoogleSafeContext shim is correctly translating OpenAI-style dicts to Gemini-style schema.
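
For the first problem above, a device-listing helper along these lines could stand in for a script like test_audio_output.py. It is a sketch that assumes the local transport sits on top of PyAudio; `output_devices` and `probe_pyaudio_devices` are hypothetical names, and the second function only works where the pyaudio package and an audio backend are available.

```python
def output_devices(device_infos):
    """Keep only entries that can play audio, as (index, name) pairs."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxOutputChannels", 0) > 0
    ]


def probe_pyaudio_devices():
    """Query PyAudio for every device it can see (needs the pyaudio package)."""
    import pyaudio  # imported lazily so the filter above works without it

    audio = pyaudio.PyAudio()
    try:
        return [
            audio.get_device_info_by_index(i)
            for i in range(audio.get_device_count())
        ]
    finally:
        audio.terminate()  # release the audio backend even if a query fails
```

Printing each pair from `output_devices(probe_pyaudio_devices())` gives the candidate indices to try for OUTPUT_DEVICE.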

Related Skills

@voice-agents - General principles of voice AI.
@agent-tool-builder - Add tools (Search, Lights, etc.) to your Friday agent.
@llm-architect - Optimizing the LLM layer.

Limitations

Use this skill only when the task clearly matches the scope described above.
Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.