From podcast-creator
Generate episode show notes (JSON) — timecoded transcript, title, duration, two-sentence summary, and date — by sending the final mixed audio and the transcript to Gemini.
How this skill is triggered — by the user, by Claude, or both
Slash command
/podcast-creator:metadata-generation
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Generate a structured JSON file of show notes for a podcast episode. The final mixed audio file and the original transcript are sent back to Gemini, which aligns the two and returns a timecoded transcript plus episode metadata.
This is a pipeline producer skill: the orchestrator (podcast-studio) invokes
it after the audio mix, passing the run directory via --workspace and the
GEMINI_API_KEY in the subprocess environment. It is brand-neutral — episode
title, language, and tone are decided upstream by the active show profile and
captured in the transcript and audio it receives; this skill only describes
what is already there.
generate_metadata.py imports google-genai + pydub. These live in the
orchestrator's uv-managed venv (provisioned at first run, R70) — they are
not pip installed into the system Python (PEP 668 refuses that on modern
macOS/Debian). The orchestrator creates the venv with
uv venv "${XDG_DATA_HOME:-$HOME/.local/share}/podcast-creator/venv" and installs
the deps with uv pip install --python "<venv>/bin/python" pydub pyyaml "google-genai>=2.0.1" (see podcast-studio SKILL.md First run + Step 0).
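The venv location and the way the key reaches the script can be sketched as below. This is a minimal illustration, not the orchestrator's actual code: `venv_python` and `build_invocation` are hypothetical helper names, and the sketch assumes a POSIX layout where the interpreter sits at `<venv>/bin/python`.

```python
import os
from pathlib import Path


def venv_python(environ):
    """Return the interpreter path inside the orchestrator's venv.

    Mirrors ${XDG_DATA_HOME:-$HOME/.local/share}/podcast-creator/venv/bin/python.
    """
    # The shell ":-" form falls back when the variable is unset OR empty,
    # so an empty XDG_DATA_HOME is treated the same as a missing one.
    data_home = environ.get("XDG_DATA_HOME") or os.path.join(
        environ["HOME"], ".local", "share"
    )
    return Path(data_home) / "podcast-creator" / "venv" / "bin" / "python"


def build_invocation(script, workspace, api_key, environ):
    """Build (argv, env) for running a script through the venv interpreter."""
    argv = [str(venv_python(environ)), str(script), "--workspace", str(workspace)]
    env = dict(environ)  # copy, so the caller's environment is not mutated
    env["GEMINI_API_KEY"] = api_key  # handed over via the subprocess environment
    return argv, env
```

A caller would then run something like `subprocess.run(argv, env=env, check=True)`; keeping the key in `env` rather than in `argv` keeps it out of the process's command line.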
Interpreter (R70). Run this script through the venv interpreter the orchestrator resolved in Step 0
("$PODCAST_PY" <script>), not bare python3; the python3 … in the examples is shorthand. (pydub also needs ffmpeg on PATH, a separate system binary.)
Read from the run directory passed as --workspace:
- {workspace}/data/script.md
- {workspace}/audio/final/episode.mp3 (falls back to episode.wav if the mp3 is absent).
1. pydub reads the audio length to produce an MM:SS duration hint for the model.
2. Calls Gemini (gemini-3-flash-preview) via the generate_content API with the uploaded audio plus the transcript text.
3. Writes the result to {workspace}/data/show_notes.json.
python3 "${CLAUDE_SKILL_DIR}/scripts/generate_metadata.py" --workspace <run-dir>
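The audio fallback and the duration hint can be sketched with the standard library alone. The helper names `resolve_audio` and `format_mmss` are hypothetical, and truncating (rather than rounding) to whole seconds is an assumption; in pydub, `len(AudioSegment.from_file(path))` returns the length in milliseconds, which is the value `format_mmss` expects.

```python
from pathlib import Path


def resolve_audio(workspace):
    """Return episode.mp3 if present, else episode.wav, else None."""
    final_dir = Path(workspace) / "audio" / "final"
    for name in ("episode.mp3", "episode.wav"):  # mp3 is preferred
        candidate = final_dir / name
        if candidate.is_file():
            return candidate
    return None


def format_mmss(duration_ms):
    """Format a length in milliseconds as MM:SS (minutes may exceed 59)."""
    total_seconds = int(duration_ms // 1000)  # assumption: truncate, do not round
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}"
```

For example, `format_mmss(754_000)` gives `"12:34"`.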
| Argument | Default | Description |
|---|---|---|
| --workspace | workspace | Run directory holding data/ and audio/. The orchestrator passes the active run dir (default ./podcast-output/<slug>/). |
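One way the single argument in this table might be declared is with `argparse`. This is a sketch only: `build_parser` and the description string are hypothetical, and the real script may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser exposing the one documented flag, --workspace."""
    parser = argparse.ArgumentParser(description="Generate show notes JSON.")
    parser.add_argument(
        "--workspace",
        default="workspace",  # documented default when the flag is omitted
        help="Run directory holding data/ and audio/.",
    )
    return parser
```

With no flag, `build_parser().parse_args([]).workspace` is `"workspace"`.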
A JSON file at {workspace}/data/show_notes.json with this structure:
{
  "show_title": "...",
  "show_duration": "...",
  "two_sentence_summary": "...",
  "date_of_generation": "YYYY-MM-DD",
  "timecoded_transcript": [
    {
      "timecode": "MM:SS",
      "speaker": "...",
      "text": "..."
    }
  ]
}
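Because the content comes from a model, a consumer may want to check the file against this structure before relying on it. The validator below is a hypothetical sketch, not part of the skill; it assumes every scalar field is a string and that timecodes use at least two minute digits.

```python
import datetime
import re

STRING_FIELDS = (
    "show_title",
    "show_duration",
    "two_sentence_summary",
    "date_of_generation",
)
ENTRY_FIELDS = ("timecode", "speaker", "text")
TIMECODE_RE = re.compile(r"\d{2,}:[0-5]\d")  # MM:SS, minutes may exceed 59
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")  # YYYY-MM-DD shape


def _is_iso_date(value):
    """True if value has the YYYY-MM-DD shape and names a real calendar date."""
    if not DATE_RE.fullmatch(value):
        return False
    try:
        datetime.date.fromisoformat(value)
    except ValueError:
        return False
    return True


def validate_show_notes(notes):
    """Return a list of problems; an empty list means the structure matches."""
    if not isinstance(notes, dict):
        return ["top level must be a JSON object"]
    problems = []
    for field in STRING_FIELDS:
        if not isinstance(notes.get(field), str):
            problems.append(f"{field}: missing or not a string")
    date = notes.get("date_of_generation")
    if isinstance(date, str) and not _is_iso_date(date):
        problems.append("date_of_generation: expected YYYY-MM-DD")
    entries = notes.get("timecoded_transcript")
    if not isinstance(entries, list):
        problems.append("timecoded_transcript: missing or not a list")
        return problems
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"timecoded_transcript[{i}]: not an object")
            continue
        for field in ENTRY_FIELDS:
            if not isinstance(entry.get(field), str):
                problems.append(f"timecoded_transcript[{i}].{field}: missing or not a string")
        timecode = entry.get("timecode")
        if isinstance(timecode, str) and not TIMECODE_RE.fullmatch(timecode):
            problems.append(f"timecoded_transcript[{i}].timecode: expected MM:SS")
    return problems
```

A typical use would be `validate_show_notes(json.load(open(path)))`, treating a non-empty result as a reason to regenerate the notes.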
show_notes.json is the input to cover-image-generation (via its
--metadata flag), so the title written here drives the cover.
- Missing transcript ({workspace}/data/script.md) → prints an error and exits without writing output.
- Missing audio (neither episode.mp3 nor episode.wav) → prints an error and exits.
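The two input checks could be expressed as a small preflight step. This is a sketch under assumptions: `find_input_errors` and `preflight` are hypothetical names, and the exit status of 1 and the use of stderr are guesses, since the text above says only that an error is printed and the script exits.

```python
import sys
from pathlib import Path


def find_input_errors(workspace):
    """Return one message per missing input; empty if both inputs exist."""
    root = Path(workspace)
    errors = []
    script = root / "data" / "script.md"
    if not script.is_file():
        errors.append(f"transcript not found: {script}")
    final_dir = root / "audio" / "final"
    if not any((final_dir / n).is_file() for n in ("episode.mp3", "episode.wav")):
        errors.append(f"no episode.mp3 or episode.wav in {final_dir}")
    return errors


def preflight(workspace):
    """Print any input errors and return a process exit status."""
    errors = find_input_errors(workspace)
    for message in errors:
        print(f"error: {message}", file=sys.stderr)  # assumption: errors go to stderr
    return 1 if errors else 0  # assumption: non-zero status on failure
```

Running the checks before any upload means a missing file fails fast, before an API call is made and before anything is written to data/.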
npx claudepluginhub cmgramse/skill-development --plugin podcast-creator