By flight505
AI-powered image, diagram, and video generation for Claude Code - uses Nano Banana 2 (Gemini 3.1 Flash Image) for fast generation, Nano Banana Pro (Gemini 3 Pro Image) for professional diagrams, and Veo 3.1 for video generation.
Generate publication-quality technical diagrams using Nano Banana Pro (gemini-3-pro-image) with AI-powered quality review. Smart iteration only regenerates when quality is below threshold. Supports style presets (technical, visual-abstract, minimal), aspect ratio, and resolution control (512-4K).
Generate and edit images using Nano Banana 2 (gemini-3.1-flash-image, fastest) or Nano Banana Pro. Supports aspect ratio and resolution control via Google GenAI SDK.
Render text-based diagrams (Mermaid, PlantUML, GraphViz, D2, and 23 more) to PNG/SVG via Kroki.io. Use ONLY when the user explicitly asks for text-based diagram rendering or a specific diagram language.
Generate videos using Veo 3.1: text-to-video, image-to-video, frame interpolation, and video extension.
Create Nature-quality visual abstracts — scientific figures using visual metaphors, isometric depth, and physical analogies to convey complex technical systems. Use for README hero images, paper figures, blog graphics, or when the user wants diagrams that go beyond boxes and arrows. Triggers on: 'visual abstract', 'scientific figure', 'Nature-quality', 'publication graphic', 'infographic', 'visual metaphor', or requests for rich/expressive/artistic diagrams.
Executes bash commands
Hook triggers when Bash tool is used
AI-powered image, diagram, and video generation for Claude Code
Using Google Gemini API and Veo 3.1 with intelligent quality review
--style technical|visual-abstract|minimal via system_instruction: aesthetics separated from content
client.chats.create()
git clone https://github.com/flight505/nano-banana.git
uv sync # or: pip install google-genai
Get a key at Google AI Studio:
export GEMINI_API_KEY='your-gemini-key-here'
Or use a .env file in your project:
echo "GEMINI_API_KEY=your-key-here" > .env
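A script that has to support both options above could check the environment first and fall back to the .env file. The following is a minimal sketch of that lookup order, not the plugin's actual loader: the helper name `load_gemini_key` and the simple `KEY=value` parsing are assumptions made for illustration.

```python
import os
from pathlib import Path


def load_gemini_key(env_path=".env", environ=None):
    """Return GEMINI_API_KEY from the environment, else from a .env file.

    Hypothetical helper: the plugin's real loading logic may differ.
    """
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key

    path = Path(env_path)
    if not path.is_file():
        return None

    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "GEMINI_API_KEY":
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip("'\"") or None
    return None
```

Calling `load_gemini_key()` from the project root returns the exported value when one is set, otherwise the value written by the `echo` command above, and `None` when neither exists.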
# Technical diagram with quality review
python3 skills/diagram/scripts/generate_diagram.py \
"Microservices architecture with API gateway" \
-o architecture.png \
--doc-type architecture
# Creative image
python3 skills/image/scripts/generate_image.py \
"A cozy coffee shop on a rainy day" \
-o coffee_shop.png
# Edit an existing image
python3 skills/image/scripts/generate_image.py \
"Add rain to the window" \
--input coffee_shop.png -o rainy_coffee_shop.png
# Generate a video
python3 skills/video/scripts/generate_video.py \
"A time-lapse of clouds over a mountain" \
-o clouds.mp4
# Image-to-video (animate a still image)
python3 skills/video/scripts/generate_video.py \
"Slowly pan across the scene" \
--input landscape.png -o animated.mp4
Generate publication-quality technical diagrams with AI quality review and style presets.
# Standard technical diagram
python3 skills/diagram/scripts/generate_diagram.py "User authentication flow" -o auth.png --doc-type architecture
# Visual abstract with dark background and glow
python3 skills/diagram/scripts/generate_diagram.py "API gateway as routing prism" -o visual.png --style visual-abstract --doc-type journal
# Wide diagram with aspect ratio
python3 skills/diagram/scripts/generate_diagram.py "System overview" -o overview.png --aspect-ratio 16:9
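The `--style` flag is described as passing aesthetics through `system_instruction` while the prompt carries only the content. A rough sketch of that separation is shown here. The preset wording, the `build_request` helper, and the `technical` default are all hypothetical; only the three preset names and the `system_instruction` field come from this README.

```python
# Hypothetical preset texts; the plugin's real wording is not shown in this README.
STYLE_PRESETS = {
    "technical": "Clean technical diagram with clear labels and consistent line weights.",
    "visual-abstract": "Visual abstract with a dark background and glow.",
    "minimal": "Minimal diagram with few colors and generous whitespace.",
}


def build_request(description, style="technical"):
    """Keep aesthetics (system instruction) separate from content (prompt)."""
    if style not in STYLE_PRESETS:
        raise ValueError(
            f"unknown style {style!r}; expected one of {sorted(STYLE_PRESETS)}"
        )
    return {
        "system_instruction": STYLE_PRESETS[style],  # how it should look
        "contents": description,                     # what it should show
    }
```

Because the description never mentions colors or layout, the same text can be re-rendered under a different preset without rewriting the prompt.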
Document Types:
| Type | Threshold | Use Case |
|---|---|---|
| specification | 8.5/10 | Technical specs, PRDs |
| architecture | 8.0/10 | System architecture |
| journal | 8.5/10 | Academic papers |
| presentation | 6.5/10 | Slides (faster) |
| readme | 7.0/10 | Documentation |
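The thresholds above drive the "smart iteration" behavior: a diagram is regenerated only while its review score is below the threshold for its document type. The loop below is a sketch of that idea under stated assumptions. The `generate` and `review` callables are hypothetical stand-ins for the image model and the AI reviewer, and the iteration cap of 2 is an assumed default, not a documented one.

```python
# Thresholds copied from the Document Types table above.
THRESHOLDS = {
    "specification": 8.5,
    "architecture": 8.0,
    "journal": 8.5,
    "presentation": 6.5,
    "readme": 7.0,
}


def generate_with_review(prompt, doc_type, generate, review, max_iterations=2):
    """Regenerate only while the review score is below the doc-type threshold.

    `generate(prompt, feedback)` returns an image; `review(image)` returns
    a `(score, feedback)` pair. Both are hypothetical callables.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    threshold = THRESHOLDS[doc_type]
    feedback = None
    best_image, best_score = None, float("-inf")
    for attempt in range(1, max_iterations + 1):
        image = generate(prompt, feedback)
        score, feedback = review(image)
        if score > best_score:
            best_image, best_score = image, score
        if score >= threshold:
            break  # good enough: no further API calls
    return {
        "image": best_image,
        "score": best_score,
        "attempts": attempt,
        "passed": best_score >= threshold,
    }
```

With this shape, a `presentation` diagram scoring 7.0 stops after one attempt, while the same score for `architecture` triggers another pass with the reviewer's feedback.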
Generate and edit images using Nano Banana 2 (the default) or Nano Banana Pro.
# Generate
python3 skills/image/scripts/generate_image.py "Abstract art in blue and gold" -o art.png
# Edit existing image
python3 skills/image/scripts/generate_image.py "Make the sky purple" --input photo.jpg -o edited.png
Available Models:
- gemini-3.1-flash-image (default; Nano Banana 2, fastest)
- gemini-3-pro-image (Nano Banana Pro, highest quality)

Aspect Ratio & Resolution:
# Generate with specific aspect ratio and resolution
python3 skills/image/scripts/generate_image.py \
"A wide cinematic landscape" -o landscape.png \
--aspect-ratio 16:9 --resolution 2K
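When the image script is driven from another Python program rather than a shell, the same command line can be assembled as an argument list. This is a small sketch that uses only the flags shown in this README (`-o`, `--input`, `--aspect-ratio`, `--resolution`); the helper names `image_command` and `run_image` are hypothetical, and it assumes the working directory is the repository root.

```python
import subprocess
import sys

SCRIPT = "skills/image/scripts/generate_image.py"


def image_command(prompt, output, input_image=None, aspect_ratio=None, resolution=None):
    """Build the argv list for generate_image.py from the flags shown above."""
    cmd = [sys.executable, SCRIPT, prompt, "-o", output]
    if input_image:
        cmd += ["--input", input_image]        # edit an existing image
    if aspect_ratio:
        cmd += ["--aspect-ratio", aspect_ratio]
    if resolution:
        cmd += ["--resolution", resolution]
    return cmd


def run_image(*args, **kwargs):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(image_command(*args, **kwargs), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with prompts that contain spaces or quotes.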
Render text-based diagrams (Mermaid, PlantUML, GraphViz, D2, and 23 more) to PNG/SVG.
# Render Mermaid to PNG
python3 skills/kroki/scripts/render_diagram.py -t mermaid -o flow.png \
--source 'flowchart LR; A-->B-->C'
# Render PlantUML to SVG
python3 skills/kroki/scripts/render_diagram.py -t plantuml -i diagram.puml -o diagram.svg
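For context on what a Kroki render involves: the public Kroki service accepts GET requests whose path carries the diagram source, deflate-compressed and then base64url-encoded. The sketch below builds such a URL. Whether `render_diagram.py` uses this GET form or a POST request is not stated here, so treat `kroki_url` as an illustrative helper rather than the script's implementation.

```python
import base64
import zlib


def kroki_url(diagram_type, output_format, source, base="https://kroki.io"):
    """Build a Kroki GET URL: deflate the source, then base64url-encode it."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    payload = base64.urlsafe_b64encode(compressed).decode("ascii")
    return f"{base}/{diagram_type}/{output_format}/{payload}"


if __name__ == "__main__":
    # Same Mermaid source as the command-line example above.
    print(kroki_url("mermaid", "svg", "flowchart LR; A-->B-->C"))
```

Fetching the printed URL with any HTTP client returns the rendered diagram, so no local Mermaid or PlantUML installation is needed.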
npx claudepluginhub flight505/nano-banana