By Green-PT
Honey (I Shrunk the AI) — three reflexive levers that cut agent token cost: less code (YAGNI/stdlib-first), less prose, and denser agent-to-agent handoffs (ESO/compact-JSON). Plus on-demand satellites (review, eco, gain, compress) and a hive of read-only subagents that return compressed handoffs. Auto-intensity lite/full/ultra; correctness and safety-critical paths stay exact.
Honey hive subagent. Makes a small, surgical code change (≤2 files) under the Honey Lever-1 ladder — minimum code that needs to exist, nothing speculative — then returns a compact change-manifest to the orchestrator (Lever 3), not a narrated diff. Use when the orchestrator has a well-scoped edit and wants the result summarized, not retold. Edits files; keep the scope tight.
Honey hive subagent. Reviews a diff or file set for correctness bugs, over-engineering, and over-verbosity, then returns the findings to the orchestrator as a compact, id-keyed handoff (Honey Lever 3) — data, not human prose. Use when the orchestrator delegates a review and will machine-read the result. Read-only, haiku-class.
Honey hive subagent. Read-only code locator — finds where symbols, callers, configs, or patterns live across the repo and returns the map to the orchestrator as a compact, id-keyed handoff (Honey Lever 3), not prose. Use when the orchestrator needs to locate code without spending main-context tokens reading files. Haiku-class.
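A compact, id-keyed handoff of the kind these subagents return could look roughly like the sketch below. The field names, ids, and file paths are hypothetical, chosen only for illustration; this is not Honey's actual handoff schema. It shows the general idea: key each item by id, serialize without whitespace, and check that the orchestrator can recover the original data losslessly.

```python
import json

# Hypothetical review findings; field names and ids are illustrative only,
# not Honey's actual handoff schema.
findings = [
    {"id": "f1", "file": "src/cache.py", "line": 42, "kind": "bug",
     "note": "off-by-one in eviction loop"},
    {"id": "f2", "file": "src/cache.py", "line": 88, "kind": "over-engineering",
     "note": "unused strategy class"},
]


def to_handoff(items):
    """Key each finding by its id and serialize with no extra whitespace."""
    keyed = {item["id"]: {k: v for k, v in item.items() if k != "id"}
             for item in items}
    return json.dumps(keyed, separators=(",", ":"))


def from_handoff(payload):
    """Rebuild the original list so the round trip can be checked as lossless."""
    return [{"id": key, **fields} for key, fields in json.loads(payload).items()]


compact = to_handoff(findings)
pretty = json.dumps(findings, indent=2)

print(len(compact), "chars compact vs", len(pretty), "chars pretty-printed")
print(from_handoff(compact) == findings)
```

Character count is only a rough proxy for token count, but the round-trip check mirrors the "lossless recovery" idea: the compact form is useful only if the receiver can rebuild exactly what was sent.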
Compress-Cache-Retrieve for huge, repetitive array tool output (logs, scan results, time series, event streams) before it enters context. Keeps an informative sample — endpoints, anomalies/change-points, head/tail — drops the redundant rest to a local cache, and leaves a retrievable hash. Use when a tool returns a long uniform JSON array you must read but mostly skim, and the full set is one command away if needed. Lossy-but-recoverable.
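The Compress-Cache-Retrieve idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: the cache directory, the `compress` and `retrieve` function names, and the fixed `jump` threshold used to detect change-points are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path

# Hypothetical cache location; the real plugin may store its cache elsewhere.
CACHE_DIR = Path(tempfile.gettempdir()) / "ccr_cache_sketch"


def compress(rows, key, head=2, tail=2, jump=10.0):
    """Cache the full array on disk and return a small, retrievable sample."""
    raw = json.dumps(rows, separators=(",", ":")).encode()
    digest = hashlib.sha256(raw).hexdigest()[:12]
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    (CACHE_DIR / f"{digest}.json").write_bytes(raw)

    # Always keep the head and tail of the array.
    keep = set(range(min(head, len(rows))))
    keep |= set(range(max(len(rows) - tail, 0), len(rows)))
    # Keep change-points: rows whose value moves by more than `jump`
    # relative to the previous row (an assumed, deliberately simple rule).
    for i in range(1, len(rows)):
        if abs(rows[i][key] - rows[i - 1][key]) > jump:
            keep.add(i)

    return {"hash": digest, "total": len(rows),
            "sample": [rows[i] for i in sorted(keep)]}


def retrieve(digest):
    """Load the full, uncompressed array back from the local cache."""
    return json.loads((CACHE_DIR / f"{digest}.json").read_bytes())


rows = [{"t": t, "ms": 20.0} for t in range(100)]
rows[57]["ms"] = 950.0  # a single latency spike in otherwise uniform data
summary = compress(rows, key="ms")

print(summary["total"], "rows reduced to", len(summary["sample"]))
print(retrieve(summary["hash"]) == rows)
```

Only the sample and the hash would enter context; the spike and the return to normal both survive as change-points, and the full array stays recoverable through `retrieve`.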
Rewrite a memory or context file (CLAUDE.md, AGENTS.md, a todo or notes file) into Honey-terse form so it costs fewer input tokens every session, without losing meaning. Backs up the original first. Use when asked to shrink or compress context files, trim CLAUDE.md, or cut per-session input cost. Prose only — never code, config, or data.
Same pixels, fewer tokens — for user-facing deliverables where visual polish is the spec. Use when building or editing a landing page, marketing site, hero, pricing/feature section, dashboard, or any HTML/CSS UI component. Keeps the full rendered design (layout depth, hierarchy, motion, responsive richness, a11y) and cuts tokens by expressing that design densely — CSS custom properties, shared classes, shorthand, fluid units — instead of by cutting the design. The honey core trims code and prose; this trims how the design is *written*, never how it looks. Reach for it whenever output is user-facing markup, even if the user never says "minimal".
Report this session's Honey savings — output tokens, CO₂, and the estimated CO₂/$ saved vs a no-Honey baseline — by running the repo's committed EcoLogits port, not by guessing. Use when asked how much Honey saved, the session's carbon/token footprint, or to expand the 🍯 statusline badge into a full breakdown.
Show Honey's benchmark scoreboard — the committed quality and token results per task tier (code, user-facing, agent-to-agent) from bench/. Reports only the reproducible committed figures, never invents per-repo numbers. Use when asked how much Honey saves, how it compares to Caveman / Ponytail / no-skill baseline, or for the headline numbers.
Uses power tools
Uses Bash, Write, or Edit tools
Write less code and say less about it. Honey (I Shrunk the AI) by GreenPT is a cross-tool coding skill that cuts AI coding-agent token usage and LLM API costs by making agents emit less code and less prose without losing correctness. It works with Claude Code, Cursor, GitHub Copilot, Codex, Gemini CLI, Windsurf, Cline, OpenClaw, and Kiro. Three independent levers are applied reflexively: less code (YAGNI/stdlib-first), less prose, and denser agent-to-agent handoffs (ESO/compact-JSON).
Honey combines what Ponytail (minimal code) and Caveman (terse prose) do separately, then goes further:

- Auto-intensity: lite / full / ultra is chosen reflexively from the request, with no deliberation tax (it never spends reasoning tokens deciding how to comply, which would defeat the purpose on reasoning models).

Volume is cost. In agentic coding sessions, the volume of generated code and prose is what runs up the bill, and most of it is waste.
This repo ships a reproducible benchmark (`bench/`) so you don't have to take the numbers on faith: 23 tasks across three kinds of work, comparing baseline vs Caveman vs Ponytail vs Honey with the same model and the same prompts, so only the skill changes. Correctness is objective (unit tests, structural / accessibility checks, and lossless round-trip recovery for agent handoffs); quality is scored by a 4-model cross-family judge panel (median of Opus 4.8 + Sonnet 4.6 …). Run `cd bench && npm run bench` to reproduce.

A single blended number hides the story, because the levers fire differently per task type. Quality is % of baseline (panel median; for handoffs, lossless recovery); tokens are generated output vs baseline:

| Task tier | Caveman | Ponytail | Honey |
|---|---|---|---|
| Code (14 unit-tested tasks) | 101% · −37% | 99% · +24% | 98% · −49% |
| User-facing (7 landing/UI tasks) | 99% · −18% | 95% · −33% | 101% · −6% |
| Agent-to-agent (2 handoff tasks, lossless recovery) | 67% · −23% | 50% · −22% | 100% · −51% |
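
Each cell in the table pairs a quality figure with a token figure, both relative to the baseline run. The arithmetic can be sketched as below. The function names and the input numbers are hypothetical and are not taken from `bench/`; for handoff tasks the quality figure is a lossless-recovery rate, not a judge score, which this sketch does not model.

```python
def pct_of_baseline(score, baseline_score):
    """Quality as a percentage of the baseline's score."""
    return round(100 * score / baseline_score)


def token_delta_pct(tokens, baseline_tokens):
    """Signed change in generated output tokens relative to the baseline."""
    return round(100 * (tokens - baseline_tokens) / baseline_tokens)


def cell(score, baseline_score, tokens, baseline_tokens):
    """Format one table cell, e.g. '95% · −20%' (with a true minus sign)."""
    quality = pct_of_baseline(score, baseline_score)
    delta = token_delta_pct(tokens, baseline_tokens)
    return f"{quality}% · {delta:+d}%".replace("-", "\u2212")


# Hypothetical inputs: judge median 76 vs a baseline of 80,
# and 800 generated tokens vs a baseline of 1000.
print(cell(76, 80, 800, 1000))
```

A negative token figure means fewer generated tokens than the baseline; a positive one means more.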
Honey leads quality where it matters most: it tops the user-facing and agent-to-agent tiers (the quality-separating ones) and stays within judge noise of the pack on saturated code tasks, while cutting tokens where it's safe to (−49% on code and −51% on agent-to-agent handoffs, but only −6% on user-facing work).
npx claudepluginhub green-pt/honey-for-devs --plugin honey