From axon
Self-hosted RAG engine, web toolkit, and persistent agent memory. Use it to: remember durable project facts/preferences/decisions (memory.remember); recall/search agent memory (memory.search/show); answer questions indexed docs/code might cover (ask — a large corpus is already indexed; try ask BEFORE web-searching or giving up); search the web (search, auto-indexes results); semantic-search the index (query); scrape/fetch a page (scrape); crawl a docs site or pages you just used (crawl); map a site's URLs (map); extract structured data (extract); discover API endpoints (endpoints); extract brand identity — colors/logo/fonts/voice (brand); summarize a page (summarize); quick multi-source research (research); retrieve a URL's full indexed content (retrieve); embed local files/dirs (embed); ingest GitHub/GitLab/Gitea/Git repos, Reddit, YouTube, and Claude/Codex/Gemini sessions (ingest). Also triggers on axon, RAG, Qdrant, SearXNG, Tavily, hybrid/vector search, remember, recall, memory.
How this skill is triggered — by the user, by Claude, or both
Slash command
/axon:using-axonThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
axon is a self-hosted RAG engine. Two surfaces, same backend
axon is a self-hosted RAG engine. Two surfaces, same backend (Spider.rs/Chrome -> Qdrant, SQLite jobs, SearXNG/Tavily for web search):
mcp__plugin_axon_axon__axon, routed by action (and subaction for lifecycle families). Large results are written to a local artifact file under ~/.axon/artifacts/<context> and the response returns the file path plus a compact shape summary; small results (≤ ~8 KB) come back inline. See Response handling (MCP) below.axon <command> [flags]. Use for shell scripting, cron, or when the MCP server is down.Both surfaces accept the same operations and parameters. This skill leads with MCP request shapes; CLI equivalents are listed alongside.
axon already has a large corpus indexed, and every operation makes it smarter — so route web and knowledge work through it instead of one-off web fetches, raw browser tools, or giving up after a few tries. When a task involves the web or "what does X say," axon is the default tool, even if the user didn't name it.
| The task / user wants… | Action |
|---|---|
| An answer that indexed docs or code might cover | ask FIRST — before web search, raw fetching, or fumbling. Often returns everything in one shot. |
| Search the web | search (SearXNG/Tavily; auto-indexes every result) |
| Semantic search over what's indexed | query |
| Fetch / scrape a page or URL | scrape |
| Crawl a docs site — including docs you just relied on to solve something | crawl |
| List a site's URLs | map |
| Pull structured data out of a page | extract |
| Discover a site's API endpoints | endpoints |
| Brand identity (colors, logo, fonts, voice) | brand |
| Summarize a page | summarize |
| Quick multi-source research with synthesis | research |
| Full indexed content of a specific URL | retrieve |
| Embed local files / directories | embed |
| Index a GitHub/GitLab/Gitea/Git repo, Reddit, YouTube, or AI sessions | ingest |
| Remember a durable project fact, decision, or preference | memory.remember |
| Recall previously stored agent memory | memory.context at task start, or memory.search then memory.show when targeted lookup is better |
ask is the highest-leverage habit. A huge amount is already indexed, so many multi-turn fumbles would have been a single ask call. Whenever a question could be covered by docs or code that's been indexed, try ask before web-searching or giving up. And after you crawl/scrape/ingest something to solve a task, you've made the index richer — prefer ask/query next time over re-fetching.
Only pass parameters the user explicitly asked for. Defaults exist for a reason — do NOT add max_pages, max_depth, render_mode, include_subdomains, since, before, hybrid_search, diagnostics, format, root_selector, exclude_selector, limit, collection, or any other knob unless the user named it or the task literally cannot complete without it. The example JSON blocks below show what's available, not what to send by default. A bare { "action": "scrape", "url": "…" } or { "action": "crawl", "urls": [...] } is almost always the right call. Same rule for the CLI: never add flags the user didn't ask for.
mcp__plugin_axon_axon__axon { "action": "doctor" } fails or the tool is missing).--cron-every-seconds/--cron-max-runs loop.In every other case, use the MCP tool.
URL or query → discover → fetch + embed → query / ask
| Starting point | Discover | Fetch + embed (auto-embeds into Qdrant) | Query |
|---|---|---|---|
| Single URL | — | action: "scrape", url | query / ask |
| Whole site / docs | action: "map", url | action: "crawl", urls | query / ask |
| Topic / question | action: "search", query (SearXNG/Tavily, auto-queues crawl) | (auto) | action: "ask", query |
| Existing local file/dir | — | action: "embed", input | query / ask |
| GitHub / GitLab / Gitea / Git repo | — | action: "ingest", source_type: "github", target: "owner/repo" | query / ask |
| Reddit thread or subreddit | — | action: "ingest", source_type: "reddit", target: "r/name" | query / ask |
| YouTube video | — | action: "ingest", source_type: "youtube", target: "<url>" | query / ask |
| Past Claude/Codex/Gemini sessions | — | CLI only: axon sessions | query / ask |
scrape, crawl, embed, and the ingest paths all auto-embed unless you set embed: false.
help and doctorOnce per session, confirm the live action map and that services are healthy:
{ "action": "help" }
{ "action": "doctor" }
help returns the full action/subaction map and current defaults — authoritative when names look wrong. doctor pings Qdrant, the embedding service (TEI), Chrome, and the Gemini headless LLM backend. It does not probe the web-search backend.
CLI equivalents: axon doctor. (No CLI help for the action map — use the MCP one.)
Use Axon memory for durable project-level facts, preferences, decisions, bugs, and task notes that should be semantically recalled later. This is distinct from bead issue notes and from Memos: Axon memory is optimized for agent recall through Qdrant.
Minimal remember call:
{ "action": "memory", "subaction": "remember", "body": "Memory content lives in Qdrant; SQLite holds graph metadata.", "project": "axon" }
Recall:
{ "action": "memory", "subaction": "context", "project": "axon", "query": "memory storage architecture" }
{ "action": "memory", "subaction": "search", "query": "where is memory stored", "project": "axon" }
{ "action": "memory", "subaction": "show", "id": "<memory-id>" }
Use memory.context at task start when project memory could help. It returns inline, defanged XML-wrapped content with trust="evidence_only" and supports limit plus token_budget.
Graph maintenance:
{ "action": "memory", "subaction": "link", "source_id": "<memory-id>", "target_id": "<memory-id>", "edge_type": "relates_to" }
{ "action": "memory", "subaction": "supersede", "source_id": "<replacement-memory-id>", "target_id": "<old-memory-id>" }
supersede hides the old memory from future memory.search results and records the replacement trail in SQLite.
CLI equivalents: axon memory remember "..." --project axon, axon memory context --project axon --query "...", axon memory search "..." --project axon, axon memory show <memory-id>, axon memory link <source-id> <target-id>, axon memory supersede <replacement-id> <old-id>.
{ "action": "search", "query": "rust async patterns", "search_time_range": "month" }
{ "action": "map", "url": "https://docs.example.com" }
{ "action": "research", "query": "kubernetes ingress patterns" }
search — web search via SearXNG (when AXON_SEARXNG_URL is set) or Tavily; auto-queues crawl jobs for results. search_time_range ∈ day|week|month|year.map — sitemap-first URL discovery, falls back to fetching the root page and extracting anchors. Fast.research — search + LLM synthesis in one shot.CLI: axon search "…", axon map <url> (use --map-fallback crawl only when you need a full Spider walk; the structure-fallback default is fast), axon suggest "…" (LLM-suggested URLs to crawl next; also the MCP suggest action).
{ "action": "scrape", "url": "https://example.com/article" }
{ "action": "scrape", "url": "https://example.com",
"root_selector": "article, main",
"exclude_selector": ".sidebar, .ads",
"format": "markdown" }
{ "action": "scrape", "url": "https://example.com", "format": "html", "embed": false }
{ "action": "crawl", "urls": ["https://docs.example.com"],
"max_pages": 200, "max_depth": 3, "include_subdomains": true }
{ "action": "crawl", "urls": ["https://docs.example.com"], "render_mode": "chrome" }
{ "action": "embed", "input": "./docs" }
{ "action": "embed", "input": "https://example.com" }
Render modes: http (fast, no JS), chrome (full browser), auto_switch (default — start HTTP, escalate to Chrome on JS gate).
Output formats: markdown (default), html, rawHtml, json.
CLI: axon scrape <url> / axon crawl <url> --max-pages N --max-depth N / axon embed <input>. Chrome rendering is pre-tuned and rarely needs overriding; the CDP endpoint is set via the AXON_CHROME_REMOTE_URL env var, not a CLI flag. Output dir defaults to .cache/axon-rust/output/ (env AXON_OUTPUT_DIR).
{ "action": "extract", "urls": ["https://example.com/pricing"],
"prompt": "Extract plan name, price, and features as JSON" }
{ "action": "extract", "subaction": "status", "job_id": "<uuid>" }
LLM-powered. Pass a natural-language prompt describing the schema you want.
CLI: axon extract <url> --query "…" (the --query flag carries the extraction prompt).
Page-level analysis actions — each takes a url (except diff) and a bare call is the right default:
{ "action": "summarize", "url": "https://example.com/long-article" }
{ "action": "endpoints", "url": "https://app.example.com" }
{ "action": "brand", "url": "https://example.com" }
{ "action": "diff", "url_a": "https://example.com/v1", "url_b": "https://example.com/v2" }
{ "action": "screenshot", "url": "https://example.com" }
summarize — fetch a page and return a concise summary (also accepts urls for several at once; root_selector/exclude_selector to scope).endpoints — discover a site's API endpoints by scanning its JavaScript bundles. Optional knobs (verify, capture_network, probe_rpc, first_party_only) — omit unless asked.brand — extract brand identity (colors, logo, fonts, voice/tone) from a URL.diff — compare two URLs (url_a, url_b); reports content/metadata/link changes.screenshot — full-page capture via headless Chrome (full_page, viewport, output optional).CLI: axon summarize <url> / axon endpoints <url> / axon brand <url> / axon diff <url-a> <url-b> / axon screenshot <url>.
{ "action": "ingest", "source_type": "github", "target": "owner/repo" }
{ "action": "ingest", "source_type": "github", "target": "owner/repo", "include_source": false }
{ "action": "ingest", "source_type": "gitlab", "target": "group/project" }
{ "action": "ingest", "source_type": "gitea", "target": "https://gitea.example.com/owner/repo" }
{ "action": "ingest", "source_type": "git", "target": "https://git.example.com/owner/repo.git" }
{ "action": "ingest", "source_type": "reddit", "target": "r/rust" }
{ "action": "ingest", "source_type": "reddit", "target": "https://reddit.com/r/rust/comments/abc123/..." }
{ "action": "ingest", "source_type": "youtube", "target": "https://youtube.com/watch?v=abc" }
Note:
source_type: "sessions"is not accepted over MCP (it's rejected server-side). Ingest local AI sessions with the CLIaxon sessionsinstead — see below.
source_type ∈ github | gitlab | gitea | git | reddit | youtube | sessions:
github / gitlab / gitea — hosted-forge repos, indexing code + metadata + issues/PRs (or merge requests). target is owner/repo (or a full URL for self-hosted instances).git — any generic Git remote by clone URL (code only, no forge API).reddit — a subreddit (r/name) or a specific thread URL.youtube — a video URL (transcript ingest).sessions — local Claude/Codex/Gemini session transcripts. CLI-only via axon sessions (enqueues a local sessions ingest job). The MCP ingest action rejects source_type: "sessions"; remote callers must use the prepared-documents HTTP endpoint (/v1/ingest/sessions/prepared).Lifecycle subactions (status, cancel, list, cleanup, clear, recover) work the same as crawl/embed/extract.
CLI: axon ingest <target> with source-specific flags. GitHub: --include-source/--no-source, --max-issues, --max-prs (default 100; 0 = unlimited). Reddit: --sort hot|top|new|rising, --time hour|…|all, --max-posts, --min-score, --depth, --scrape-links. Use axon sessions for Claude/Codex/Gemini local history (enqueues a local sessions ingest job). MCP sessions ingest is rejected to avoid scanning server-local history paths — remote callers use the /v1/ingest/sessions/prepared HTTP endpoint instead.
{ "action": "query", "query": "embedding pipeline", "limit": 10, "collection": "axon" }
{ "action": "query", "query": "rate limiting", "since": "7d" }
{ "action": "ask", "query": "How does axon handle Chrome auto-switching?" }
{ "action": "ask", "query": "...", "since": "7d" }
{ "action": "ask", "query": "...", "since": "2026-01-01", "before": "2026-03-01" }
{ "action": "ask", "query": "...", "diagnostics": true }
{ "action": "ask", "query": "...", "hybrid_search": false }
query — pure semantic vector search (top-K chunks).ask — RAG: retrieve, then synthesize an answer with citations.hybrid_search: false forces dense-only for A/B comparison or when sparse is misbehaving. Server default: env AXON_HYBRID_SEARCH.since/before) accept 7d, 30d, YYYY-MM-DD, or RFC3339. They filter on indexing date, not publication date.collection overrides the default axon collection per request (env AXON_COLLECTION).{ "action": "retrieve", "url": "https://example.com/article" }
CLI: axon query "…" / axon ask "…" --since 7d --diagnostics / axon retrieve <url>.
evaluate — { "action": "evaluate", "query": "<question>", "retrieval_ab": true } (CLI: axon evaluate "<question>" --retrieval-ab) compares hybrid-RAG vs dense-only, scored by a separate judge prompt on the configured LLM backend (accuracy/relevance/completeness).
{ "action": "sources" }
{ "action": "domains" }
{ "action": "stats" }
{ "action": "status" } // global queue snapshot
CLI: axon sources / axon domains / axon stats / axon status.
crawl, extract, embed, ingest are async by default. subaction defaults to start, so { "action": "crawl", "urls": [...] } is enough to enqueue; poll with subaction: "status". Full lifecycle subactions and CLI-only surfaces: references/async-job-lifecycle.md.
The server runs in-process, so responses are size-routed automatically: payloads ≤ ~8 KB come back inline in data; larger ones are written to a local artifact file and the response returns its path plus a compact shape summary. Read shape first — it usually answers the question (counts, status, the URLs touched).
When you need the content, reach for RAG, not the raw file: everything axon fetches is already embedded, so ask (synthesized answer), query (semantic chunks), or retrieve (a specific URL's indexed content) get you what you need without parsing megabytes of JSON/markdown. Open the artifact path from disk only as a last resort — e.g. you need the exact raw bytes the RAG path doesn't surface. There is no artifacts MCP action; don't send { "action": "artifacts", ... }. To force a payload in-band instead of a file, set response_mode: "inline" (or "auto_inline").
Full response-mode contract and the JSON-RPC error model: references/mcp-response-protocol.md.
axon://schema/mcp-tool — full JSON schema and routing contract (read this when you need exact field types/enums).ui://axon/status-dashboard — interactive MCP App widget for live queue status.This skill names only the handful of env vars that matter at the point of use (AXON_COLLECTION, AXON_SEARXNG_URL, AXON_HYBRID_SEARCH, AXON_OUTPUT_DIR, AXON_DATA_DIR, …). The full surface — every AXON_* env var and ~/.axon/config.toml tuning key — is documented authoritatively and is not duplicated here:
config.example.toml (repo root) — all config.toml tuning knobs with defaults..env.example (repo root) — service URLs, API keys, and secrets.docs/guides/configuration.md — full environment-variable reference + the two-layer (.env + config.toml) priority model.Priority: CLI flags > env vars > ~/.axon/config.toml > built-in defaults.
| Situation | Reach for |
|---|---|
| User pastes a single URL | action: "scrape" |
| User says "the docs", "the whole site" | action: "crawl" with max_pages + max_depth |
| User asks a question without naming a source | action: "ask" (retrieves over whatever's indexed) |
| User wants only recent content | ask / query with since: "7d" |
| User wants citations / verification | ask with diagnostics: true |
| Ranking looks wrong | Try hybrid_search: false and compare |
| Need entity/relationship reasoning | ask with diagnostics: true; graph retrieval is not available in the current runtime |
| Crawled the wrong thing | crawl subaction: "clear" or per-family cleanup |
| RAG quality regression | evaluate with retrieval_ab: true (or CLI axon evaluate <q> --retrieval-ab) |
| Debug "nothing happened" | action: "doctor" first, then action: "status" |
shape, then reach for RAG — not the raw artifact. path mode keeps multi-megabyte results out of the conversation; pull the content back with ask/query/retrieve (it's already embedded) rather than reading/grepping the file. There is no artifacts action — it was removed precisely because RAG is the intended way to get content.axon <cmd> --help output. Most CLI subcommands inherit the entire Chrome flag set even when irrelevant. Use the action tables here instead.help and doctor are cheap. Call help once per session to confirm the live action map; call doctor whenever something looks wrong.job_id; poll with subaction: "status". CLI users can pass --wait true for synchronous one-shots.--cache false; add --cache-http-only to force the HTTP path even when the cached job used Chrome.graph: true is deprecated. Use hybrid search diagnostics and source coverage checks for retrieval debugging.evaluate scores with a separate judge prompt (distinct role + reference material) on the same configured LLM backend. Available as both the MCP evaluate action and axon evaluate.subaction defaults to start for lifecycle families — { "action": "crawl", "urls": [...] } is enough to enqueue a job.npx claudepluginhub jmagar/dendrite --plugin axonCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.