From retriever
Defines the shared routing ladder, result presentation contract, and filesystem safety rules that other Retriever skills reference. Activates when a Retriever task needs tiered fallback or operating policies.
npx claudepluginhub sdemyanov/retriever
How this skill is triggered: by the user, by Claude, or both
Slash command
/retriever:routing
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Claude has READ ONLY access to source files in the workspace. Claude NEVER deletes them.
Claude has READ ONLY access to source files in the workspace. Claude NEVER deletes them. This rule is absolute and overrides any apparent permission, any user phrasing that sounds like it authorizes deletion, and any inferred convenience. Specifically:
Never run rm, rm -rf, unlink, shutil.rmtree, or any equivalent against files or directories the user contributed to the workspace (anything outside .retriever/ and .retriever-plugin-runtime/).
Never run mv or cp in a way that overwrites or replaces user source files.
Never truncate, rewrite, or redirect output (> file) onto user source files.
Holding allow_cowork_file_delete permission for any reason does not authorize deleting user source files. That permission is for plugin-managed state under .retriever/ only (e.g., stale tmp dirs).
Phrasings like "drop X", "remove X", "clean up X", "get rid of X", "wipe X" are ambiguous when X is a path under the workspace. Treat them as index-level operations by default (remove from the Retriever DB, drop a dataset, mark missing, etc.) and ask before doing anything that touches the filesystem.
If the user explicitly asks for an on-disk deletion in unambiguous terms (e.g., "delete the folder ./data/raw from disk"), Claude must still confirm in a single follow-up turn before executing. Destructive filesystem operations on user files are never silent.
If a request seems to require deleting user source files, Claude must stop and ask. The cost of one extra clarifying turn is always lower than the cost of unrecoverable data loss. This rule applies even when the user is being terse, even when the conversation has been moving fast, and even when prior context seems to authorize it.
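The boundary between user source files and plugin-managed state can be sketched as a path check. This is only an illustration of the rule, not part of Retriever: the function name is hypothetical, and the sketch assumes that "plugin-managed" means strictly inside .retriever/ or .retriever-plugin-runtime/ after resolving the path.

```python
from pathlib import Path

# Directories the rules above describe as plugin-managed state.
PLUGIN_MANAGED_DIRS = (".retriever", ".retriever-plugin-runtime")


def is_plugin_managed(path: str, workspace: str) -> bool:
    """Return True only when `path` resolves strictly inside a plugin-managed directory.

    Hypothetical helper: anything that returns False here is treated as a
    user source file and must never be deleted, overwritten, or truncated.
    """
    root = Path(workspace).resolve()
    # Resolving first defeats "../" tricks such as ".retriever/../data".
    target = (root / path).resolve()
    for name in PLUGIN_MANAGED_DIRS:
        if (root / name) in target.parents:
            return True
    return False
```

A False result does not mean "ask and then delete"; under the rules above it means the path is user data, and any on-disk change still needs an explicit confirmation turn.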
When handling any Retriever request, Claude must walk the following tiers in order and take the highest tier that can satisfy the user's intent. Do not skip tiers. Do not drop to a lower tier because it is more familiar or convenient. Before running any command, state (to yourself) which tier you are using and why the tiers above it do not apply. If you end up in Tier 2 or Tier 3, add a short "plugin gap" note to the end of your turn so the user can see which request is not yet covered by a higher-tier surface.
Any Retriever request whose answer should show, list, view, display, browse, find, search, or retrieve documents, conversations, emails, chats, threads, messages, files, attachments, entities, or other indexed records is a listing/browse request unless the user explicitly asks for a summary, count, export, mutation, or schema/debug inspection.
Listing/browse requests must walk the Tier 1→3 ladder. Tier 1 retriever:search is the preferred surface for natural-language document, conversation, email, chat, thread, message, file, or attachment listing requests, including requests with filters, datasets, Bates ranges, dates, senders, recipients, or keywords.
Return Retriever's standard rendered result format. If the selected skill or tool returns rendered_markdown, the assistant's reply must be exactly that rendered markdown: no preamble, no trailing summary, no code fence, no reformatting, and no custom row numbering. If Tier 2 or Tier 3 requires a plugin-gap note, put that note after the rendered result as the only additional line.
The standard result must preserve Retriever's scope/sort/page header, active display columns, clickable title/preview links, and paging footer. Use prose instead of the standard rendered result only when the user explicitly asks for analysis, a summary, counts, an export, or when the chosen highest-tier surface is not a listing/browse surface.
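The presentation contract above can be sketched as a small reply builder. The function name and the "Plugin gap:" prefix are assumptions made for illustration; the source only requires that the rendered markdown be returned unchanged and that any plugin-gap note be the single additional line after it.

```python
from typing import Optional


def build_reply(rendered_markdown: str, tier: int, plugin_gap: Optional[str] = None) -> str:
    """Assemble the reply for a listing/browse request.

    Tier 1: the rendered markdown exactly as produced.
    Tier 2 or 3: the rendered markdown followed by one plugin-gap line.
    """
    if tier not in (1, 2, 3):
        raise ValueError(f"unknown tier: {tier}")
    if tier == 1:
        # No preamble, no summary, no code fence, no reformatting.
        return rendered_markdown
    if not plugin_gap:
        raise ValueError("Tier 2 and Tier 3 replies need a plugin-gap note")
    # Collapse the note so it stays a single line ("Plugin gap:" is an assumed label).
    note = " ".join(plugin_gap.split())
    separator = "" if rendered_markdown.endswith("\n") else "\n"
    return f"{rendered_markdown}{separator}Plugin gap: {note}"
```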
Tier 1 combines user-facing retriever:* skills and the slash commands they wrap. Prefer a retriever:* skill when one covers the intent. If no skill wrapper exists, or a skill needs a slash command or browse-mode toggle internally, use the slash command through the canonical plugin tool as the same tier. Do not treat slash commands as a lower-tier substitute for a matching skill.
List, switch, rename, or clear dataset scope → retriever:dataset
Guide a first-pass legal review, diligence pass, privilege sweep, or hot-doc workflow in plain English → retriever:legal-review
Show, list, view, display, browse, find, search, or retrieve documents, conversations, emails, chats, threads, messages, files, or attachments, with or without filters or keywords, and return the standard rendered result format → retriever:search
Narrow, restrict, constrain, exclude, or clear result filters → retriever:filter
Change displayed columns → retriever:columns
Change sort → retriever:sort
Change page size → retriever:page-size
Navigate pages (next, previous, jump to page N, first/last) → retriever:next, retriever:previous, retriever:page
Scope browsing to a Bates range → retriever:bates
Scope browsing to a processing run → retriever:from-run
Inspect, save, load, or clear a scope → retriever:scope
Ingest a folder or refresh changed files → retriever:ingest
Ingest a processed production volume → retriever:ingest-production
Ingest or inspect a PST archive → retriever:pst
When the user starts from Google Drive, treat Drive as the source-selection layer and keep persistent review inside the usual workspace/ingest/search/export skills
Export the current scope as a CSV table or archive, or inspect export progress → retriever:export
Register, list, rename, delete, or re-describe a custom field, or change a field's storage type → retriever:field
Populate, tag, mark, label, classify, annotate, flag, or clear values on one document or a filtered result set → retriever:fill
Execute a planned processing run → retriever:run-job
Inspect the SQLite schema or the current tool/schema version → retriever:schema, retriever:schema-version
Initialize, check, or update a workspace → retriever:workspace
Confirm the plugin is installed and responding → retriever:ping
Understand file-type support and preview rules → retriever:parsing
Understand result presentation and paging defaults → retriever:search-strategy
Materialize or upgrade the canonical workspace tool → retriever:tool-template
If the user's intent maps to one of the rows above, stop. Use that skill. Continue to the slash list only when no retriever:* skill wrapper exists for the intent, when the user explicitly asks for a slash command, or when a skill's own instructions call for a slash command internally.
If no Tier 1 skill wrapper exists for the intent, use a slash command via the canonical plugin tool as Tier 1. Run exactly one command from the repo root:
python3 skills/tool-template/tools.py slash . /<command> [args]
Return Retriever state only for state-inspection commands. For listing/browse commands, return the tool-rendered standard table/result exactly as produced; prefer rendered_markdown when present and do not add prose or custom formatting.
/documents, /conversations, and /entities are browse-mode toggles. Use them internally when intent is clear; do not route ordinary natural-language listing requests to those toggles before checking retriever:search.
The authoritative current list of slash commands is regenerated at build time into the section below.
/bates <range> - scope browsing to a Bates range. Use when: the user asks to limit or scope browsing to a Bates or production-number range; phrasings like "show ABC0001 to ABC0050", "just the ABC0100 docs", "Bates range", "production numbers X to Y", or "clear the Bates range".
/columns [list|set|add|remove|default] - inspect or change displayed columns. Use when: the user asks to show, hide, add, remove, reorder, or reset which columns appear in the result table; phrasings like "add the author column", "hide date_received", "show file size", "what columns are available", or "reset columns".
/conversations - switch the browse mode to conversations. Use when: the user asks to list, show, or browse conversations/threads. Pair with /search, /filter, /dataset, or other scope commands to populate results; by itself it only switches the browse mode.
/dataset [list|<name>[,<name>...]|clear|rename <old> <new>] - scope to one or more datasets, list them, rename, or clear. Use when: the user asks to list, show, enumerate, switch, pick, select, rename, or clear datasets; phrasings like "what datasets do I have", "show me my datasets", "switch to gmail-max", "use the production dataset", or "rename X to Y".
/documents - switch the browse mode to documents. Use when: the user asks to list, show, or browse individual documents/messages. Pair with /search, /filter, /dataset, or other scope commands to populate results; by itself it only switches the browse mode.
/entities - switch the browse mode to entities. Use when: the user asks to return to, page through, sort, resize, or re-display the active entity list. Pair with list-entities to seed a query; by itself it switches to the saved entity browse state.
/export <table|archive|status> ... - start or inspect bounded table/archive exports. Use when: the user asks to export the current scope/results as a table or archive, or asks export status. /export table documents|entities|conversations starts a bounded CSV/table export, /export archive starts a bounded zip export, and /export status inspects resumable exports; /export previews is intentionally deferred until preview export is resumable.
/field [list|add|rename|delete|describe|type] - list or manage custom field definitions. Use when: the user asks to list, add, rename, delete, re-describe, or retype a custom field; phrasings like "add a responsiveness field", "rename privilege_status", "drop the old tag", "update the field description", or "change this field to date".
/fill <field> <value-or-clear> [on <doc-ref[,doc-ref,...]>] [--confirm] - set or clear field values on documents. Use when: the user asks to populate, tag, mark, label, classify, annotate, flag, or clear a field value on one document or on the current filtered result set; phrasings like "mark these responsive", "fill reviewer=jdoe", "clear the review status", or "tag DOC001 as privileged".
/filter [<expression>|clear] - add or clear SQL-like filters. Use when: the user asks to narrow, restrict, constrain, or exclude results (phrasings like "only PDFs", "show just emails from alice", "exclude attachments", "hide chats", "only 2023", or a SQL-like predicate) or asks to drop/clear current filters.
/from-run <run-id|clear> - scope browsing to a processing run. Use when: the user asks to limit or scope browsing to documents produced by a specific processing run; phrasings like "only docs from run 42", "show what run 5 produced", "filter to the last OCR run", "just the image-description outputs", or "clear the run filter".
/next - go to the next page of active results. Use when: the user asks for more results or the next page; phrasings like "show more", "keep going", "next batch", "next page", "continue", or "what else".
/page [<n>|first|last|next|previous] - jump to a specific page. Use when: the user asks to jump to a specific page; phrasings like "go to page 3", "first page", "last page", "skip to the end", "back to the start", or "where am I in the results".
/page-size [<n>] - inspect or change rows per page. Use when: the user asks to change how many rows appear per page; phrasings like "show 50 at a time", "more per page", "smaller page size", "25 rows please", or "what's my current page size".
/previous - go to the previous page of active results. Use when: the user asks to go back to earlier results or the previous page; phrasings like "go back", "previous page", "back one page", "earlier results", or "the page before".
/scope [list|clear|save <name>|load <name>] - inspect or manage the active scope. Use when: the user asks to inspect, save, bookmark, restore, load, or clear the current combination of dataset/filter/sort/column state; phrasings like "save this view as X", "go back to my saved scope", "what's my current scope", "list saved scopes", or "clear scope".
/search [<query>] - run a keyword search. Use when: the user asks to show, list, view, display, browse, find, search, or retrieve documents, conversations, emails, chats, threads, messages, files, or attachments, with or without a keyword, including requests like "show me emails from alice", "list PDFs from 2023", "find docs mentioning indemnification", or "what's in gmail-max".
/sort [list|<field> <asc|desc>|default] - inspect or change sort order. Use when: the user asks to change or reset the order of results; phrasings like "newest first", "oldest first", "sort by date", "order by file name", "alphabetical", "by size", or "reset sort".
Retriever maintains a shared plugin runtime venv under .retriever-plugin-runtime//venv/. For Retriever commands, bare python3 is acceptable. The tool can activate the shared plugin runtime for optional dependencies and may provision it during workspace init or first dependency use.
Do not install Retriever dependencies into system Python or user-site. If manual interpreter or pip access is needed, resolve plugin_runtime.python_executable from .retriever/runtime.json and use that interpreter. If a Retriever command fails with ModuleNotFoundError, ImportError, or Missing dependency for . parsing: install , do not install into system Python. Prefer workspace init; if manual installation is truly needed, use the shared plugin runtime venv.
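Resolving the shared runtime interpreter might look like the sketch below. It assumes that the dotted name plugin_runtime.python_executable corresponds to a nested JSON object in .retriever/runtime.json; the actual file layout is not shown in this document, and both function names are hypothetical.

```python
import json
from pathlib import Path
from typing import List, Optional


def resolve_runtime_python(workspace: str) -> Optional[str]:
    """Return the shared plugin runtime interpreter, or None if it cannot be resolved.

    Assumed layout: {"plugin_runtime": {"python_executable": "<path>"}}.
    """
    runtime_file = Path(workspace) / ".retriever" / "runtime.json"
    try:
        data = json.loads(runtime_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    runtime = data.get("plugin_runtime") if isinstance(data, dict) else None
    if not isinstance(runtime, dict):
        return None
    executable = runtime.get("python_executable")
    return executable if isinstance(executable, str) and executable else None


def pip_install_command(workspace: str, package: str) -> List[str]:
    """Build a pip command bound to the runtime interpreter, never to system Python."""
    python = resolve_runtime_python(workspace)
    if python is None:
        # Refuse to fall back to system Python or user-site.
        raise RuntimeError("shared plugin runtime not resolved; prefer workspace init")
    return [python, "-m", "pip", "install", package]
```

Raising instead of falling back keeps the sketch aligned with the rule that dependencies never go into system Python.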
Cowork/bash commands may be killed around 45 seconds. Do not run long, one-shot mutation commands when a bounded/resumable workflow exists.
Use plain ingest as the preferred entrypoint. It is a bounded V2 facade by default.
Recommended:
python3 skills/tool-template/tools.py ingest ./data --recursive --budget-seconds 35
Plain ingest is now the one-shot local command:
python3 skills/tool-template/tools.py ingest ./data --recursive
Do not use background jobs. Do not manually loop inside one bash command.
Advanced/manual ingest control is available through:
python3 skills/tool-template/tools.py ingest-status ./data
python3 skills/tool-template/tools.py ingest-start ./data --recursive --budget-seconds 35
python3 skills/tool-template/tools.py ingest-run-step ./data --run-id <RUN_ID> --budget-seconds 35
python3 skills/tool-template/tools.py ingest-cancel ./data --run-id <RUN_ID>
Use ingest-start / ingest-run-step only when you need to inspect or recover a specific resumable run.
For large workspaces, use the resumable entity rebuild flow:
python3 skills/tool-template/tools.py rebuild-entities-start ./data --budget-seconds 35
python3 skills/tool-template/tools.py rebuild-entities-run-step ./data --run-id <RUN_ID> --budget-seconds 35
python3 skills/tool-template/tools.py rebuild-entities-status ./data --run-id <RUN_ID>
Repeat rebuild-entities-run-step until terminal status. Legacy rebuild-entities may exceed Cowork limits on large workspaces.
For planned processing runs, prefer retriever:run-job at Tier 1. If using Tier 2 directly, prefer:
python3 skills/tool-template/tools.py run-job-step . --run-id <RUN_ID> --budget-seconds 35
If it returns a non-empty batch, process those items and call complete-run-item or fail-run-item, then continue with next_recommended_commands.
For any tool result with more_work_remaining: true, continue with the returned next_recommended_commands. Stop only on terminal status: completed, failed, or canceled.
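The continue-or-stop rule can be sketched as a decision on a single tool result. The field names more_work_remaining and next_recommended_commands come from the text above; the "status" key and the assumption that the commands arrive as a list of strings are guesses made for illustration.

```python
from typing import Any, Dict, List

TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def next_commands(result: Dict[str, Any]) -> List[str]:
    """Return the commands to run next, or [] when the run should stop.

    Assumes the tool result carries a "status" string (hypothetical key name).
    """
    if result.get("status") in TERMINAL_STATUSES:
        return []
    if result.get("more_work_remaining") is True:
        commands = result.get("next_recommended_commands") or []
        if not commands:
            raise ValueError("more_work_remaining is true but no next command was returned")
        return list(commands)
    return []
```

Each returned command is meant to be issued as its own bounded call, which keeps every step under the Cowork time limit and avoids looping inside one bash command.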
If an active run exists, do not start a new one. Resume it or cancel it intentionally.
When diagnosing workspace init or first-ingest failures involving SQLite, WAL, journal mode, mounts, or sandboxed paths, do not infer the root cause from df, mount, or host filesystem labels alone.
Probe the exact target path inside the same Cowork runtime, normally <workspace>/.retriever/retriever.db, before declaring a workspace unsupported.
Distinguish these cases:
Fresh DB creation with WAL works on the target path.
WAL fails, but a freshly created DB on that same path can switch to DELETE.
A DB created under /tmp and then copied into place can be opened and initialized.
Existing DB writes do not prove fresh bootstrap will succeed.
If fresh-create fails on the target path but seeded-copy works, use the seeded DB workaround under .retriever/, rerun workspace init, and report the observed target-path behavior instead of a filesystem theory.
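A probe along these lines could separate the cases using Python's standard sqlite3 module. It is a sketch under stated assumptions, not Retriever's own diagnostic: the function names are hypothetical, and it writes throwaway probe files into the target directory (for example <workspace>/.retriever/) rather than touching an existing retriever.db.

```python
import shutil
import sqlite3
import tempfile
import uuid
from pathlib import Path
from typing import Dict


def _try_fresh_create(db_path: Path, journal_mode: str) -> bool:
    """Create a new DB at db_path, request a journal mode, and write one row."""
    try:
        conn = sqlite3.connect(str(db_path))
        try:
            # The PRAGMA returns the mode actually in effect.
            mode = conn.execute(f"PRAGMA journal_mode={journal_mode}").fetchone()[0]
            conn.execute("CREATE TABLE probe (id INTEGER PRIMARY KEY)")
            conn.execute("INSERT INTO probe DEFAULT VALUES")
            conn.commit()
            return str(mode).lower() == journal_mode.lower()
        finally:
            conn.close()
    except sqlite3.Error:
        return False


def _try_seeded_copy(db_path: Path) -> bool:
    """Create the DB in the system temp dir, copy it into place, then open and write."""
    with tempfile.TemporaryDirectory() as tmp:
        seed = Path(tmp) / "seed.db"
        if not _try_fresh_create(seed, "DELETE"):
            return False
        try:
            shutil.copyfile(seed, db_path)
            conn = sqlite3.connect(str(db_path))
            try:
                conn.execute("INSERT INTO probe DEFAULT VALUES")
                conn.commit()
            finally:
                conn.close()
            return True
        except (OSError, sqlite3.Error):
            return False


def probe_target_dir(target_dir: str) -> Dict[str, bool]:
    """Report observed behavior for fresh WAL, fresh DELETE, and seeded-copy creation."""
    directory = Path(target_dir)
    checks = (
        ("fresh_wal", lambda p: _try_fresh_create(p, "WAL")),
        ("fresh_delete", lambda p: _try_fresh_create(p, "DELETE")),
        ("seeded_copy", _try_seeded_copy),
    )
    results: Dict[str, bool] = {}
    for name, check in checks:
        scratch = directory / f"probe-{uuid.uuid4().hex}.db"
        try:
            results[name] = check(scratch)
        finally:
            # Remove the probe file and any SQLite side files it left behind.
            for suffix in ("", "-wal", "-shm", "-journal"):
                Path(str(scratch) + suffix).unlink(missing_ok=True)
    return results
```

The returned dictionary is the "observed target-path behavior" to report; it says nothing about why a mode failed, which is the point of probing instead of theorizing from mount labels.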
If no Tier 1 user-facing surface covers the intent, use a named subcommand of the canonical plugin tool:
python3 skills/tool-template/tools.py <subcommand> . [flags]
Tier 2 is for gaps and explicit programmatic/stateless needs. Do not use Tier 2 search/list subcommands for ordinary document, conversation, email, chat, thread, message, file, or attachment listing requests when Tier 1 retriever:search or a Tier 1 slash surface can satisfy the user's intent.
The authoritative current list of subcommands is regenerated at build time into the section below.
schema-version - report the current schema/tool version
workspace - initialize, inspect, or update workspace installation and schema
add-to-dataset - add documents to a dataset
create-dataset - create a manual dataset
delete-dataset - delete a dataset
(retriever:dataset / /dataset list for user-facing intent) → list-datasets - list datasets in the workspace
remove-from-dataset - remove documents from a dataset
set-dataset-policy - update a source-backed dataset's entity auto-merge policy
show-dataset-policy - show a source-backed dataset's entity auto-merge policy
ingest - start or resume a bounded V2 ingest for workspace documents
ingest-cancel - cancel a resumable V2 ingest run
ingest-commit-step - commit prepared resumable V2 ingest work items
ingest-finalize-step - advance resumable V2 ingest finalization
ingest-plan-step - advance resumable V2 ingest planning
ingest-prepare-step - prepare resumable V2 ingest work items
ingest-production - ingest a processed production volume
ingest-run-step - run recommended resumable V2 ingest steps within a bounded call budget
ingest-start - start a resumable V2 ingest run
ingest-status - show resumable V2 ingest status
inspect-pst-properties - inspect raw PST message fields for debugging
search - search indexed documents
search-chunks - search matching text chunks with citations
slash - execute a scope-aware slash command (see Tier 1)
activate-text-revision - promote a stored text revision to active indexed text
delete-docs - delete selected documents or matching occurrences
get-doc - fetch one document with optional summary text or exact chunks
list-chunks - list chunk metadata for one document
list-text-revisions - list stored text revisions for a document
aggregate - run bounded metadata aggregations across documents
catalog - describe searchable, filterable, and aggregatable fields
assign-entity - manually assign an entity to a document role
block-entity-merge - prevent a suggested entity merge
create-entity - create a manual entity
edit-entity - edit a manual entity profile
ignore-entity - ignore a junk or non-entity record
list-entities - list recognized entities
list-entity-role-inventory - list entity counts by role for a document scope
merge-entities - merge one active entity into another
purge-vault-filename-custodians - dry-run or apply cleanup for synthetic Google Vault MBOX filename custodians
rebuild-entities - rebuild entity recognition state from stored document metadata
rebuild-entities-cancel - cancel a resumable entity rebuild run
rebuild-entities-run-step - advance a resumable entity rebuild within a bounded call budget
rebuild-entities-start - start a resumable entity rebuild run
rebuild-entities-status - show resumable entity rebuild status
show-entity - show one recognized entity
similar-entities - suggest active entities similar to one entity
split-entity - move selected identifiers or document links to another entity
unassign-entity - remove or suppress an entity link on a document role
export-archive - write selected documents, previews, and source artifacts to a zip in one direct pass
export-archive-run-step - advance a resumable archive export within a bounded call budget
export-archive-start - start a bounded, resumable archive export run
export-archive-status - show resumable archive export status
export-csv - write selected documents and fields to CSV in one direct pass
export-csv-run-step - advance a resumable CSV export within a bounded call budget
export-csv-start - start a bounded, resumable CSV export run
export-csv-status - show resumable CSV export status
export-previews - write HTML preview exports under .retriever/exports
add-field - register a custom field
change-field-type - change a field's storage type
delete-field - delete a custom field
describe-field - set or clear a custom field description
fill-field - set or clear a field value on one or more documents
list-fields - list registered custom fields
rename-field - rename a custom field
clear-conversation-assignment - clear a document's conversation assignment
list-conversations - list conversation summaries
merge-into-conversation - merge a document into a conversation
rebuild-conversations - re-run conversation assignment and regenerate conversation previews
reconcile-duplicates - reconcile detected duplicates
refresh-conversation-previews - rebuild conversation preview artifacts
refresh-previews - regenerate generated document and conversation preview artifacts with bounded selectors
split-from-conversation - split a document out of a conversation
cancel-run - stop claiming new work for a run
create-run - create a frozen processing run snapshot
finalize-image-description-run - finalize an image-description run
finalize-ocr-run - finalize an OCR run
get-run - fetch one planned processing run
list-runs - list planned processing runs
publish-run-results - publish results from a completed run
run-status - summarize run progress, claims, and recent failures
claim-run-items - atomically claim pending run items for one worker
complete-run-item - mark one claimed run item completed
fail-run-item - mark one claimed run item failed
finish-run-worker - mark one worker as finished and persist its summary
get-run-item-context - load the execution context for one run item
heartbeat-run-items - refresh heartbeat timestamps for one worker's claimed items
prepare-run-batch - claim one worker batch and return execution contexts
run-job-step - advance one Cowork-safe processing-run step or return one prepared worker batch
add-job-output - attach an output to a job
create-job - create a job
create-job-version - create a job version
list-job-versions - list job versions
list-jobs - list jobs
list-results - list stored processing results
Allowed only when Tiers 1–2 cannot satisfy the request. Before running any sqlite3 CLI, python3 -c "import sqlite3 …", or equivalent client against .retriever/retriever.db, Claude must:
State explicitly that no higher-tier surface covers the request, and name the gap (for example, "no slash or subcommand returns conversation-level participant counts").
Run read-only queries only, unless the user has explicitly asked for a mutation that cannot be expressed via Tier 2.
Include a "plugin gap" line at the end of the response identifying the missing command, so the gap can be closed on a later iteration.
Never modify .retriever/retriever.db with direct SQL when a Tier 2 subcommand or Tier 1 skill/slash surface could achieve the same change.
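One way to keep a Tier 3 query read-only is to open the database through SQLite's URI syntax with mode=ro, which the standard sqlite3 module supports. The helper name below is hypothetical, and no Retriever table names are assumed.

```python
import sqlite3
from pathlib import Path
from typing import Any, List, Sequence, Tuple


def query_read_only(db_path: str, sql: str, params: Sequence[Any] = ()) -> List[Tuple[Any, ...]]:
    """Run one query on a connection that SQLite itself opens read-only."""
    # file:///abs/path?mode=ro makes any write fail with sqlite3.OperationalError.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()
```

With this connection mode an accidental INSERT, UPDATE, or DELETE raises an error instead of changing .retriever/retriever.db, so the read-only rule is enforced by the database rather than by care alone.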
Every time Claude is about to act on a Retriever workspace:
Check for a matching retriever:* skill. If one matches, stop and use it.
This ladder applies to every Retriever request, including requests phrased in natural language. For example, "show me conversations in gmail-max" is Tier 1 retriever:search; that skill may use dataset scope and conversation browse surfaces internally, but Claude must not bypass Tier 1 or jump to Tier 3 SQL.
For any show/list/view/display/browse/find/search/retrieve request, the final answer must follow the Retriever Result Presentation Contract above.
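Taken as a whole, the ladder reduces to picking the first surface that exists, in tier order. The sketch below is illustrative only: the Route type and choose_route function are hypothetical, and deciding which surface actually matches a user's intent is assumed to happen before the call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    tier: int
    surface: str
    plugin_gap_note_required: bool


def choose_route(skill: Optional[str], slash: Optional[str], subcommand: Optional[str]) -> Route:
    """Pick the highest tier that can satisfy the intent.

    Each argument names the matching surface at that level, or is None
    when nothing at that level covers the intent.
    """
    if skill:
        return Route(1, skill, False)
    if slash:
        # Slash commands count as Tier 1, not as a lower-tier substitute.
        return Route(1, slash, False)
    if subcommand:
        return Route(2, subcommand, True)
    # Last resort: read-only SQL, always with a plugin-gap note.
    return Route(3, "read-only SQL against .retriever/retriever.db", True)
```

For the example in the text, choose_route("retriever:search", "/conversations", "list-conversations") stays at Tier 1 with retriever:search even though lower-tier surfaces also exist.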