Ingests content from Confluence, Google Docs, GitHub repos, remote URLs, or local files into a Second Brain vault. Converts sources to Markdown via docling, extracts a knowledge graph with /graphify, and persists entities via /bedrock:preserve.

Install: npx claudepluginhub iurykrieger/claude-bedrock --plugin bedrock
Entity definitions and templates are in the plugin directory, not at the vault root. Use the "Base directory for this skill" provided at invocation to resolve paths:
- <base_dir>/../../entities/
- <base_dir>/../../templates/{type}/_template.md
- <base_dir>/../../CLAUDE.md (already injected automatically into context)

Where <base_dir> is the path provided in "Base directory for this skill".
Resolve which vault to learn. This skill can be invoked from any directory.
Step 1 — Parse --vault flag:
Check if the input arguments include --vault <name>. If found, extract the vault name and remove it from the arguments (the remaining text is the source URL/path).
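As a sketch, the flag split might look like the following (the argument string here is hypothetical; the real input comes from the skill invocation):

```shell
# Hypothetical raw argument string passed to the skill.
ARGS="--vault work https://github.com/acme-corp/billing-api"
VAULT_NAME=""
if [ "${ARGS#*--vault }" != "$ARGS" ]; then
  # Extract the word after --vault, then strip the flag pair from the arguments.
  VAULT_NAME=$(printf '%s\n' "$ARGS" | sed -E 's/.*--vault +([^ ]+).*/\1/')
  ARGS=$(printf '%s\n' "$ARGS" | sed -E 's/--vault +[^ ]+ *//')
fi
echo "vault=$VAULT_NAME source=$ARGS"
```

The remaining text in ARGS is then treated as the source URL/path for Phase 1.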
Step 2 — Resolve vault path:
If --vault <name> was provided:
Read the vault registry at <base_dir>/../../vaults.json. Find the entry matching the name.
If not found: error — "Vault <name> is not registered. Run /bedrock:vaults to see available vaults."
If found: set VAULT_PATH to the entry's path value. Store the resolved vault name as VAULT_NAME.
If no --vault flag — CWD detection:
Read <base_dir>/../../vaults.json. Check if the current working directory is inside any registered vault path
(CWD starts with a registered vault's absolute path). If multiple match, use the longest path (most specific).
If found: set VAULT_PATH to the matching vault's path. Store its name as VAULT_NAME.
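The longest-path rule can be sketched as a prefix match over registered paths (the paths below are hypothetical; the real ones come from vaults.json):

```shell
# Pick the most specific registered vault path that is a prefix of CWD.
CWD="/home/me/vaults/work/projects/notes"
BEST=""
for p in /home/me/vaults/work /home/me/vaults/work/projects; do
  case "$CWD/" in
    "$p"/*) [ ${#p} -gt ${#BEST} ] && BEST="$p" ;;  # keep the longest match
  esac
done
echo "$BEST"
```

Matching against "$CWD/" (with the trailing slash) avoids a false match on sibling directories like /home/me/vaults/work-archive.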
If CWD detection fails — default vault:
From the registry, find the vault with "default": true.
If found: set VAULT_PATH to the default vault's path. Store its name as VAULT_NAME.
If no resolution:
Error — "No vault resolved. Available vaults:" followed by the registry listing.
"Use --vault <name> to specify, or run /bedrock:setup to register a vault."
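A minimal sketch of the registry lookup, assuming vaults.json holds a top-level "vaults" array with name/path/default fields (the real schema may differ). It prefers jq and falls back to a naive grep/sed scan:

```shell
# Assumed registry shape; the real vaults.json may differ.
REGISTRY=$(mktemp)
cat > "$REGISTRY" <<'EOF'
{"vaults":[{"name":"work","path":"/home/me/vaults/work","default":true},
           {"name":"notes","path":"/home/me/vaults/notes","default":false}]}
EOF
lookup() {  # lookup <name> -> path
  if command -v jq >/dev/null 2>&1; then
    jq -r --arg n "$1" '.vaults[] | select(.name==$n) | .path' "$REGISTRY"
  else
    grep "\"name\":\"$1\"" "$REGISTRY" | sed 's/.*"path":"\([^"]*\)".*/\1/'
  fi
}
VAULT_PATH=$(lookup "notes")
echo "$VAULT_PATH"
```

The default-vault fallback is the same query with select(.default==true) instead of a name match.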
Step 3 — Validate vault path:
test -d "<VAULT_PATH>" && echo "exists" || echo "missing"
If missing: error — "Vault path <VAULT_PATH> does not exist on disk. Run /bedrock:setup to re-register."
Step 4 — Read vault config:
cat <VAULT_PATH>/.bedrock/config.json 2>/dev/null
Extract language and other relevant fields for use in later phases.
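For example, the language field might be read like this (the config fields shown are assumptions based on this phase's description; only "language" is named above):

```shell
# Pull "language" from the vault config, defaulting to "en" when absent.
CONFIG=$(mktemp)
echo '{"language":"pt-BR","timezone":"America/Sao_Paulo"}' > "$CONFIG"
if command -v jq >/dev/null 2>&1; then
  LANGUAGE=$(jq -r '.language // "en"' "$CONFIG")
else
  LANGUAGE=$(sed -n 's/.*"language" *: *"\([^"]*\)".*/\1/p' "$CONFIG")
  [ -n "$LANGUAGE" ] || LANGUAGE="en"
fi
echo "$LANGUAGE"
```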
From this point forward, ALL vault file operations use <VAULT_PATH> as the root.
Key conventions:
- graphify output is merged into the vault's cumulative <VAULT_PATH>/graphify-out/ by /bedrock:preserve
- when delegating to /bedrock:preserve, pass --vault <VAULT_NAME>

This skill receives an external source (URL or local path), fetches its content to a temporary
directory, converts non-markdown files to markdown via docling, runs the /graphify extraction
pipeline on the tmp content, and delegates entity persistence (plus graphify-output merge) to
/bedrock:preserve.
You are a fetcher and orchestrator agent. Your job is to:
- fetch the source content into /tmp
- invoke /graphify to extract a knowledge graph into a per-run temp directory
- delegate persistence to /bedrock:preserve

You do NOT classify entities, create vault files, write to the vault directly, or merge graph state.
All extraction is done by /graphify. All writes (including the graphify-output merge into the
vault's cumulative graphify-out/) are done by /bedrock:preserve.
Follow the phases below in order, without skipping steps.
Before any fetch or conversion, verify that the docling CLI is available. If missing, install
it silently using the same fallback chain /bedrock:setup uses for graphify, emitting a single
status line before proceeding.
if command -v docling >/dev/null 2>&1; then
echo "Phase 0: docling already installed — proceeding."
else
echo "Phase 0: docling not found — installing silently (one-time setup, may take a few minutes for model download)."
# Step 1 — pipx (preferred, isolated)
if command -v pipx >/dev/null 2>&1; then
pipx install docling >/dev/null 2>&1 || true
fi
# Step 2 — pip (fallback if pipx unavailable or failed)
if ! command -v docling >/dev/null 2>&1; then
if command -v pip3 >/dev/null 2>&1; then
pip3 install --user docling >/dev/null 2>&1 || true
elif command -v pip >/dev/null 2>&1; then
pip install --user docling >/dev/null 2>&1 || true
fi
fi
# Final re-probe
if ! command -v docling >/dev/null 2>&1; then
echo "ERROR: docling install failed. Run /bedrock:setup to install it, or install manually: pipx install docling"
exit 1
fi
echo "Phase 0: docling installed."
fi
Failure mode: If install fails (no pipx/pip, network outage, permission denied), abort
the skill with the error above. Do NOT fetch or mutate anything. Direct the user to /bedrock:setup.
No user prompt: this step is silent — one status line on success, one error line on failure.
The user provides an argument. Classify it in the following priority order. URL-type routing is unchanged; local files no longer have an extension allowlist — any existing file is accepted, and Phase 1.5 decides whether to run docling on it.
| Input | Detected type | Fetch method |
|---|---|---|
| URL containing confluence or atlassian.net | confluence | Read skills/confluence-to-markdown/SKILL.md, follow instructions, save output to tmp |
| URL containing docs.google.com | gdoc | Read skills/gdoc-to-markdown/SKILL.md, follow instructions, save output to tmp |
| URL containing github.com | github-repo | git clone --depth 1 to tmp + GitHub MCP enrichment (docling never runs on GitHub repos) |
| URL starting with http:// or https:// (any other) | remote-binary | Download raw bytes to tmp via curl/WebFetch; Phase 1.5 decides conversion |
| Local file path (any existing file) | local-file | Copy to tmp; Phase 1.5 decides conversion |
| Local directory path | local-dir | Copy directory to tmp |
| No match above | manual | Ask the user: "Could not identify the source type. Paste the content or provide a valid URL/path." |
If no argument was provided: ask the user "What source do you want to ingest? Provide a URL (Confluence, Google Docs, GitHub, or any HTTP(S) URL) or a local file path (any file type — docling will convert it to markdown if supported)."
All content is fetched to a temporary directory. This is the single input path for /graphify.
LEARN_TMP="/tmp/bedrock-learn-$(date +%s)"
mkdir -p "$LEARN_TMP"
echo "Temporary directory: $LEARN_TMP"
Store the path for use in subsequent phases.
Execute the fetch strategy for the detected type. All content lands in $LEARN_TMP/.
For GitHub URLs (e.g.: https://github.com/acme-corp/billing-api):
- Extract owner/repo and repo-name from the URL
- git clone --depth 1 <url> "$LEARN_TMP/<repo-name>"
- mcp__plugin_github_github__get_file_contents → read the repo's README.md
- mcp__plugin_github_github__list_commits → last 10 commits
- mcp__plugin_github_github__list_pull_requests → last 5 PRs (state=all, sort=updated)
- Save the enrichment to $LEARN_TMP/<repo-name>/_github_metadata.md

Best-effort: If any MCP call fails, continue with what was obtained. Do NOT block ingestion.
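Deriving owner/repo and repo-name is pure string surgery on the URL path (the URL below is a hypothetical example):

```shell
# Hypothetical GitHub URL; shows how owner/repo and repo-name fall out of the path.
URL="https://github.com/acme-corp/billing-api"
OWNER_REPO="${URL#https://github.com/}"   # strip the host prefix
OWNER_REPO="${OWNER_REPO%.git}"           # tolerate a trailing .git
REPO_NAME="${OWNER_REPO##*/}"             # keep the last path segment
echo "$OWNER_REPO $REPO_NAME"
# Then: git clone --depth 1 "$URL" "$LEARN_TMP/$REPO_NAME"
```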
For Confluence URLs:
- Read <base_dir>/../confluence-to-markdown/SKILL.md and follow its instructions
- Save the output to $LEARN_TMP/<slug>.md
- <slug> is derived from the page title or URL path (kebab-case, lowercase)

If all three layers (MCP, API, browser) are unavailable: warn the user with the guidance message from the fetcher module and abort this source type.
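The kebab-case slug derivation can be sketched as (the title is a hypothetical example; the fetcher skill owns the canonical rule):

```shell
# Lowercase, collapse runs of non-alphanumerics to "-", trim edge dashes.
slugify() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}
slugify "Q3 Payments Migration: Runbook"
```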
For Google Docs or Sheets URLs:
- Read <base_dir>/../gdoc-to-markdown/SKILL.md and follow its instructions
- The fetcher writes /tmp/gdoc_{docId}.md or /tmp/gsheet_{docId}.md; move it to $LEARN_TMP/<slug>.md
- <slug> is derived from the document title or URL path (kebab-case, lowercase)

If all three layers (MCP, API/public export, browser) are unavailable: warn the user with the guidance message from the fetcher module and abort this source type.
For any other HTTP/HTTPS URL, download the raw bytes so docling can operate on binary formats (PDF, DOCX, PPTX, XLSX, images, etc.) that WebFetch cannot return faithfully as text:
Try curl first for true binary fidelity:

curl -fsSL -o "$LEARN_TMP/<filename-derived-from-url>" "<url>"

- <filename-derived-from-url> preserves the URL's basename (including extension) when available; fall back to <slug>.bin if no extension is present.
- If curl is unavailable or the URL returns an HTML page (by Content-Type), fall back to WebFetch and save the response text as $LEARN_TMP/<slug>.md.
- If both attempts fail: warn "Could not fetch URL. Check if the URL is accessible." and abort.
Phase 1.5 decides whether the downloaded file goes through docling, based on the file extension.
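The filename derivation above can be sketched as (the URL is a hypothetical example):

```shell
# Derive a local filename from the URL basename, with the <slug>.bin fallback.
URL="https://example.com/reports/q3-findings.pdf?version=2"
BASE=$(basename "${URL%%\?*}")   # strip the query string, keep the last path segment
case "$BASE" in
  *.*) FILE="$BASE" ;;           # extension present: keep as-is
  *)   FILE="$(printf '%s\n' "$BASE" | tr '[:upper:]' '[:lower:]').bin" ;;  # no extension
esac
echo "$FILE"
```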
For local files:
- Verify the file exists (test -f).
- cp "<local-path>" "$LEARN_TMP/"
No extension-based filtering — any existing file is accepted. Phase 1.5 decides conversion.
For local directories:
rsync -a --exclude='.git' --exclude='node_modules' --exclude='bin' --exclude='obj' \
--exclude='.vs' --exclude='TestResults' --exclude='packages' \
"<local-dir>/" "$LEARN_TMP/$(basename <local-dir>)/"
At the end of this phase, you should have:
- $LEARN_TMP: directory with all fetched content (local path for graphify)
- source_url: original URL or file path provided by the user
- source_type: confluence, gdoc, github-repo, remote-binary, local-file, local-dir, or manual

Report: "Phase 1 complete: Content fetched to $LEARN_TMP. Source type: <source_type>."
For every fetched file in $LEARN_TMP that is not a GitHub repo and is not already markdown
output from Confluence/GDoc fetchers, check whether docling supports the file type and, if so,
convert it to markdown in place. GitHub repos (source_type == "github-repo") skip this phase
entirely and flow straight to graphify.
Docling supports conversion for the following file types (as of the version installed by
Phase 0 / /bedrock:setup). Compare by lowercase file extension:
.pdf .docx .pptx .xlsx
.html .htm
.md .adoc
.png .jpg .jpeg .tiff .bmp
.epub
- .md is listed here because docling passes markdown through largely unchanged. In practice, running docling on .md is a no-op we skip to save time — treat .md as already-markdown.
- .txt and .csv are NOT in docling's supported list (they are plain-text already); skip docling and pass through raw.

For each file under $LEARN_TMP (excluding files inside <repo-name>/ subdirectories of a github-repo source — skip those entirely):
Skip by type — already markdown or plain text: if extension is .md, .txt, or .csv,
leave the file untouched and record status passed-through for the report. Graphify handles
these natively.
Skip by routing — not docling-supported: if the extension is not in the supported list
above AND is not .md/.txt/.csv, leave the file untouched and record status
passed-through with a note (type not supported by docling). Graphify decides what to do
with the raw file.
Run docling: otherwise, invoke docling and replace the source file with the converted
markdown. Docling writes to the working directory by default; use --to md and --output
to target a predictable path:
cd "$LEARN_TMP"
docling --from <auto> --to md --output "$LEARN_TMP" "<relative-file-path>"
Docling produces <stem>.md alongside the source. After a successful run:
- rm "<relative-file-path>" (remove the original)
- record status converted for the report, with the new markdown filename

Failure fallback: if docling exits non-zero for a file:
- If the file is text-native (.md, .txt, or .csv — already handled by rule 1, so this branch is defensive): leave the original file in place, record status failed-fallback (raw passthrough), and continue with other files.
- If the file is binary (.docx, .pdf, etc.): abort the entire skill. Clean up $LEARN_TMP (rm -rf "$LEARN_TMP") and emit a clear error:
  ERROR: docling failed to convert <file>. Aborting ingestion. Temp directory cleaned up.
  Do NOT proceed to graphify or preserve.

At the end of Phase 1.5:
- $LEARN_TMP contains markdown files (either originals or docling-converted).
- Each file carries one status: converted (ran docling successfully), passed-through (skipped docling: markdown/plain text or unsupported type), or failed-fallback (docling failed but the file was text-native; continued with the raw file).

Report: "Phase 1.5 complete: N converted, M passed-through, P failed-fallback."
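The skip/convert routing in this phase can be condensed into one decision function (extension lists copied from Phase 1.5.1; a sketch, not the canonical implementation):

```shell
# Route one file by lowercase extension: pass through, convert, or pass through unsupported.
decide() {
  name=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.md|*.txt|*.csv) echo "passed-through" ;;   # already markdown / plain text
    *.pdf|*.docx|*.pptx|*.xlsx|*.html|*.htm|*.adoc|*.png|*.jpg|*.jpeg|*.tiff|*.bmp|*.epub)
      echo "convert" ;;                          # docling-supported
    *) echo "passed-through (type not supported by docling)" ;;
  esac
}
decide report.DOCX
decide notes.txt
decide diagram.svg
```

The first case wins for .md even though docling technically supports it, matching the "treat .md as already-markdown" rule above.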
Use the Skill tool to invoke /graphify, directing its output to a per-run temp directory
(not the vault). The vault's cumulative graphify-out/ is updated by /bedrock:preserve's
Phase 0 merge step, not by this skill.
/graphify $LEARN_TMP --mode deep --obsidian --obsidian-dir $LEARN_TMP
The convention used here: passing --obsidian-dir $LEARN_TMP makes graphify write its
graphify-out/ tree under $LEARN_TMP/graphify-out/. Store that path as:
GRAPHIFY_OUT_NEW="$LEARN_TMP/graphify-out"
IMPORTANT:
- /graphify runs its full pipeline: detect → extract (AST + semantic) → build → cluster → analyze → obsidian export.
- All output lands in $GRAPHIFY_OUT_NEW, which is inside the temp directory. The vault's <VAULT_PATH>/graphify-out/ is NOT touched by this skill — /bedrock:preserve owns that write.

After /graphify completes, verify the output in the temp location:
if [ -f "$GRAPHIFY_OUT_NEW/graph.json" ] && [ -s "$GRAPHIFY_OUT_NEW/graph.json" ]; then
echo "graphify output verified: graph.json exists and is non-empty"
else
echo "ERROR: $GRAPHIFY_OUT_NEW/graph.json is missing or empty"
fi
If graph.json is missing or empty:
- Abort and clean up: rm -rf "$LEARN_TMP"

The following files should exist in $GRAPHIFY_OUT_NEW:
- graph.json — knowledge graph (nodes, edges, communities)
- GRAPH_REPORT.md — audit report with god nodes, surprising connections
- obsidian/*.md — one markdown file per node
- .graphify_analysis.json — communities, cohesion scores, god nodes

Report: "Phase 2 complete: graphify extraction finished in $GRAPHIFY_OUT_NEW. Graph: N nodes, M edges. Will be merged into the vault by /bedrock:preserve."
actor_context (when applicable)

actor_context tells /preserve that the entire corpus belongs to a single actor in the vault. When set, every file_type=document/paper graphify node is classified as code of that actor with node_type ∈ {concept, decision}, instead of as a global concept/topic/fleeting.
Derivation rules by source_type:
| source_type | actor_context derivation |
|---|---|
| github-repo | Use the cloned repo's repo-name (kebab-case) when an actor with the same slug exists in <VAULT_PATH>/actors/. Otherwise leave actor_context unset and let /preserve use corpus-agnostic classification. |
| local-dir | Same rule as github-repo: use the directory's basename when it matches a vault actor; otherwise leave unset. |
| confluence, gdoc, remote-binary, local-file, manual | Leave actor_context unset. These corpora are not scoped to a single actor by default. |
Multi-actor abort. Before passing actor_context, scan the cloned repo's top-level subdirectories. If 2 or more of those subdirectory names match existing actor slugs in <VAULT_PATH>/actors/, abort with:
"Detected multiple actor candidates in this corpus:
<list>
/learn only accepts a single-actor corpus per invocation. Run /learn separately against each actor, e.g.: /learn <url>/<sub-actor-1> and /learn <url>/<sub-actor-2>. If the repo is a true monorepo and you want a single ingestion, leave actor_context unset by passing --no-actor-context (graphify nodes will be classified globally instead of as code of one actor)."

Do NOT proceed to /preserve.
For non-github-repo/local-dir source types, no multi-actor scan is needed.
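A sketch of the scan, using hypothetical actor and directory names and assuming actors appear as entries under <VAULT_PATH>/actors/ (the real layout may differ, e.g. per-actor .md files):

```shell
# Count corpus top-level subdirectories whose names match registered actor slugs.
VAULT_PATH=$(mktemp -d); CORPUS=$(mktemp -d)
mkdir -p "$VAULT_PATH/actors/billing-api" "$VAULT_PATH/actors/ledger-api"
mkdir -p "$CORPUS/billing-api" "$CORPUS/ledger-api" "$CORPUS/docs"
MATCHES=0
for d in "$CORPUS"/*/; do
  slug=$(basename "$d")
  [ -e "$VAULT_PATH/actors/$slug" ] && MATCHES=$((MATCHES+1))
done
echo "$MATCHES"
```

With two matches ("billing-api" and "ledger-api"), the skill would abort with the multi-actor message above.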
Pass the temp graphify output path, provenance metadata, and (optional) actor_context to /bedrock:preserve. The skill's Phase 0.2 merges this temp output into the vault's cumulative graphify-out/:
graphify_output_path: $GRAPHIFY_OUT_NEW # = $LEARN_TMP/graphify-out/
source_url: <source_url from Phase 1>
source_type: <source_type from Phase 1>
actor_context: <derived in 3.1.1, or omitted>
IMPORTANT:
- /learn does NOT classify graphify nodes into entity types. Entity classification, filtering, matching, and user confirmation are all /bedrock:preserve's responsibility (Phase 1.3). /learn's only contribution is the actor_context hint.
- /learn does NOT merge the graph into the vault. That is /bedrock:preserve's responsibility (Phase 0.2). We pass the per-run temp path; preserve merges and then reads from the merged <VAULT_PATH>/graphify-out/.

Use the Skill tool to invoke /bedrock:preserve --vault <VAULT_NAME> passing the graphify
output reference (pointing at $GRAPHIFY_OUT_NEW) and provenance metadata as the argument.
The --vault <VAULT_NAME> flag ensures preserve writes to the same vault.
/bedrock:preserve returns:
- a graphify_merge block: {nodes_added, nodes_merged, edges_added, stale_flag_set} from preserve's Phase 0.2 merge

Record the result for use in the report (Phase 4).
After /bedrock:preserve confirms completion, remove the temporary directory:
rm -rf "$LEARN_TMP"
echo "Temporary directory cleaned up: $LEARN_TMP"
IMPORTANT: Clean up AFTER /preserve confirms, not after graphify finishes.
The graphify output in graphify-out/ is NOT cleaned up — it lives in the vault
and is used by /bedrock:ask for graph traversal.
Present to the user:
## /bedrock:learn — Report
### Ingested source
- **Type:** <source_type>
- **URL/Path:** <source_url>
### Docling conversion (Phase 1.5)
| File | Status | Notes |
|---|---|---|
| report.docx | converted | output: report.md |
| notes.txt | passed-through | text-native |
| diagram.svg | passed-through | type not supported by docling |
Summary: N converted, M passed-through, P failed-fallback.
(Omit this block entirely for `source_type == "github-repo"` where docling is bypassed.)
### Extraction (via /graphify)
- **Graph:** N nodes, M edges, P communities (fresh run into $LEARN_TMP)
- **Report:** $GRAPHIFY_OUT_NEW/GRAPH_REPORT.md (before merge)
### Graphify merge (via /bedrock:preserve Phase 0.2)
| Metric | Value |
|---|---|
| Nodes added | N |
| Nodes merged | M |
| Edges added | P |
| Analysis marked stale | true / false |
(Pulled verbatim from `/bedrock:preserve`'s `graphify_merge` return block.)
### Entities processed (via /bedrock:preserve)
| Type | Name | Action |
|---|---|---|
| actor | billing-api | update |
| topic | 2026-04-migration-payments | create |
| code | process-transaction | create |
### Provenance
Each entity above received in the `sources` frontmatter field:
- url: <source_url>
- type: <source_type>
- synced_at: <today's date>
### Git
- Commit: <hash from /bedrock:preserve or "no entities">
- Push: success / failed (reason)
### Suggestions
- [list of entities mentioned in the content but not created, if any]
- [recommendations for future re-ingestion, if applicable]
| Rule | Detail |
|---|---|
| Invoke /graphify via Skill tool | NEVER call graphify Python API directly (graphify.detect, graphify.build, graphify.extract, etc.). Always invoke via the Skill tool. |
| All remote content fetched to /tmp | Every input type is fetched to /tmp/bedrock-learn-<ts>/ before invoking graphify. graphify receives only a local path. |
| /learn does NOT classify entities | Entity classification, filtering, matching, and user confirmation are /bedrock:preserve's responsibility. /learn passes the graphify output path and provenance metadata. |
| Delegate to /bedrock:preserve | ALL entities are persisted via /bedrock:preserve — learn does NOT create, update, or write vault entities. |
| /learn does NOT merge graphify output into the vault | Graphify is invoked into $LEARN_TMP/graphify-out/ (per-run temp dir); /bedrock:preserve's Phase 0.2 merges that into <VAULT_PATH>/graphify-out/. /learn never writes directly to the vault's graphify-out/. |
| Docling auto-install is silent | Phase 0 auto-installs docling if missing with a single status line — no user prompt. Fail the skill if install fails; direct the user to /bedrock:setup. |
| Docling skipped for GitHub repos | source_type == "github-repo" skips Phase 1.5 entirely — cloned repos flow straight to graphify. |
| Docling routing rule | Run docling on files with docling-supported extensions (see Phase 1.5.1). Pass-through for .md/.txt/.csv and for extensions not in docling's supported list. |
| Docling failure fallback | On docling non-zero exit: if file is .md/.txt/.csv, continue with raw file. For any other extension, abort the entire skill and clean up $LEARN_TMP. |
| Cleanup /tmp after /preserve confirms | Remove /tmp/bedrock-learn-<ts>/ only after /preserve confirms completion, not after graphify finishes. |
| Provenance via source_url | ALWAYS include source_url and source_type when delegating to /bedrock:preserve. |
| Internal fetcher skills | Read internal skills from <base_dir>/../confluence-to-markdown/SKILL.md and <base_dir>/../gdoc-to-markdown/SKILL.md for content fetching. Never invoke external skills. |
| Best-effort for external sources | If MCP or fetch fails, warn and continue with what was obtained. Never block ingestion. |
| MCP in main context | Do NOT use subagents for GitHub/Atlassian MCP calls — permissions are not inherited. |
| Maximum 2 push attempts | After that, abort and inform (handled by /preserve). |
| Sensitive data | NEVER include credentials, tokens, passwords, PANs, CVVs. |
| Vault resolution first | Resolve VAULT_PATH before any file operation — never assume CWD is the vault |
| Pass --vault to /preserve | ALWAYS include --vault <VAULT_NAME> when delegating to /bedrock:preserve |
| Derive actor_context for actor corpora | For source_type ∈ {github-repo, local-dir}, when the repo/dir basename matches an existing vault actor slug, pass actor_context: <slug> to /preserve. For other source types, leave actor_context unset. |
| Multi-actor abort | Before passing actor_context, scan top-level subdirectories of the cloned repo. If 2+ subdirectories match existing actor slugs in <VAULT_PATH>/actors/, abort with guidance to split the invocation. Never auto-partition. |