Help us improve
Share bugs, ideas, or general feedback.
From obsidian-vault-agent
Extracts knowledge from source notes (papers, posts, books, lectures) into a vault's permanent knowledge base using evidence-based learning techniques.
npx claudepluginhub tuan3w/obsidian-vault-agent --plugin obsidian-vault-agentHow this skill is triggered — by the user, by Claude, or both
Slash command
/obsidian-vault-agent:process (no args — reads current note context from user)(no args — reads current note context from user)This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
<Purpose>
Builds Wikipedia-style Obsidian vaults from academic PDFs, extracting concepts into linked notes with atomic sentences and citations. Expands existing networks with new papers.
Searches an Obsidian vault exhaustively on a topic, surfaces non-obvious patterns and contradictions, and challenges the user to generate original synthesis.
Autonomously extracts atomic concepts from EPUB books into Obsidian notes, validates against NotebookLM via autoresearch loop, and extends knowledge. Use for mastering book content.
Share bugs, ideas, or general feedback.
This skill is INTERACTIVE by design. The desirable-difficulty questions in Stage 3 are not optional ceremony — the friction IS the learning. The user's answers shape what gets extracted and how.
<Use_When>
<Do_Not_Use_When>
Ask the user which note to process if it's not clear from context. Then read it.
Verify the note type is a source type: paper, post, book, lecture,
course, or a Clipping. If the note is a Term, Note, or Thought, stop and
explain why this workflow doesn't apply.
Read the frontmatter and body:
mcp__obsidian-vault__read_note(path="path/to/note.md")
Or if MCP is unavailable:
Read("/absolute/path/to/note.md")
Update frontmatter to signal active work:
mcp__obsidian-vault__update_frontmatter(
path="path/to/note.md",
updates={"processing_status": "processing", "updated_date": "YYYY-MM-DD"}
)
Do NOT skip this stage. Present these three questions and wait for the user's answers before proceeding:
Q1. In one sentence — what is the single most important idea in this note? (Don't look at the note. Reconstruct it from memory. That struggle is the point.)
Q2. How does this connect to something you already know? Name a specific vault note, or a concept from a completely different domain.
Q3. What surprised you, or pushed back against something you already believed?
Wait for the user's responses. Their answers determine:
If the user skips a question or gives a short answer, acknowledge it and move on — don't block the workflow.
Search the vault thoroughly for concepts mentioned in the note. Use multiple strategies:
By keyword (MCP preferred):
mcp__obsidian-vault__search_notes(query="CONCEPT NAME", limit=10)
By keyword (Grep fallback):
Grep(pattern="concept name", path="notes/", glob="*.md", output_mode="files_with_matches", head_limit=15)
By type — check all Term notes in relevant domain:
Grep(pattern="type: term", path="notes/ml/", glob="*.md", output_mode="files_with_matches")
For each key concept extracted from the note (aim for 5–12 concepts), classify it:
| Classification | Meaning | Action |
|---|---|---|
| EXISTING | Term/Note already in vault | Add [[wikilink]] to source note |
| NEW TERM | Atomic concept, one clear definition | Extract as new (Term) note |
| NEW NOTE | Multi-facet idea, needs more depth | Extract as new (Note) note |
| SKIP | Tangential, not reusable | Don't extract |
Search ACROSS domain folders — a psychology concept may already exist under a different framing in notes/startup/ or notes/design/. Cross-domain matches are the most valuable finds.
Present the plan clearly before creating anything:
## Existing concepts — add [[wikilinks]]
- [[(Term) Attention Mechanism]] — mention it in the Architecture section
- [[(Paper) Attention Is All You Need]] — link in Related links
## New concepts to extract
### 1. (Term) Flash Attention
- Type: Term
- Folder: notes/ml/
- Core idea: memory-efficient attention that avoids materializing the full N×N matrix
- Why it matters: enables training on longer sequences with same GPU budget
- Tags: #ml #attention #efficiency #MM-YYYY
### 2. (Note) IO-Aware Algorithm Design
- Type: Note (too broad for a Term)
- Folder: notes/computer science/
- Core idea: design algorithms around memory bandwidth, not just FLOPs
- Cross-domain angle: same principle appears in database query planning
- Tags: #systems #algorithms #MM-YYYY
## Questions this note raises
- Why doesn't standard PyTorch implement this by default?
- → Could become a (Question) note
Ask the user: "Which concepts should I create? (say 'all', list numbers, or adjust the plan)"
For each approved concept, create the note. Use this exact structure:
For Term notes:
---
id: YYYYMMDDHHMMSS
type: term
processing_status: processed
created_date: YYYY-MM-DD
updated_date: YYYY-MM-DD
---
# Concept Name [ ](#anki-card)
- **🏷️Tags** : #term #all-anki #topic-tag #MM-YYYY
[One sentence that captures the core intuition — lead with WHY it matters, not just what it is]
- [Bullet 1: concrete mechanism or definition]
- [Bullet 2: why it matters, what problem it solves]
- [Bullet 3: concrete example — make it visualizable]
- [Bullet 4: cross-domain connection or counterintuitive angle]
## Related
- Source: [[(Type) Source Note Title]]
- [[(Term) Related Concept]]
For Note entries (richer, multi-section):
---
id: YYYYMMDDHHMMSS
type: note
processing_status: processed
created_date: YYYY-MM-DD
updated_date: YYYY-MM-DD
---
# Concept Name
- **🏷️Tags** : #note #topic-tag #MM-YYYY
[ ](#anki-card)
[Lead insight — the one thing someone should remember]
## [Section 1]
- [bullets]
## [Section 2]
- [bullets]
## Related links
- Source: [[(Type) Source Note Title]]
- [[(Term) Related Concept]]
Generate the timestamp ID for each note:
date +%Y%m%d%H%M%S
Place notes in the appropriate subfolder under notes/. Match the domain:
notes/ml/notes/paper/ (or stay in the source's folder)notes/psychology/notes/startup/notes/computer science/notes/thoughts/Quality gate before creating each note:
After creating new notes, update the source note:
Add [[wikilinks]] inline where concepts are first mentioned — use the full
(Type) Title prefix that matches the created filename exactly:
[[(Term) Flash Attention]] not [[Flash Attention]][[(Note) IO-Aware Algorithm Design]] not [[IO-Aware]]Update frontmatter:
mcp__obsidian-vault__update_frontmatter(
path="path/to/source-note.md",
updates={"processing_status": "processed", "updated_date": "YYYY-MM-DD"}
)
Done. Created 3 notes, updated 2 wikilinks.
New notes:
- notes/ml/(Term) Flash Attention.md
- notes/computer science/(Note) IO-Aware Algorithm Design.md
- notes/ml/(Term) Tiling (GPU).md
Added wikilinks to source note for 5 concepts (2 existing, 3 new).
Cross-domain connection worth exploring:
- [[(Note) IO-Aware Algorithm Design]] ↔ database query planning (no vault note yet)
User: "process this paper on Flash Attention"
1. Read note → confirmed type: paper, status: inbox
2. Mark → processing_status: processing
3. Questions → user answers: "main idea is recomputing activations to save memory",
connects to "(Note) Memory Hierarchy", surprised that IO is the bottleneck not compute
4. Scan vault → finds (Term) Attention Mechanism (existing), (Term) Softmax (existing),
no term for Flash Attention, no term for tiling
5. Plan → 3 new terms, 2 existing links, 1 question note candidate
6. Create → (Term) Flash Attention, (Term) Kernel Fusion, (Note) IO-Aware Algorithm Design
7. Update → adds 5 wikilinks to source note, marks processed
User: "what can I learn from this blog post?" (post about Stripe's pricing strategy)
1. Read → type: post, about willingness-to-pay segmentation
2. Questions → user: "main idea is pricing tiers create self-selection",
connects to "(Term) Price Discrimination" already in vault
3. Scan vault → finds (Term) Price Discrimination, (Term) Bundling,
nothing on willingness-to-pay metering
4. Plan → 1 new term (Metered Pricing), link to 2 existing terms
5. Create → (Term) Metered Pricing with cross-domain link to SaaS AND utility billing
6. Update → processed, 3 wikilinks added
- Skipping Stage 3 questions and going straight to extraction — violates the
generation effect principle; the user misses the retrieval practice
- Creating terms that are too broad: "(Term) Machine Learning" is not atomic
- Wikilinks without (Type) prefix: [[Flash Attention]] instead of [[(Term) Flash Attention]]
- Terms that transcribe the paper instead of synthesizing: quoting abstract bullets verbatim
- Leading with definition instead of insight: "Flash Attention is an algorithm that..."
instead of "Attention's real bottleneck is memory bandwidth, not FLOPs — Flash Attention
fixes this by never materializing the full N×N matrix"
<Escalation_And_Stop_Conditions>
$ARGUMENTS