Skill

book-analyzer

Analyzes EPUB/PDF books into structured chapter notes with key concepts extracted and synthesized via parallel agents.

npx claudepluginhub tuan3w/obsidian-vault-agent --plugin obsidian-vault-agent

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/obsidian-vault-agent:book-analyzer [path/to/book.epub or book.pdf]

User invocable

Model invocation disabled

Inline context

Default effort

Argument hint: [path/to/book.epub or book.pdf]

Tool Access

This skill is limited to the following tools:

Bash, Read, Write, Edit, Grep, Glob, Agent, TodoWrite

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Supporting Files

README.md
agents/book-synthesizer.md
agents/chapter-analyst.md
agents/concept-extractor.md
agents/section-writer.md
examples/sample-output.md
references/extraction-guide.md
references/note-templates.md
scripts/extract_epub.py
scripts/extract_pdf.sh

SKILL.md

437 lines · ~4.5k tokens

Similar Skills

self-learn

Autonomously extracts atomic concepts from EPUB books into Obsidian notes, validates against NotebookLM via autoresearch loop, and extends knowledge. Use for mastering book content.

9 files

brain-os

process

Extracts knowledge from source notes (papers, posts, books, lectures) into a vault's permanent knowledge base using evidence-based learning techniques.

11 tools

obsidian-vault-agent

ea-obsidian-world-builder

Builds Wikipedia-style Obsidian vaults from academic PDFs, extracting concepts into linked notes with atomic sentences and citations. Expands existing networks with new papers.

1 file

academicskills

Stats

Language: Python

Stars: 22

Forks: 3

Maintenance: Excellent

Last Commit: Mar 30, 2026

Fully autonomous analysis of book files (EPUB/PDF) into structured notes. Extracts text, detects chapters, analyzes each chapter via parallel agents, synthesizes across chapters, and optionally integrates into an Obsidian vault with wikilinks and extracted Term notes.

<Use_When>

User provides a path to an EPUB or PDF book file
User says "analyze this book", "process this book", "read this book"
User wants structured notes from a book-length document (20+ pages)

</Use_When>

<Do_Not_Use_When>

Short documents under 20 pages — use /paper or /process instead
User wants to read or search specific sections — use the Read tool directly
User wants EPUB output from markdown — wrong direction
URL-only input with no file on disk — use /research instead

</Do_Not_Use_When>

<Why_This_Exists> Book notes that merely transcribe content are useless — the value is in synthesis. Most book notes in a vault end up sparse because processing a full book manually is exhausting. This skill automates the mechanical work (extraction, splitting, formatting) while delegating the intellectual work (synthesis, assessment, connections) to specialized agents with strict quality constraints. </Why_This_Exists>

<Execution_Policy>

Run extraction scripts first, verify output before proceeding to analysis
Delegate chapter analysis to parallel agents (one per chapter, all simultaneous)
Use Sonnet for chapter analysis, Opus for cross-book synthesis
Vault integration is conditional — detect .obsidian/ or CLAUDE.md in working directory
Report progress via TodoWrite at each stage transition
If extraction fails, stop and report — don't guess at content
Total chapters capped at 50 — if more, group adjacent chapters into batches

</Execution_Policy>

Stage 1: PARSE ARGUMENTS AND EXTRACT

Parse $ARGUMENTS to get the book file path and optional flags.

Argument format:

$ARGUMENTS = "path/to/book.epub"                    # basic
$ARGUMENTS = "path/to/book.pdf --no-terms"          # skip term extraction
$ARGUMENTS = "path/to/book.epub --output ~/Desktop/" # custom output location

Parse logic:

Split $ARGUMENTS by spaces, treating quoted paths as single tokens
First non-flag token = file path (REQUIRED — stop if missing)
Optional flags: --no-terms (skip concept extraction), --output <dir> (custom output)
Detect format from file extension: .epub, .pdf, .mobi
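The parse logic above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual parser: the function name `parse_arguments`, the returned dictionary keys, and the choice to reject unknown flags are all assumptions made for the example.

```python
import shlex
from pathlib import Path

SUPPORTED = {".epub", ".pdf", ".mobi"}

def parse_arguments(raw: str) -> dict:
    """Split the raw argument string and pull out the path and flags."""
    tokens = shlex.split(raw)  # keeps quoted paths together as one token
    opts = {"path": None, "no_terms": False, "output": None}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--no-terms":
            opts["no_terms"] = True
        elif tok == "--output":
            if i + 1 >= len(tokens):
                raise ValueError("--output requires a directory")
            opts["output"] = tokens[i + 1]
            i += 1  # skip the directory we just consumed
        elif tok.startswith("--"):
            # Assumption: unknown flags are an error rather than ignored
            raise ValueError(f"unknown flag: {tok}")
        elif opts["path"] is None:
            opts["path"] = tok  # first non-flag token is the book path
        i += 1
    if opts["path"] is None:
        raise ValueError("book file path is required")
    ext = Path(opts["path"]).suffix.lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported format: {ext or '(none)'}")
    opts["format"] = ext.lstrip(".")
    return opts
```

Note that `shlex.split` does not expand `~`, so a value such as `~/Desktop/` would still need expanding (for example with `Path.expanduser`) before use.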

Run extraction based on format:

For EPUB:

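# Fallback: directory of the current script (may be meaningless when run as a skill)
SKILL_DIR="$(dirname "$(readlink -f "$0" 2>/dev/null || echo "$0")")"
# Prefer a known skill location that actually contains the extraction script
for dir in .claude/skills/book-analyzer ~/.claude/skills/book-analyzer; do
  if [ -f "$dir/scripts/extract_epub.py" ]; then SKILL_DIR="$dir"; break; fi
done
# Timestamped work directory, relative to the current directory
WORK_DIR=".book-work-$(date +%s)"
# INPUT_PATH is a placeholder for the parsed book path
python3 "$SKILL_DIR/scripts/extract_epub.py" "INPUT_PATH" "$WORK_DIR"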

For PDF:

# Assumes SKILL_DIR was resolved as in the EPUB steps; INPUT_PATH is a placeholder
WORK_DIR=".book-work-$(date +%s)"
"$SKILL_DIR/scripts/extract_pdf.sh" "INPUT_PATH" "$WORK_DIR"

For MOBI: Convert to EPUB first with ebook-convert (calibre), then run EPUB extraction.

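The work directory is created relative to the current directory, which is the vault root when the skill runs inside a vault (e.g., .book-work-1709398200/). Delete it only after the pipeline completes and Stage 4.6 verification passes.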

After extraction: Read $WORK_DIR/metadata.json to verify success.
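As a rough sketch, the post-extraction check might look like the following. Only the `metadata.json` filename and the `needs_fallback` key come from this document; the helper names and the treatment of a missing key as "no fallback needed" are assumptions.

```python
import json
from pathlib import Path

def load_metadata(work_dir: str) -> dict:
    """Load metadata.json from the work directory, failing loudly if absent."""
    meta_path = Path(work_dir) / "metadata.json"
    if not meta_path.is_file():
        raise FileNotFoundError(f"extraction produced no metadata: {meta_path}")
    with meta_path.open(encoding="utf-8") as fh:
        return json.load(fh)

def needs_fallback(meta: dict) -> bool:
    """True when the extractor flagged the PDF for the Read-tool fallback."""
    # Assumption: a missing key means the extraction was usable
    return bool(meta.get("needs_fallback", False))
```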

PDF fallback: If metadata shows "needs_fallback": true, use the Claude Read tool instead:

Read the PDF 20 pages at a time: Read(file_path="INPUT_PATH", pages="1-20"), then pages="21-40", etc.
Write each batch to $WORK_DIR/pages/page_batch_NNN.txt
Update metadata with actual content
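The 20-page batching can be planned up front with a small helper like the one below. It is a sketch under two assumptions: the total page count is already known (for instance from the metadata), and `NNN` in the batch filename means a zero-padded three-digit batch number. The function name `page_batches` is hypothetical.

```python
def page_batches(total_pages: int, batch_size: int = 20):
    """Yield (batch_number, pages_argument, output_filename) for each batch."""
    for n, start in enumerate(range(1, total_pages + 1, batch_size), start=1):
        end = min(start + batch_size - 1, total_pages)  # last batch may be short
        yield n, f"{start}-{end}", f"page_batch_{n:03d}.txt"
```

For a 45-page PDF this yields the page ranges `1-20`, `21-40`, and `41-45`, each paired with its batch file name.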

Stage 2: STRUCTURE

Read metadata.json from the extraction output.

For EPUBs: Chapter structure comes directly from extraction (H1/H2 splitting). Proceed to Stage 3.

For PDFs: Chapter detection depends on extraction quality.

If pages have clear chapter headings, group pages into chapters
If no clear structure: Launch a single agent to read the first 5 pages and detect chapter boundaries:

Agent(
  subagent_type="general-purpose",
  model="haiku",
  prompt="Read the following text from the first 5 pages of a book. Identify chapter
  boundaries and return a JSON array of {title, start_page, end_page} objects.
  If no chapters are detectable, return [{title: 'Full Text', start_page: 1, end_page: LAST}].

  TEXT:
  [first 5 pages content]"
)

Update metadata with detected chapter list
Group page files into chapter files by concatenation

Result: A chapters/ directory with one file per chapter, and updated metadata.json.
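Grouping pages into chapters by concatenation could be sketched as below. The `{title, start_page, end_page}` shape comes from the structure-detection prompt; the in-memory `pages` mapping, the blank-line separator, and the function name are assumptions for illustration.

```python
def group_pages(pages: dict, chapters: list) -> list:
    """Concatenate page texts into (title, chapter_text) pairs.

    pages maps a 1-based page number to that page's text.
    chapters is a list of {"title", "start_page", "end_page"} dicts.
    """
    grouped = []
    for ch in chapters:
        wanted = range(ch["start_page"], ch["end_page"] + 1)  # inclusive range
        texts = [pages[p] for p in wanted if p in pages]  # skip missing pages
        grouped.append((ch["title"], "\n\n".join(texts)))
    return grouped
```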

Stage 3: ANALYZE (parallel, file-based)

Read the agent definition from agents/chapter-analyst.md in the skill directory.

Create analyses directory:

mkdir -p "$WORK_DIR/analyses"

For EACH chapter, launch a parallel agent that WRITES ITS OUTPUT TO A FILE:

Agent(
  subagent_type="general-purpose",
  model="sonnet",
  run_in_background=true,
  prompt="You are Chapter Analyst. Follow these instructions exactly:

  [INSERT FULL CONTENT OF agents/chapter-analyst.md HERE]

  BOOK CONTEXT:
  - Title: {title}
  - Author: {author}
  - This is Chapter {N} of {total_chapters}
  - Table of Contents: {chapter_list summary}

  CHAPTER FILE TO READ: {path to chapter .md file}
  OUTPUT_FILE: $WORK_DIR/analyses/ch{NN}_analysis.md

  CRITICAL: Read the chapter file, produce your analysis, then WRITE it to OUTPUT_FILE using the Write tool. Be detailed — include all interesting quotes, concrete examples, and anecdotes. The master agent will read your file later."
)

Parallel execution: Launch ALL chapter agents simultaneously with run_in_background=true. Wait for all to complete. Verify all analysis files exist in $WORK_DIR/analyses/.
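One way to verify that every analysis file exists is sketched here. It assumes one agent per chapter, chapters numbered from 1, and `{NN}` meaning a zero-padded two-digit number; `missing_analyses` is a hypothetical helper name.

```python
from pathlib import Path

def missing_analyses(work_dir: str, total_chapters: int) -> list:
    """Return the expected analysis filenames that are not on disk."""
    analyses = Path(work_dir) / "analyses"
    expected = [f"ch{n:02d}_analysis.md" for n in range(1, total_chapters + 1)]
    return [name for name in expected if not (analyses / name).is_file()]
```

An empty list means every chapter agent wrote its file; otherwise the returned names identify which chapters to report or retry.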

Why file-based: Agent return messages get truncated by context compression. Writing to files preserves full detail — every quote, every example, every nuance. The master agent reads these files directly.

Chapter cap: If more than 50 chapters, batch adjacent chapters (2-3 per agent) to stay within limits.
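A possible way to apply the cap is to pick the smallest group size that keeps the agent count at or below 50. This is only a sketch: the document specifies 2-3 chapters per agent, whereas this helper (hypothetical name `batch_chapters`) would produce larger groups for books with more than 150 chapters.

```python
import math

def batch_chapters(chapters: list, max_agents: int = 50) -> list:
    """Group adjacent chapters so that at most max_agents groups result."""
    n = len(chapters)
    if n <= max_agents:
        return [[c] for c in chapters]  # one chapter per agent
    size = math.ceil(n / max_agents)  # smallest group size that fits the cap
    return [chapters[i:i + size] for i in range(0, n, size)]
```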

Result: One analysis file per chapter in $WORK_DIR/analyses/, each containing detailed notes with quotes.

Stage 4: SYNTHESIZE + SECTION WRITE (parallel)

This stage runs the cross-book synthesizer AND section writers ALL IN PARALLEL.

Batching formula: Divide chapters into batches of ~5 chapters each. M = min(max(ceil(N/5), 1), 10) where N = total number of chapters.
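The formula translates directly into code. In the sketch below, `section_writer_count` implements M as stated; `assign_batches` is an assumed way of spreading chapters as evenly as possible across the M writers, since the document only says "~5 chapters each".

```python
import math

def section_writer_count(n_chapters: int) -> int:
    """M = min(max(ceil(N/5), 1), 10)."""
    return min(max(math.ceil(n_chapters / 5), 1), 10)

def assign_batches(n_chapters: int) -> list:
    """Split chapter numbers 1..N into M contiguous, near-equal batches."""
    m = section_writer_count(n_chapters)
    base, extra = divmod(n_chapters, m)
    batches, start = [], 1
    for i in range(m):
        size = base + (1 if i < extra else 0)  # first `extra` batches get one more
        batches.append(list(range(start, start + size)))
        start += size
    return batches
```

For N = 20 this gives M = 4 with batches of chapters 1-5, 6-10, 11-15, and 16-20. Once N exceeds 50 the cap of 10 writers applies, so each batch grows beyond 5 chapters.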

Create sections directory:

mkdir -p "$WORK_DIR/sections"

Read agent definitions from agents/book-synthesizer.md and agents/section-writer.md.

Launch ALL of the following simultaneously (one message, all with run_in_background=true):

4a. Book Synthesizer (Opus)

Agent(
  subagent_type="general-purpose",
  model="opus",
  run_in_background=true,
  prompt="You are Book Synthesizer. Follow these instructions exactly:

  [INSERT FULL CONTENT OF agents/book-synthesizer.md HERE]

  BOOK METADATA:
  - Title: {title}
  - Author: {author}
  - Total chapters: {N}

  ANALYSIS FILES DIRECTORY: $WORK_DIR/analyses/
  Read ALL .md files in this directory.

  OUTPUT_FILE: $WORK_DIR/synthesis.md
  Write your synthesis to this file.

  IMPORTANT: Per-chapter detail is handled by section writer agents.
  Your job is ONLY cross-chapter patterns: Core Thesis, Deepest Insights,
  Chapter Map, Critical Assessment, Cross-Domain Connections."
)

4b. Section Writers (Sonnet, one per batch)

For each batch of ~5 chapters:

Agent(
  subagent_type="general-purpose",
  model="sonnet",
  run_in_background=true,
  prompt="You are Section Writer. Follow these instructions exactly:

  [INSERT FULL CONTENT OF agents/section-writer.md HERE]

  BOOK CONTEXT:
  - Title: {title}
  - Author: {author}

  YOUR ASSIGNED CHAPTER ANALYSES (read ALL of these):
  - $WORK_DIR/analyses/ch{NN}_analysis.md
  - $WORK_DIR/analyses/ch{NN}_analysis.md
  - ... (list all files in this batch)

  OUTPUT_FILE: $WORK_DIR/sections/part_{MM}.md

  CRITICAL: Read each chapter analysis file. Format ALL their content into
  vault-ready markdown preserving every key idea, every quote (in > blockquote
  format), every example, and every anecdote. Your job is FORMATTING, not
  summarizing. Write output to OUTPUT_FILE."
)

Example for a 20-chapter book (4 section writers):

Section Writer 1: ch01-ch05 → sections/part_01.md
Section Writer 2: ch06-ch10 → sections/part_02.md
Section Writer 3: ch11-ch15 → sections/part_03.md
Section Writer 4: ch16-ch20 + postscript → sections/part_04.md

Result: synthesis.md + sections/part_01.md through sections/part_MM.md, all written in parallel.

Stage 4.5: ASSEMBLE FINAL NOTE (lightweight concatenation)

The master agent reads all section files and the synthesis, then assembles the final note.

This step is mostly CONCATENATION, not generation. The section writers already produced vault-ready markdown. The master agent's job is:

Read $WORK_DIR/synthesis.md
Read ALL $WORK_DIR/sections/part_*.md files in order
Assemble the final note structure:
- Core Thesis (from synthesis)
- Deepest Insights (from synthesis)
- How the Argument Builds / Chapter Map (from synthesis)
- ---
- Chapter Notes: COPY section files VERBATIM — do NOT summarize or compress
- ---
- Critical Assessment (from synthesis)
- Cross-Domain Connections (from synthesis)
- Questions: extract from  HTML comments in section files, deduplicate, add synthesis questions
- Related Links: wikilinks to related vault notes
Extract key terms from  HTML comments for Stage 5 concept extraction

CRITICAL RULE: Chapter note content from section files must be copied VERBATIM into the final note. The assembly agent must NOT summarize, compress, or reduce the section content. If a section file has 15 bullets for a chapter, the final note has 15 bullets for that chapter.
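Because the Chapter Notes portion is concatenation rather than generation, it can be sketched as a plain file join. The helper below is illustrative only: it assumes the `part_{MM}` numbers are zero-padded so that lexicographic order equals batch order, and it trims trailing newlines before joining with a blank line while leaving everything else untouched.

```python
from pathlib import Path

def read_sections_in_order(work_dir: str) -> str:
    """Concatenate sections/part_*.md in filename order, content unchanged."""
    parts = sorted((Path(work_dir) / "sections").glob("part_*.md"))
    bodies = [p.read_text(encoding="utf-8").rstrip("\n") for p in parts]
    return "\n\n".join(bodies)
```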

Stage 4.6: VERIFY FINAL NOTE (quality gate)

Before proceeding to integration or cleanup, verify the assembled note:

Completeness check: Every chapter should have a ### Ch heading in the final note. Count the headings and compare to total chapters from metadata.
Quote format check: All quotes should use > blockquote syntax. Look for " patterns that aren't inside blockquotes.
Detail preservation: The Chapter Notes section should be substantially longer than the synthesis sections. If the chapter notes are shorter than the synthesis, something went wrong — the section content was compressed.
Section presence: Verify these sections exist: Core Thesis, Deepest Insights, Chapter Notes, Critical Assessment, Cross-Domain Connections, Questions.
Length sanity: Expect ~300-500 words per chapter analyzed. A 20-chapter book should produce ~6,000-10,000 words of chapter notes.

If any check fails, fix the issue using Edit before proceeding. Do NOT delete the work directory until verification passes.
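Several of these checks are mechanical enough to script. The sketch below covers the completeness, section-presence, and quote-format checks. It assumes that required sections appear as markdown headings of any level and that no section heading itself starts with `### Ch`; the quote check is deliberately crude and will flag any line with a double quote outside a blockquote.

```python
import re

REQUIRED_SECTIONS = [
    "Core Thesis", "Deepest Insights", "Chapter Notes",
    "Critical Assessment", "Cross-Domain Connections", "Questions",
]

def verify_note(text: str, total_chapters: int) -> list:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    found = len(re.findall(r"^### Ch", text, flags=re.MULTILINE))
    if found != total_chapters:
        problems.append(f"expected {total_chapters} chapter headings, found {found}")
    for name in REQUIRED_SECTIONS:
        if not re.search(rf"^#+\s+{re.escape(name)}", text, flags=re.MULTILINE):
            problems.append(f"missing section: {name}")
    for number, line in enumerate(text.splitlines(), start=1):
        if '"' in line and not line.lstrip().startswith(">"):
            problems.append(f"line {number}: quotation mark outside a blockquote")
    return problems
```

The detail-preservation and length checks are omitted here; they would need word counts per section, which depend on the final note layout.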

Stage 5: INTEGRATE

Detect vault context: Check if .obsidian/ directory or CLAUDE.md file exists in the current working directory.
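The detection rule amounts to two filesystem checks. A minimal sketch, with `in_vault_context` as a hypothetical helper name:

```python
from pathlib import Path

def in_vault_context(cwd: str = ".") -> bool:
    """True if the directory holds an .obsidian/ folder or a CLAUDE.md file."""
    root = Path(cwd)
    return (root / ".obsidian").is_dir() or (root / "CLAUDE.md").is_file()
```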

If in vault context (Obsidian):

Create book note via create-note.py:
```
python3 .claude/scripts/create-note.py book "{Title}" author="{Author}" year={YEAR}
```
Script outputs the created file path. The template handles frontmatter (id, dates, type) automatically.
Fill the book note body using Edit tool on the created file:
- Add processing_status: inbox to frontmatter
- Compose body sections: Core Thesis → Key Themes → Chapter Notes → Critical Assessment → Cross-Domain Connections → Questions
- Add topic tags to the 🏷️Tags line (e.g., #psychology, #startup)
- Use [[short-form wikilinks]] for all internal references

Extract concepts (unless --no-terms flag): Read agent definition from agents/concept-extractor.md, then:

Agent(
  subagent_type="general-purpose",
  model="sonnet",
  prompt="You are Concept Extractor. Follow these instructions exactly:

  [INSERT FULL CONTENT OF agents/concept-extractor.md HERE]

  VAULT ROOT: [current working directory]
  BOOK NOTE TITLE: {Title}

  ALL KEY TERMS FROM CHAPTER ANALYSES:
  [INSERT COLLECTED KEY TERMS HERE]

  Use create-note.py to create new Term notes:
    python3 .claude/scripts/create-note.py term 'Term Name' processing_status=processed
  Then Edit the created file to fill in definition, example, tags, and links.

  Search the vault for existing terms using Grep and Glob.
  Report what was found, created, and skipped."
)

Report results:
- Book note wikilink: [[{Title}]]
- Number of chapters analyzed
- Number of terms extracted (existing vs new)
- Any issues encountered

If NOT in vault context (plain markdown):

Assemble markdown file using the format from references/note-templates.md (Section 2)
Write to ./book-analysis-{slug}.md (or --output dir if specified)
Skip term extraction (no vault to search)
Report output file location

<Tool_Usage>

Bash: Run extraction scripts, run create-note.py, check tool availability
Read: Read extracted chapters, metadata.json, agent definitions, PDF pages (fallback)
Edit/Write: Fill note body after create-note.py creates the skeleton
Agent: Delegate to chapter-analyst (sonnet, parallel), book-synthesizer (opus, parallel), section-writer (sonnet, parallel), concept-extractor (sonnet)
Grep/Glob: Search vault for existing terms (used by concept-extractor agent)
TodoWrite: Report progress at each stage

Agent delegation pattern:

Read the agent .md file from agents/ directory
Include agent instructions verbatim in the prompt
Append book-specific context (metadata, chapter text, etc.)
For chapter analysis: run_in_background=true (parallel)
For synthesis + section writers: run_in_background=true (ALL launched in parallel in one message)
For concept extraction: run_in_background=false (sequential, after assembly)

Work directory cleanup: Do NOT delete $WORK_DIR until Stage 4.6 verification passes. If verification fails, the work directory is needed for debugging and fixing.

Finding the skill directory: The skill lives at .claude/skills/book-analyzer/ (project-level). To find it reliably:

SKILL_DIR=".claude/skills/book-analyzer"

Read agent definitions with: Read(file_path="$SKILL_DIR/agents/chapter-analyst.md")

</Tool_Usage>

User: "/book-analyzer ~/Books/thinking-fast-and-slow.epub"
- Detects EPUB format, runs extract_epub.py
- Finds 38 chapters via H1 splitting
- Launches 38 parallel chapter-analyst agents (sonnet)
- Collects all analyses, launches book-synthesizer (opus)
- Detects .obsidian/ → vault mode
- Creates book note at notes/books/(Book) Thinking, Fast and Slow.md
- Extracts 12 concepts, finds 4 existing in vault, creates 8 new Term notes
- Reports: "Created book note with 38 chapters, 8 new terms, 4 linked existing terms"

User: "/book-analyzer paper.pdf --no-terms"
- Detects PDF, runs extract_pdf.sh
- Quality OK (no fallback needed), 45 pages detected
- No clear chapters → structure agent groups into 6 logical sections
- 6 chapter-analyst agents + 1 book-synthesizer
- --no-terms flag → skip concept extraction
- Creates book note, reports results

User: "/book-analyzer notes/short-article.pdf"
- PDF is only 8 pages — this is not a book
- Should suggest: "This document is only 8 pages. Consider using /paper for academic papers or /process for existing notes instead."

User: "/book-analyzer https://example.com/book.pdf"
- URL, not a file path — no file on disk
- Should suggest: "Please provide a path to a local file. Download the PDF first, then run /book-analyzer on the downloaded file."

<Escalation_And_Stop_Conditions>

Missing file: If the input path doesn't exist, stop immediately and report
Missing pandoc/pdftotext: If required tool isn't installed, stop and provide install instructions
Extraction failure: If extraction script fails, report the error — don't proceed with empty content
PDF fallback too large: If PDF is >500 pages and needs fallback (Read tool), warn user it will take a while and ask to proceed
No chapters detected: If structure detection fails completely, proceed with full text as single chapter (degraded but functional)
Agent failure: If a chapter-analyst agent fails, report which chapter failed and continue with others — partial results are still valuable
Too short: If extracted text is under ~5000 words, suggest /paper or /process instead

</Escalation_And_Stop_Conditions>
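Two of the stop conditions above, missing tools and too-short input, can be checked with a few lines of Python. This is a sketch: the helper names are hypothetical, and counting whitespace-separated tokens is an assumed approximation of "words".

```python
import shutil

def missing_tools(names=("pandoc", "pdftotext")) -> list:
    """Return the required command-line tools that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]

def is_too_short(text: str, min_words: int = 5000) -> bool:
    """True when the extracted text falls under the ~5000-word threshold."""
    return len(text.split()) < min_words
```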

<Final_Checklist>

$ARGUMENTS
