# /paperloom:ingest <url | arxiv-id | doi | pdf-path>

Fetches a paper via URL, arXiv ID, DOI, or PDF path and generates a 4-section triage summary (Key Takeaways, Background, Main Idea & Summary, Critique). Adds the paper to the vault if new.

Install: `npx claudepluginhub trapoom555/claude-paperloom`
Fast, triage-grade ingest. `$ARGUMENTS` is the paper reference.
**Division of labor**: the deterministic steps (fetch, parse, template fill, edge aggregation, logging, stub creation, citation matching) are done by Python scripts in `${CLAUDE_PLUGIN_ROOT}/scripts/`. The LLM is used only for the three remaining semantic subagents: `lite-drafter`, `finding-extractor`, `metadata-extractor`. Per-item LLM loops are forbidden: if you find yourself running an agent N times for N items, stop and shell out to a script.
## Step 0 – greet the user
Print exactly:
> 📄 Ingesting your paper – this will take a moment. Sit back, get cozy, and maybe grab a coffee ☕️
## Step 1 – fetch the paper

Shell out. The script validates the vault, classifies the input, caches the raw file, and produces full + brief text:

```bash
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/fetch_paper.py" "<vault-path>" "$ARGUMENTS"
```
Parse the JSON result.
**Early exit – duplicate paper.** If the result has `"already_exists": true`, the paper is already in the vault (matched by arxiv-id, doi, or source-url). Do not run any further steps. Print a short message naming the existing slug, e.g.:

> ⚠️ This paper is already in your vault as `papers/<existing.slug>.md` – skipping ingest.

Then stop.
Otherwise, keep `full_text_path`, `brief_text_path`, `findings_text_path`, `meta_text_path`, `source_url`, `arxiv_id`, `doi` for later steps.
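The branch above can be sketched as a small helper. This is illustrative only: the helper name `handle_fetch_result` and the nested `existing.slug` key are assumptions inferred from the message template, not a documented contract of `fetch_paper.py`.

```python
import json

def handle_fetch_result(raw: str):
    """Branch on fetch_paper.py's JSON output (illustrative sketch;
    the 'existing' -> 'slug' nesting is an assumption)."""
    result = json.loads(raw)
    if result.get("already_exists"):
        # Duplicate paper: stop the pipeline and surface the existing slug.
        return ("skip", result["existing"]["slug"])
    # New paper: keep only the keys that later steps consume.
    keep = ("full_text_path", "brief_text_path", "findings_text_path",
            "meta_text_path", "source_url", "arxiv_id", "doi")
    return ("continue", {k: result.get(k) for k in keep})
```

The point of the tuple return is that the caller can stop unconditionally on `"skip"` without inspecting anything else.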
## Step 2 – scan the vault

Run these in parallel (they're independent reads):

```bash
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/vault_scan.py" fields "<vault-path>"
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/vault_scan.py" papers "<vault-path>"
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/vault_scan.py" authors "<vault-path>"
```

Hold the outputs: `existing_fields`, `vault_papers`, `existing_authors`.
## Step 3 – launch the semantic subagents

Launch in one parallel message:
| Agent | Model | Input | Purpose |
|---|---|---|---|
| `lite-drafter` | model_reasoning | brief_text_path | returns the 4 sections JSON |
| `finding-extractor` | model_normal | findings_text_path | returns atomic findings JSON; fed the abstract + intro + method + results + conclusion slice, not the full paper – saves tokens while keeping theoretical / empirical / definitional claims reachable |

Do not spawn a citation-linker agent – bibliographic matching is deterministic and runs in step 6 via `citation_match.py`.
## Step 4 – metadata extraction

Once `lite-drafter` returns, spawn `metadata-extractor` (model_normal) with:

- `paper_text_path` = meta_text_path (first 2 pages – enough for title/authors/date/venue/quality)
- `summary_text` = the concatenated markdown returned by lite-drafter (for fields)
- `existing_fields` = list from step 2
- `source_url`, `arxiv_id`, `doi` = from step 1

The agent returns metadata JSON. It does NOT compute `quality.overall` or the slug – the assembly script does both.
## Step 5 – assemble the paper page

Before writing any `/tmp/*.json` payload in this step or step 6/7, first clear stale files from prior runs in a single Bash call:

```bash
rm -f /tmp/paper_payload.json /tmp/findings_payload.json /tmp/stubs_payload.json /tmp/edges_payload.json
```
Without this, the Write tool refuses to overwrite a /tmp/*.json file it has not Read in the current conversation, and the ingest stalls.
Write the payload to `/tmp/paper_payload.json` with this exact shape (note `metadata` is a nested key – flat layouts will fail with `KeyError: 'metadata'`):
```json
{
  "vault_path": "<vault-path>",
  "source_url": "<source_url from step 1>",
  "metadata": {
    "title": "...",
    "authors": ["Surname, Given", "..."],
    "publication-date": "YYYY-MM-DD",
    "venue": "...",
    "fields": ["nlp", "..."],
    "arxiv-id": "...",
    "doi": null,
    "quality": {
      "credibility": 5,
      "experimental-rigor": 5,
      "reproducibility": "code-released",
      "rationale": "..."
    }
  },
  "sections": {
    "key_takeaways": "...",
    "background": "...",
    "main_idea_and_summary": "...",
    "critique": "..."
  },
  "findings": [],
  "relations": {}
}
```
Then pipe it in:

```bash
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/assemble_paper.py" --input /tmp/paper_payload.json
```

The script computes `quality.overall`, generates the slug if absent, fills `templates/paper-lite.md`, and writes `<vault>/papers/<slug>.md`. It refuses to overwrite unless `overwrite: true` is set in the payload – ask the user first.
Capture the returned slug.
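A minimal sketch of assembling the payload above, guarding the two failure modes this step names (flat layout instead of a nested `metadata` key, and computing `quality.overall` ourselves). The helper name `build_paper_payload` and the exact validation are illustrative, not part of `assemble_paper.py`:

```python
def build_paper_payload(vault_path, source_url, metadata, sections,
                        findings=None, relations=None):
    """Assemble the step-5 payload with metadata as a nested key
    (a flat layout makes assemble_paper.py fail with KeyError: 'metadata')."""
    required_meta = {"title", "authors", "publication-date", "venue",
                     "fields", "quality"}
    missing = required_meta - metadata.keys()
    if missing:
        raise ValueError(f"metadata missing keys: {sorted(missing)}")
    # quality.overall and the slug are computed by the script, never here.
    assert "overall" not in metadata["quality"]
    return {
        "vault_path": vault_path,
        "source_url": source_url,
        "metadata": metadata,   # nested, exactly as the shape above requires
        "sections": sections,
        "findings": findings or [],
        "relations": relations or {},
    }
```

Serializing this dict with `json.dump` to `/tmp/paper_payload.json` then yields the shape the script expects.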
## Step 6 – findings, citations, stubs

`/tmp/findings_payload.json` shape (note `source_paper`, not `paper_slug`):

```json
{
  "vault_path": "<vault-path>",
  "source_paper": "<slug from step 5>",
  "fields": ["nlp", "..."],
  "findings": [ { "statement": "...", "source-ref": "...", "finding-type": "empirical", "hedging": "asserted", "quote": "..." } ]
}
```
`/tmp/stubs_payload.json` shape:

```json
{ "vault_path": "<vault-path>", "authors": ["Surname, Given", "..."], "fields": ["nlp", "..."] }
```
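Since the wrong key name is the most likely mistake here, a guard like the following can be run before writing the findings payload. The helper `check_findings_payload` is a hypothetical sketch, not a plugin script:

```python
def check_findings_payload(payload: dict) -> dict:
    """Guard key names before writing /tmp/findings_payload.json:
    assemble_finding.py expects 'source_paper', not 'paper_slug'."""
    if "paper_slug" in payload:
        raise KeyError("use 'source_paper', not 'paper_slug'")
    for key in ("vault_path", "source_paper", "fields", "findings"):
        if key not in payload:
            raise KeyError(f"missing required key: {key}")
    return payload
```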
Launch all four at once – they're independent. 6c uses `--exclude-paper <slug>` to keep the just-written findings (from 6a) out of the candidate set, so ordering between 6a and 6c doesn't matter.
```bash
# 6a. Write finding files in one script call.
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/assemble_finding.py" --input /tmp/findings_payload.json

# 6b. Deterministic citation matching. Feed vault_papers from step 2.
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/citation_match.py" \
  "<full_text_path>" <(echo "$VAULT_PAPERS_JSON") \
  --own-slug "<slug>"

# 6c. Candidate finding shortlist for finding-linker.
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/vault_scan.py" findings-candidates "<vault-path>" \
  --fields <metadata.fields joined by ,> \
  --authors "<metadata.authors joined by ;>" \
  --exclude-paper "<slug>" \
  --cap 30

# 6d. Missing author/field stubs.
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/create_stubs.py" --input /tmp/stubs_payload.json
```
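The four invocations above can be composed as argv lists before dispatch, which makes the `--exclude-paper` wiring and the `,` / `;` join conventions explicit. `step6_commands` is an illustrative helper; the `<full_text_path>` and vault-papers placeholders are left as-is since their values come from earlier steps:

```python
def step6_commands(plugin_root, vault_path, slug, fields, authors):
    """Compose the four independent step-6 invocations (sketch).
    All four can run in parallel: 6c excludes the new paper's own
    findings via --exclude-paper, so it need not wait for 6a."""
    py = f"{plugin_root}/.venv/bin/python3"
    scripts = f"{plugin_root}/scripts"
    return {
        "6a": [py, f"{scripts}/assemble_finding.py",
               "--input", "/tmp/findings_payload.json"],
        "6b": [py, f"{scripts}/citation_match.py",
               "<full_text_path>", "<vault_papers_json>", "--own-slug", slug],
        "6c": [py, f"{scripts}/vault_scan.py", "findings-candidates", vault_path,
               "--fields", ",".join(fields),    # comma-joined
               "--authors", ";".join(authors),  # semicolon-joined
               "--exclude-paper", slug, "--cap", "30"],
        "6d": [py, f"{scripts}/create_stubs.py",
               "--input", "/tmp/stubs_payload.json"],
    }
```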
Now update the paper page's `findings:` frontmatter with the new slugs: re-run `assemble_paper.py` with `overwrite: true` and `findings: [...]` populated. (Or, equivalently, edit just the frontmatter via a targeted Edit on `<vault>/papers/<slug>.md`.)
## Step 7 – link findings

Spawn `finding-linker` (model_normal) once, with:

- `new_findings` = the slugs + statements + fields written in 6a
- `candidate_existing_findings` = output of 6c

It returns typed-edge proposals. Write `/tmp/edges_payload.json` with this exact shape (note the keys are `new_paper` and `linker_output`, not `source_paper`/`edges`):

```json
{
  "vault_path": "<vault-path>",
  "new_paper": "<slug from step 5>",
  "linker_output": [ { "from": "<new-finding-slug>", "to": "<existing-finding-slug>", "type": "supports|contradicts|extends|uses|similar-to", "rationale": "..." } ],
  "cites": ["<paper-slug>", "..."]
}
```
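A sketch of composing this payload defensively, with the type vocabulary taken from the shape above. The helper `build_edges_payload` is hypothetical; the important behavior is that an absent linker result still produces `"linker_output": []` rather than omitting the key:

```python
def build_edges_payload(vault_path, new_paper, linker_output, cites):
    """Build /tmp/edges_payload.json. Even with zero proposed edges,
    emit linker_output: [] so apply_edges.py still runs and merges
    the paper-level cites from 6b (sketch)."""
    allowed = {"supports", "contradicts", "extends", "uses", "similar-to"}
    for edge in linker_output or []:
        if edge["type"] not in allowed:
            raise ValueError(f"unknown edge type: {edge['type']}")
    return {
        "vault_path": vault_path,
        "new_paper": new_paper,              # note: new_paper, not source_paper
        "linker_output": linker_output or [],
        "cites": cites,
    }
```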
If `finding-linker` returned zero edges (e.g. empty candidate set), still call the script with `"linker_output": []` so paper-level cites from 6b get merged in.
```bash
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/apply_edges.py" --input /tmp/edges_payload.json
```
The script:

- writes the typed edges into the new findings' `relations.*` keys (`uses` → `builds-on`, plus supports, extends, contradicts, similar-to),
- mirrors `contradicts` / `similar-to` onto target findings,
- merges the `cites` from 6b into the new paper's `relations.cites`.

## Step 8 – log the run

```bash
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/log.py" "<vault-path>" ingest-lite "<slug>" \
  "<n> findings, <e> edges"
```
## Step 9 – lint

Invoke lint scoped to the findings just written, so the dedup check only considers the new set against the existing vault (not all pairs):

```bash
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python3" "${CLAUDE_PLUGIN_ROOT}/scripts/lint.py" "<vault-path>" \
  --new-slugs "<finding-slug-1>,<finding-slug-2>,..."
```
`--link-similar` is on by default – any near-duplicate pairs found get bidirectional `similar-to` edges automatically. Display the report inline.
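Building the scoped lint invocation is just a comma-join over the new finding slugs; a sketch (the helper name `lint_command` is illustrative):

```python
def lint_command(plugin_root, vault_path, new_slugs):
    """Build the scoped lint call: comma-join the new finding slugs so
    dedup only compares them against the existing vault (sketch)."""
    return [
        f"{plugin_root}/.venv/bin/python3",
        f"{plugin_root}/scripts/lint.py",
        vault_path,
        "--new-slugs", ",".join(new_slugs),
    ]
```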
## Step 10 – wrap up

If `${CLAUDE_PLUGIN_CONFIG:open_in_obsidian}` is truthy, print (don't auto-run):

```bash
open "obsidian://open?vault=<vault-basename>&file=papers/<slug>"
```
End-of-run summary: report the new paper's slug, its `quality.overall`, and the finding/edge counts.

Gotchas:

- Always clear stale `/tmp/*.json` payloads at the start of step 5 (see the `rm -f` line). The Write tool will not overwrite a file it has not Read in the current conversation, so leftover files from a prior ingest run will block the pipeline.
- Never overwrite `papers/<slug>.md` without asking.
- If `finding-extractor` returns zero findings, warn the user – the paper may have been abstract-only.