From scholar-mcp
Use when the user asks you to search, retrieve, or reason about scholarly sources (papers, patents, books, or standards). Guides tool selection, chaining, and cross-domain workflows.
npx claudepluginhub pvliesdonk/scholar-mcp

This skill uses the workspace's default tool permissions.
When the user asks about academic literature, prior art, citations, patents, books, or standards, use the scholar-mcp tools as described below. The server covers four source domains: papers, patents, books, and standards. Each domain has its own tools, but they cross-reference naturally — a paper may cite a patent, a patent's NPL section may reference a paper, and a paper with an ISBN is automatically enriched with book metadata.
| User intent | Tool | Key parameters |
|---|---|---|
| Find papers on a topic | search_papers | query, year_start/year_end, fields_of_study, venue, min_citations, sort |
| Look up a specific paper | get_paper | identifier — DOI, S2 ID, ARXIV:id, ACM:id, PMID:id |
| Find an author's work | get_author | Numeric S2 author ID for profile, or name string for disambiguation |
| Find patents | search_patents | query, cpc_classification, applicant, inventor, jurisdiction, date filters |
| Look up a specific patent | get_patent | patent_number, sections (biblio, claims, description, family, legal, citations) |
| Find books | search_books | Prefer title/author over query — dedicated indexes give better recall |
| Look up a book by ISBN | get_book | identifier — ISBN-10, ISBN-13, OL work ID, or edition ID |
| Find a standard | search_standards | query, optional body filter (NIST, IETF, W3C, ETSI) |
| Normalise a messy citation | resolve_standard_identifier | raw — e.g. "rfc9000", "nist 800-53", "wcag 2.2" |
| Resolve a mixed list of IDs | batch_resolve | identifiers — mix of DOIs, S2 IDs, patent numbers, ISBN: prefixed ISBNs |
Domain-sticky traversal: a paper's references mostly point to other papers; a patent's family members are other patents; a standard often cites other standards. Follow the domain naturally — don't switch tools mid-chain unless the data leads you there (e.g. a patent's NPL citations resolve to papers).
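The lookup rows in the table above can be sketched as a small routing helper. This is an illustration only: the function name `pick_lookup_tool` and every regular expression are assumptions (rough guesses, not the server's own detection rules), and only the tool names and parameter names come from the table. For mixed lists, batch_resolve already does this routing on the server side.

```python
import re

# Assumed, approximate patterns; the server may recognise more formats.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
PAPER_PREFIX_RE = re.compile(r"^(ARXIV|ACM|PMID):", re.IGNORECASE)
S2_ID_RE = re.compile(r"^[0-9a-f]{40}$")  # assumes S2 IDs are 40-char hex strings
ISBN_RE = re.compile(r"^(?:ISBN:)?(?:\d{9}[\dXx]|\d{13})$")
PATENT_RE = re.compile(r"^[A-Z]{2}\d{5,}[A-Z]?\d?$")


def pick_lookup_tool(raw: str) -> tuple[str, dict]:
    """Map one identifier string to a (tool name, arguments) pair."""
    text = raw.strip()
    # Hyphens and spaces are dropped only for the ISBN and patent checks.
    compact = text.replace("-", "").replace(" ", "")
    if DOI_RE.match(text) or PAPER_PREFIX_RE.match(text) or S2_ID_RE.match(text):
        return "get_paper", {"identifier": text}
    if ISBN_RE.match(compact):
        return "get_book", {"identifier": compact.removeprefix("ISBN:")}
    if PATENT_RE.match(compact.upper()):
        return "get_patent", {"patent_number": compact.upper()}
    # Anything else is treated as an informal standard name to normalise.
    return "resolve_standard_identifier", {"raw": text}
```

The fall-through to resolve_standard_identifier is a design choice for this sketch: free-text input such as "rfc 9000" is cheaper to normalise than to search for.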
- Start with search_papers and a broad query. Use sort="citations" and min_citations to surface influential work quickly.
- On a key result, use get_citations to see who built on it.
- Use get_references on the same paper to see what it builds on.
- Use fields_of_study to split searches (e.g. search once for "Computer Science", once for "Medicine").

- When the user supplies an identifier, go straight to get_paper. Don't search first.
- get_author with a name returns up to 5 disambiguation candidates. Ask the user to pick before fetching publications; don't guess.

- Use search_patents with keywords, CPC codes, or applicant names.
- Call get_patent with sections=["biblio", "citations"]: the citations section includes NPL references automatically resolved to S2 papers.
- get_citing_patents finds patents that cite a given paper (coverage is incomplete because EPO OPS does not capture all citations).

- resolve_standard_identifier handles messy input ("rfc 9000", "NIST SP 800-53 rev5") and normalises it to canonical form. Use it first when the user gives an informal name.
- get_standard with fetch_full_text=true downloads and converts the standard via docling (requires SCHOLAR_MCP_DOCLING_URL). This may block until conversion completes; it only returns a task ID if rate-limited.

- get_citations: papers that cite a given paper (forward). Supports year, field, and minimum citation count filters.
- get_references: papers cited by a given paper (backward).
- Use offset for additional pages.

- get_citation_graph does breadth-first expansion from 1–10 seed papers.
- depth 1 is usually sufficient for a neighbourhood view. Depth 2–3 expands rapidly, so always set max_nodes to cap growth.
- direction="both" finds the densest cluster but doubles API calls per hop. Use "citations" or "references" when the user has a clear direction.
- Use year_start/year_end and min_citations to keep results focused. When filters are active, the tool fetches more candidates per node to compensate for filtering losses.

- find_bridge_papers finds the shortest citation path between two papers. Useful for connecting seemingly unrelated work. Searches up to max_depth hops (default 4). Use direction="both" for best coverage.
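A back-of-the-envelope estimate shows why deeper breadth-first expansion needs a cap. The sketch below is hypothetical: `branching` is an assumed average number of neighbours followed per paper, and the argument key `seed_ids` is a guess, since only depth, direction, and max_nodes are named above.

```python
def estimate_graph_nodes(seeds: int, branching: int, depth: int) -> int:
    """Rough node count if every paper contributes `branching` new neighbours."""
    return seeds * sum(branching ** level for level in range(depth + 1))


def citation_graph_args(seed_ids, depth=1, direction="citations", max_nodes=None):
    """Build get_citation_graph arguments, refusing uncapped deep expansions."""
    if not 1 <= len(seed_ids) <= 10:
        raise ValueError("get_citation_graph takes 1-10 seed papers")
    if direction not in {"citations", "references", "both"}:
        raise ValueError(f"unknown direction: {direction!r}")
    if depth > 1 and max_nodes is None:
        raise ValueError("set max_nodes when depth > 1")
    # "seed_ids" is a hypothetical key; check the tool schema for the real name.
    args = {"seed_ids": list(seed_ids), "depth": depth, "direction": direction}
    if max_nodes is not None:
        args["max_nodes"] = max_nodes
    return args
```

With one seed and an assumed 20 neighbours per paper, the estimate is 21 nodes at depth 1, 421 at depth 2, and 8421 at depth 3, which is the growth that max_nodes is meant to contain.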
recommend_papers takes 1–5 positive paper IDs and optional negative IDs.
The positive examples define the topic; negatives steer away from unwanted
areas. For best results, pick positive examples that span the desired topic
rather than clustering around one sub-area.
recommend_books takes a subject string (e.g. "machine learning", "algorithms")
and returns popular books from Open Library sorted by edition count.
generate_citations produces BibTeX, CSL-JSON, or RIS for up to 100 papers.
enrich=true (the default) adds OpenAlex venue metadata for more complete
BibTeX entries (journal names, volumes, pages).

- A paper with an ISBN in its externalIds gets a book_metadata field (publisher, edition, cover URL, subjects) from Open Library. No extra tool call needed.
- enrich_paper adds OpenAlex fields (affiliations, funders, OA status, concepts) on demand. Useful when the user needs institutional or funding context beyond what S2 provides.

PDF tools are write-tagged and hidden in read-only mode (the default). The
user must set SCHOLAR_MCP_READ_ONLY=false and have a running docling-serve
instance (SCHOLAR_MCP_DOCLING_URL) for Markdown conversion.
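The two prerequisites above can be checked up front before suggesting a PDF workflow. This is a minimal sketch, assuming the helper can see the server's environment and that the server treats only the literal value "false" (case-insensitive) as disabling read-only mode; the function name `pdf_tool_problems` is hypothetical, and the server's real parsing may accept other spellings.

```python
import os


def pdf_tool_problems(env=None) -> list[str]:
    """Return reasons the PDF conversion tools would be unavailable (empty if none)."""
    env = os.environ if env is None else env
    problems = []
    # Read-only mode is the default, so an unset variable counts as read-only.
    if env.get("SCHOLAR_MCP_READ_ONLY", "true").strip().lower() != "false":
        problems.append("SCHOLAR_MCP_READ_ONLY is not set to false")
    if not env.get("SCHOLAR_MCP_DOCLING_URL", "").strip():
        problems.append("SCHOLAR_MCP_DOCLING_URL is not set")
    return problems
```

Returning a list of reasons rather than a boolean makes it easy to tell the user exactly which setting to change.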
- fetch_paper_pdf downloads the PDF with automatic fallback: S2 open-access → arXiv → PubMed Central → Unpaywall.
- fetch_and_convert does fetch + convert in one call, which is usually what users want.
- fetch_pdf_by_url handles arbitrary PDF URLs (e.g. an author's website).
- convert_pdf_to_markdown converts a local PDF file to Markdown. Use this when you already have the PDF on disk (e.g. a paywalled paper obtained manually). Accepts an absolute file_path.
- fetch_patent_pdf downloads via authenticated EPO OPS. Not all patents have PDFs available (older patents, some WO publications).
- get_standard with fetch_full_text=true fetches and converts in one call.

Pass use_vlm=true to fetch_and_convert, fetch_pdf_by_url,
fetch_patent_pdf, or convert_pdf_to_markdown for better formula and figure
extraction. (get_standard uses docling internally but does not expose the VLM
flag.) Requires SCHOLAR_MCP_VLM_API_URL and SCHOLAR_MCP_VLM_API_KEY.
VLM and standard conversions are cached separately, so switching modes never
overwrites previous results.
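The VLM rules above amount to a small guard around the tool arguments. The helper below is a hypothetical sketch: `with_vlm` is not part of the server, and it assumes the two VLM settings are visible to the caller through an `env` mapping, which may not hold when the server runs elsewhere.

```python
# Tools that accept use_vlm, as listed above; get_standard is deliberately absent.
VLM_TOOLS = {
    "fetch_and_convert",
    "fetch_pdf_by_url",
    "fetch_patent_pdf",
    "convert_pdf_to_markdown",
}
VLM_SETTINGS = ("SCHOLAR_MCP_VLM_API_URL", "SCHOLAR_MCP_VLM_API_KEY")


def with_vlm(tool: str, args: dict, env: dict) -> dict:
    """Return a copy of args with use_vlm enabled, or raise if that cannot work."""
    if tool not in VLM_TOOLS:
        raise ValueError(f"{tool} does not expose use_vlm")
    missing = [name for name in VLM_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("missing VLM settings: " + ", ".join(missing))
    # The original dict is left untouched so the non-VLM call can still be made.
    return {**args, "use_vlm": True}
```

Because VLM and standard conversions are cached separately, retrying the same document with the returned arguments does not discard an earlier non-VLM result.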
PDF downloads (fetch_paper_pdf), patent PDF fetches (fetch_patent_pdf),
full pipeline runs (fetch_and_convert), URL-based fetches (fetch_pdf_by_url),
local file conversions (convert_pdf_to_markdown), and any tool that hits a
rate limit return a task ID when work is submitted to the background queue.
Cache hits return immediately without queuing. Example queued response:
{"queued": true, "task_id": "abc123", "tool": "..."}
Poll with get_task_result(task_id="abc123"). The response includes
status, elapsed_seconds, and a hint while running.
list_tasks shows all active tasks. Don't poll in a tight loop — tasks
typically complete in 10–30 seconds (PDF download) or 1–5 minutes (PDF
conversion with VLM).
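One way to avoid a tight polling loop is to back off between calls to get_task_result. This is a sketch under stated assumptions: `call_tool` is a hypothetical callable standing in for whatever client invokes the MCP tool, and the in-progress status strings are guesses, since the text above says the response has a status field but does not list its values.

```python
import time


def wait_for_task(call_tool, task_id, first_delay=5.0, max_delay=30.0,
                  timeout=600.0, sleep=time.sleep):
    """Poll get_task_result with growing delays until the task stops running."""
    delay, waited = first_delay, 0.0
    while True:
        result = call_tool("get_task_result", {"task_id": task_id})
        # Assumption: unfinished tasks report one of these status strings.
        if result.get("status") not in {"queued", "pending", "running"}:
            return result
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still running after {waited:.0f}s")
        sleep(delay)
        waited += delay
        # Double the wait each round, capped so long conversions are still checked.
        delay = min(delay * 2, max_delay)
```

The default delays are chosen to fit the timings quoted above: a PDF download is usually caught within the first two or three polls, while a VLM conversion settles into one check every 30 seconds.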
- Don't use search_papers when the user gives a specific identifier; use get_paper directly.
- Don't call get_author with a name and then silently pick the first candidate; ask the user to disambiguate.
- Don't set depth > 2 in get_citation_graph without also setting a tight max_nodes cap, because the graph expands exponentially.
- Don't use fetch_pdf_by_url for EPO OPS patent URLs, which require authentication. Use fetch_patent_pdf instead.
- Use batch_resolve when the user gives a mixed list of identifiers; it routes each to the correct backend automatically.