This skill should be used when the user asks to "search for papers", "find citations", "look up a DOI", "get BibTeX", "download PDF", "convert PDF to markdown", "find canonical papers", "convert identifiers", "batch download papers", "configure opencite", or mentions opencite, academic literature search, citation management, or paper retrieval.
npx claudepluginhub neuromechanist/opencite --plugin opencite
Slash command: /opencite:opencite
OpenCite is a CLI tool and Python library for academic literature search and citation management. It aggregates results from Semantic Scholar, OpenAlex, PubMed, arXiv, and bioRxiv/medRxiv, deduplicates them, and outputs formatted results. It also supports PDF retrieval, PDF-to-markdown conversion (included by default), and batch operations.
Use this skill for "search for papers", "find citations", "look up a DOI", "get BibTeX", "download PDF", "convert PDF to markdown", "find canonical papers", "convert identifiers", "batch download papers", "configure opencite", "literature review", "find related papers", "what papers cite this", "export references", "read this paper", or mentions of opencite, Semantic Scholar, OpenAlex, PubMed, academic literature search, citation management, or paper retrieval.
# Option 1: uv (recommended)
uv pip install opencite
# Option 2: pip
pip install opencite
# Option 3: uvx (no install needed, runs from cache)
uvx opencite --version
PDF conversion support (markitdown and markit-mistral) is included by default. If MISTRAL_API_KEY is set, markit-mistral is used for better handling of math, tables, and complex layouts. Otherwise, markitdown (free, local) is used as the fallback.
For development:
uv sync --extra dev # install from source with dev tools
opencite supports TOML config, .env files, and environment variables.
uvx opencite config init # creates ~/.opencite/config.toml template
uvx opencite config show # display resolved config (keys masked)
uvx opencite config path # show config file location
Config loading priority (later overrides earlier):
1. ~/.opencite/config.toml
2. ~/.opencite/.env
3. .env in working directory

API keys and tokens:
SEMANTIC_SCHOLAR_API_KEY - Semantic Scholar API
PUBMED_API_KEY - NCBI/PubMed API
OPENALEX_API_KEY - OpenAlex API (required since Feb 2026)
MISTRAL_API_KEY - (optional) Mistral AI for enhanced PDF-to-markdown conversion
ELSEVIER_API_KEY - Elsevier/ScienceDirect
WILEY_TDM_TOKEN - Wiley TDM
SPRINGER_API_KEY - Springer Nature

When the user asks for literature research, paper retrieval, or reading material on a topic, follow this end-to-end workflow.
Choose the search strategy based on user needs:
uvx opencite canonical "topic" --max 10 # foundational, most-cited papers on a topic
uvx opencite search "query" --max 20 --sort citations # broad search, sorted by citation count
uvx opencite cite "DOI" --direction both # papers citing a known paper, plus its references

Review results considering citation count, relevance, recency, and open access availability. Present a summary to the user and confirm which papers to retrieve.
For multiple papers (preferred): Save search results as JSON and use batch-fetch:
uvx opencite search "topic" --max 10 -f json -o results.json
uvx opencite batch-fetch --from-json results.json --convert -o ./papers --summary report.json
batch-fetch --convert automatically creates this directory structure:
papers/
├── pdf/            # downloaded PDFs
└── markdown/       # converted markdown files
    └── img/        # per-paper image directories (mistral only)
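When the DOIs come from somewhere other than an opencite search, the JSON input can also be a plain array of DOIs. The sketch below writes such a file and assembles the batch-fetch command as an argument list. The helper names `write_doi_json` and `batch_fetch_command` are hypothetical, only flags documented here are used, and the command is built rather than executed.

```python
import json
import tempfile
from pathlib import Path


def write_doi_json(dois, path):
    """Write unique DOIs, in first-seen order, as a JSON array."""
    unique = list(dict.fromkeys(d.strip() for d in dois if d.strip()))
    Path(path).write_text(json.dumps(unique, indent=2), encoding="utf-8")
    return unique


def batch_fetch_command(json_path, out_dir="./papers", summary=None):
    """Build the batch-fetch invocation as a list suitable for subprocess.run."""
    cmd = ["uvx", "opencite", "batch-fetch", "--from-json", str(json_path),
           "--convert", "-o", str(out_dir)]
    if summary:
        cmd += ["--summary", str(summary)]
    return cmd


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        json_path = Path(tmp) / "results.json"
        # "10.1234/example" is the placeholder DOI used elsewhere in this document
        write_doi_json(["10.1234/example", "10.1234/example"], json_path)
        print(" ".join(batch_fetch_command(json_path, summary="report.json")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would run the download. Building the list first keeps quoting out of the picture and makes the invocation easy to log or inspect.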
For individual papers:
uvx opencite pdf "10.1234/example" -o papers/pdf/ --convert
Note: pdf --convert places the markdown file next to the PDF and does not extract images. For the organized subdirectory layout with image extraction, use batch-fetch.
Read the converted markdown files for deeper analysis:
After batch-fetch, markdown is in papers/markdown/ and extracted images (markit-mistral only) are in per-paper subdirectories under papers/markdown/img/.

To save citations, append BibTeX to a .bib file:
uvx opencite lookup "DOI" -f bibtex --append-bib refs.bib

uvx opencite search "query string" [options]
Options:
--max N - Max results (default: 20)
--source all|openalex|s2|pubmed|arxiv|biorxiv - Which API to query (default: all)
--year-from YYYY - Published after year
--year-to YYYY - Published before year
--oa-only - Open access only
--sort relevance|citations|year - Sort order (default: relevance)
-f, --format text|json|bibtex|csv - Output format
-o, --output FILE - Write to file
-v, --verbose - Show abstracts

uvx opencite lookup IDENTIFIER [IDENTIFIER ...] [options]
Accepts DOI, pmid:X, pmc:X, arxiv:X, S2 ID, or OpenAlex ID. Auto-detects the type. Supports multiple IDs.
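As an illustration of what auto-detection over these identifier forms can look like, here is a minimal sketch. It is not opencite's actual logic: the function name `guess_id_type` is hypothetical, and the patterns are assumptions, in particular the `W` plus digits form assumed for OpenAlex work IDs and the 40-character hex form assumed for S2 paper IDs.

```python
import re

# Illustrative patterns (assumptions), checked in order; the first match wins.
_PATTERNS = [
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("pmid", re.compile(r"^pmid:\d+$", re.IGNORECASE)),
    ("pmc", re.compile(r"^pmc:(PMC)?\d+$", re.IGNORECASE)),
    ("arxiv", re.compile(r"^arxiv:\S+$", re.IGNORECASE)),
    ("openalex", re.compile(r"^W\d+$")),
    ("s2", re.compile(r"^[0-9a-f]{40}$")),
]


def guess_id_type(identifier):
    """Return a label for the identifier form, or 'unknown' if nothing matches."""
    text = identifier.strip()
    for label, pattern in _PATTERNS:
        if pattern.match(text):
            return label
    return "unknown"


if __name__ == "__main__":
    for example in ["10.1038/s41586-024-07487-w", "pmid:12345", "arxiv:1706.03762"]:
        print(example, "->", guess_id_type(example))
```

Checking the prefixed forms before the bare ones keeps the rules unambiguous, since a prefixed identifier can never be mistaken for a bare ID.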
Options:
-f, --format text|json|bibtex
-o, --output FILE
--enrich - Fetch from all APIs for richer data
--append-bib FILE - Append BibTeX to a .bib file
-v, --verbose

uvx opencite cite IDENTIFIER [options]
Options:
--direction citing|references|both - Direction (default: citing)
--max N - Max papers (default: 50)
--sort citations|year - Sort order (default: citations)
--min-citations N - Minimum citation count filter
-f, --format text|json|bibtex
-o, --output FILE
-v, --verbose

uvx opencite canonical "topic" [options]
Finds the most-cited, foundational papers for a topic.
Options:
--max N - Number of papers (default: 10)
--year-from YYYY - Published after year
--min-citations N - Minimum citations (default: 100)
-f, --format text|json|bibtex
-o, --output FILE
-v, --verbose

uvx opencite pdf IDENTIFIER [options]
Tries multiple sources in priority order: publisher APIs (if tokens configured), OpenAlex/S2 PDF locations, PMC Open Access, direct arXiv/bioRxiv URL, DOI content negotiation.
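Trying sources in priority order is a standard fallback chain. The sketch below shows the pattern only: the `fetch_first` helper and the stand-in fetcher functions are hypothetical and do not reflect opencite's internals.

```python
def fetch_first(identifier, fetchers):
    """Try each (name, fetcher) in order and return the first non-None result.

    A fetcher returns PDF bytes, returns None when it has nothing, or raises.
    Returns (source_name, data, errors); source_name is None if all fail.
    """
    errors = []
    for name, fetcher in fetchers:
        try:
            data = fetcher(identifier)
        except Exception as exc:  # a failing source should not stop the chain
            errors.append((name, str(exc)))
            continue
        if data is not None:
            return name, data, errors
    return None, None, errors


# Stand-in fetchers, for illustration only.
def publisher_api(identifier):
    raise PermissionError("no token configured")


def open_access_location(identifier):
    return None  # no open-access copy known


def arxiv_direct(identifier):
    return b"%PDF-1.7 ..."


if __name__ == "__main__":
    chain = [("publisher", publisher_api),
             ("open-access", open_access_location),
             ("arxiv", arxiv_direct)]
    source, data, errors = fetch_first("arxiv:1706.03762", chain)
    print(source, len(data), errors)
```

Collecting the errors instead of raising on the first failure means a missing publisher token only costs one skipped source, and the caller can still report why earlier sources were passed over.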
Options:
-o, --output PATH - Output file path (.pdf) or directory (default: .)
--filename NAME - Custom filename
--convert - Also convert downloaded PDF to markdown
--converter auto|markitdown|mistral - Converter for markdown (default: auto)

uvx opencite convert FILE.pdf [options]
Auto mode uses markit-mistral when MISTRAL_API_KEY is set (better for math and complex layouts), otherwise falls back to markitdown (free, local). Both converters are included by default.
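The auto rule can be sketched as a small function. This is an illustration of the documented behavior rather than opencite's code: `pick_converter` is a hypothetical name, and treating an empty MISTRAL_API_KEY as unset is an assumption.

```python
import os


def pick_converter(mode="auto", env=None):
    """Resolve 'auto' to a concrete converter name.

    Follows the documented rule: mistral when MISTRAL_API_KEY is set,
    otherwise markitdown. An explicit choice is returned unchanged.
    """
    env = os.environ if env is None else env
    if mode not in ("auto", "markitdown", "mistral"):
        raise ValueError(f"unknown converter: {mode!r}")
    if mode != "auto":
        return mode
    # Assumption: an empty value counts as "not set".
    return "mistral" if env.get("MISTRAL_API_KEY") else "markitdown"


if __name__ == "__main__":
    print(pick_converter("auto", env={}))
    print(pick_converter("auto", env={"MISTRAL_API_KEY": "example-key"}))
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.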
Options:
-o, --output FILE - Output markdown path
--converter auto|markitdown|mistral - Conversion method (default: auto)
--extract-images - Extract images from PDF (mistral only)
--images-dir DIR - Directory for extracted images

uvx opencite ids IDENTIFIER [IDENTIFIER ...] [options]
Converts between DOI, PMID, and PMCID using the NCBI ID Converter API.
Options:
-f, --format text|json

uvx opencite batch-fetch FILE [options]
uvx opencite batch-fetch --from-json FILE [options]
uvx opencite batch-fetch --from-stdin [options]
Downloads PDFs for multiple papers with controlled concurrency. When --convert is used, output is organized into subdirectories:
output-dir/
├── pdf/            # downloaded PDFs
└── markdown/       # converted markdown files
    └── img/        # per-paper image directories (mistral only)
Input sources (mutually exclusive):
FILE - Text file with IDs, one per line
--from-json FILE - JSON file (array of DOIs or opencite search results)
--from-stdin - Read IDs from stdin (pipe-friendly)
Options:
-o, --output-dir DIR - Output directory (default: ./papers)
--convert - Also convert each PDF to markdown
--converter auto|markitdown|mistral - Converter (default: auto)
--concurrency N - Max concurrent downloads (default: 3)
--summary FILE - Write JSON summary report to file

uvx opencite config init # create ~/.opencite/config.toml template
uvx opencite config show # display resolved config (keys masked)
uvx opencite config path # show config file location
# Search broadly
uvx opencite search "motor cortex oscillations" --max 20 -f json -o results.json
# Export BibTeX for citation manager
uvx opencite search "motor cortex oscillations" --max 20 -f bibtex -o refs.bib
# Look up the paper
uvx opencite lookup "10.1038/s41586-024-07487-w" -v
# Get papers that cite it
uvx opencite cite "10.1038/s41586-024-07487-w" --direction citing --max 20
# Get its references
uvx opencite cite "10.1038/s41586-024-07487-w" --direction references --max 20
# 1. Find canonical papers in the field
uvx opencite canonical "attention mechanism" --max 5 -f json -o results.json
# 2. Batch download and convert all found papers
uvx opencite batch-fetch --from-json results.json --convert -o ./papers --summary report.json
# 3. Papers are now organized in:
# papers/pdf/ - PDF files
# papers/markdown/ - Markdown files ready for reading
# papers/markdown/img/<paper>/ - Extracted figures (mistral only)
# Single ID
uvx opencite ids "10.1001/jama.2024.12345"
# Multiple IDs with JSON output
uvx opencite ids "10.1001/jama.2024.12345" "PMC7654321" -f json
For programmatic use:
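import asyncio

from opencite import Config, Paper, SearchResult
from opencite.search import SearchOrchestrator

# "async with" and "await" only work inside a coroutine, so wrap the calls
async def main():
    config = Config.from_env()
    async with SearchOrchestrator(config) as searcher:
        return await searcher.search("query", max_results=10)

results = asyncio.run(main())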