From medsci-presentation
Optimizes medical AI papers for AI search engines and RAG tools. Provides pass/fail checklists for titles, abstracts, reporting compliance, and preprint/code release visibility.
npx claudepluginhub aperivue/medsci-skills --plugin medsci-literature
How this skill is triggered: by the user, by Claude, or both
Slash command
/medsci-presentation:academic-aio
inherit
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
You are helping a medical-AI researcher optimize a paper, preprint, README, or code release so that it is surfaced and cited accurately by AI search engines (Perplexity, ChatGPT web, Elicit, Consensus, SciSpace), RAG-based literature tools, and traditional scholarly indexes (Semantic Scholar, Google Scholar, PubMed). Your output is a visible pass/fail checklist with concrete edit suggestions, n...
references/case_studies/kjr_mllm_2025.md
references/checklists/AIO_GENERAL.md
references/journal_summarybox_templates.yaml
references/oac_funding_checklist.yaml
references/reporting_guideline_mapping.md
references/schema_markup_templates/CodeRepository.jsonld
references/schema_markup_templates/Dataset.jsonld
references/schema_markup_templates/Person.jsonld
references/schema_markup_templates/README.md
references/schema_markup_templates/ScholarlyArticle.jsonld
scripts/batch_metadata_audit.py
scripts/validate_schema.py
skill.yml
templates/aio_audit_checklist.md.j2
You are helping a medical-AI researcher optimize a paper, preprint, README, or code release so that it is surfaced and cited accurately by AI search engines (Perplexity, ChatGPT web, Elicit, Consensus, SciSpace), RAG-based literature tools, and traditional scholarly indexes (Semantic Scholar, Google Scholar, PubMed). Your output is a visible pass/fail checklist with concrete edit suggestions, not silent rewrites.
[VERIFY].
Run this skill when the user is working on any of:
CITATION.cff, Zenodo archive metadata, Hugging Face model card, or dataset card.
Pairs with (do not duplicate):
- write-paper: Phase 6 (draft) and Phase 7 (QC). AIO rules extend the title/abstract/discussion sections.
- check-reporting: reporting-guideline item audit (TRIPOD+AI, CLAIM, etc.). AIO requires guideline adherence but does not reproduce the audit.
- self-review: adversarial review. Run AIO after self-review so QC-confirmed claims anchor the checklist.
- humanize: AI-pattern removal. Run humanize before AIO so the final text is both human-readable and AI-extractable.
Generative engine optimization research (Aggarwal 2024, arXiv:2311.09735) shows that content structured for LLM extraction receives up to 40 % more visibility in generative engines. In medicine this effect is mediated by three gates:
LLM citation fabrication is the dominant failure mode to defend against. Agarwal et al. (Nat Commun 2025, doi:10.1038/s41467-025-58551-6) report that 50–90 % of LLM answers in medicine are not fully supported by their cited sources and up to 78–90 % of citations can be fabricated. The defensive strategy is to surface a paper's DOI and PMID in easy-to-copy form so that LLMs substitute the correct identifier instead of confabulating one.
Structure: [Task] + [Modality or anatomy] + [Model family or method class]. Include one concrete differentiator (dataset scale, new benchmark, "first …") when defensible. Avoid keyword stuffing (penalized as spam by AI overviews).
Examples:
Use the journal-required structure (Background / Methods / Findings / Interpretation for Lancet family; Background / Purpose / Materials and Methods / Results / Conclusion for RSNA family; etc.). If the journal allows unstructured, still use an internally structured form. Each section stands alone as a semantic chunk of ≤ 3 sentences so that chunk-boundary splits in RAG indexes do not break the claim.
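The three-sentence limit per abstract section can be checked mechanically. The following is a minimal sketch, assuming the abstract has already been split into a mapping of section name to text; the function names and the naive sentence splitter are illustrative assumptions, not part of the skill's scripts, and the splitter will miscount around abbreviations such as "e.g." followed by a capital letter.

```python
import re

# Naive boundary: sentence-ending punctuation, whitespace, then a capital letter.
# Decimals such as "0.94" are safe because the dot is not followed by whitespace.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def count_sentences(text: str) -> int:
    """Count sentences in one abstract section using the naive boundary above."""
    text = text.strip()
    return len(SENTENCE_BOUNDARY.split(text)) if text else 0


def oversized_sections(sections: dict[str, str], max_sentences: int = 3) -> dict[str, int]:
    """Return {section name: sentence count} for sections above the limit."""
    counts = {name: count_sentences(body) for name, body in sections.items()}
    return {name: n for name, n in counts.items() if n > max_sentences}
```

Any section the function returns is a candidate for trimming or for splitting its claim across two self-contained sections.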
Include one sentence that names the field's controlled vocabulary (for example, "diagnostic-accuracy study", "foundation-model evaluation", "LLM-as-judge", "agentic radiology workflow"). Entity linkers in AI indexes use this line.
Every abstract must contain at least one numeric primary outcome with confidence interval (for example, "AUC 0.94 [95 % CI 0.91–0.96]" or "sensitivity 88.2 % [95 % CI 85.1–91.0]"). LLM retrievers weight papers with concrete numbers.
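A quick way to audit this rule is a pattern match for a point estimate followed by a 95 % CI. The sketch below assumes the bracketed or parenthesised "95 % CI low–high" notation used in the examples above; the regular expression and function name are hypothetical and will miss other CI notations.

```python
import re

# Hypothetical pattern: a point estimate followed by a bracketed 95 % CI.
CI_PATTERN = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*%?\s*"      # point estimate, optional percent sign
    r"[\[(]\s*95\s*%\s*CI[\s,:]*"            # opening bracket and the "95 % CI" label
    r"(?P<low>\d+(?:\.\d+)?)\s*[–-]\s*"      # lower bound, en dash or hyphen
    r"(?P<high>\d+(?:\.\d+)?)\s*%?\s*[\])]"  # upper bound and closing bracket
)


def find_outcomes_with_ci(abstract: str) -> list[tuple[float, float, float]]:
    """Return (value, low, high) for every estimate reported with a 95 % CI."""
    return [
        (float(m["value"]), float(m["low"]), float(m["high"]))
        for m in CI_PATTERN.finditer(abstract)
    ]
```

An empty result means the abstract fails this item; a non-empty result still needs a human check that the matched number is the primary outcome.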
Place the guideline name in the abstract or the opening sentence of Methods: "Reported following TRIPOD+AI (Collins 2024) and CLAIM 2024 (Tejani 2024)". When applicable add STARD-AI 2025, DECIDE-AI, TRIPOD-LLM. This signals structure to LLMs and satisfies reviewer checklists.
AIO-rule ↔ guideline-item mapping: references/reporting_guideline_mapping.md.
Title, abstract, and keywords together should cover ≥ 3× the surface area of the concept — no redundancy. Include:
Royal Society 2024 (doi:10.1098/rspb.2024.1222) reports that 92 % of papers waste keyword real estate by repeating title terms in abstract and keywords; avoid this.
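Keyword redundancy against the title is easy to flag automatically. This is a minimal sketch, assuming a keyword counts as redundant when every one of its words already appears in the title; the function name and that definition of redundancy are assumptions for illustration, and the check ignores stemming and synonyms.

```python
import re


def redundant_keywords(title: str, keywords: list[str]) -> list[str]:
    """Return keywords whose words all already appear in the title."""
    title_words = set(re.findall(r"[a-z0-9]+", title.lower()))
    redundant = []
    for kw in keywords:
        kw_words = re.findall(r"[a-z0-9]+", kw.lower())
        # A keyword adds no new surface area if the title already covers all its words.
        if kw_words and all(w in title_words for w in kw_words):
            redundant.append(kw)
    return redundant
```

Each flagged keyword is a slot that could instead carry a synonym, a controlled-vocabulary term, or a reporting-guideline name.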
Include the journal-specific summary box verbatim when supported:
These boxes are the fragments Perplexity and ChatGPT web most often copy or paraphrase verbatim; treat them as the paper's canonical citation surface.
Journal-specific templates (USER MUST VERIFY against the journal's current instructions for authors): references/journal_summarybox_templates.yaml.
Section and subsection headings should state a claim, not a generic label. "Model underperforms on rare-finding subset" beats "Subgroup analysis".
In the Methods and in at least one Results paragraph, compress primary-outcome statistics into a single sentence pattern: "On the internal test set (n = 842), the model achieved AUC 0.94 (95 % CI 0.91–0.96), sensitivity 88.2 % (85.1–91.0), specificity 91.4 % (88.7–93.6), at an operating point of 0.37."
This pattern is the canonical shape LLM extractors parse first.
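Generating the sentence from the result values keeps the pattern identical across Methods, Results, and captions. The sketch below is one possible formatter; the function name, argument layout, and rounding (two decimals for AUC and operating point, one for percentages) are assumptions taken from the example sentence, not a prescribed format.

```python
def format_primary_outcome(n, auc, sensitivity, specificity, operating_point,
                           split="internal test set"):
    """Build the single-sentence statistics pattern.

    auc, sensitivity and specificity are (estimate, ci_low, ci_high) tuples;
    sensitivity and specificity are given in percent.
    """
    a, a_lo, a_hi = auc
    s, s_lo, s_hi = sensitivity
    p, p_lo, p_hi = specificity
    return (
        f"On the {split} (n = {n}), the model achieved "
        f"AUC {a:.2f} (95 % CI {a_lo:.2f}–{a_hi:.2f}), "
        f"sensitivity {s:.1f} % ({s_lo:.1f}–{s_hi:.1f}), "
        f"specificity {p:.1f} % ({p_lo:.1f}–{p_hi:.1f}), "
        f"at an operating point of {operating_point:.2f}."
    )
```

Calling it with the values from the example, `format_primary_outcome(842, (0.94, 0.91, 0.96), (88.2, 85.1, 91.0), (91.4, 88.7, 93.6), 0.37)`, reproduces the sentence quoted in the rule.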
Include a labeled block (typically end of Methods or a standalone Data/Code Availability section) listing: data availability and license, code availability with DOI, model weights and checkpoints, prompts and configuration files, random seeds, compute environment. This block is disproportionately scraped by AI agents when they cite a paper as reproducible.
List limitations explicitly and name each one (generalizability, spectrum bias, dataset shift, single-center training, label noise). Papers with enumerated limitations score higher for trustworthiness in LLM summarization benchmarks.
Each caption should re-state the claim, the dataset, and the metric. Captions survive in vector databases and image-retrieval indexes when surrounding body text is lost.
Most medical-AI venues allow preprints (Radiology, RYAI, Lancet DH, npj DM, Nature Medicine, JAMIA, JMIR, Cell Reports Medicine, Cell Patterns). A few have restrictions or require disclosure. Always verify the current policy on Sherpa Romeo or the journal's instructions-for-authors page before posting.
Plan launch activities around these windows.
Prefer gold OA with CC-BY when budget allows. If not, green OA via preprint plus author-accepted manuscript is acceptable. Closed-access papers without preprint lose roughly 30–50 % of AI-tool citations because Elicit, Consensus, and Perplexity Academic cannot extract from paywalled PDFs.
Funder OA-policy decision tree (Plan S, NIH, UKRI, Gates, Wellcome, NRF, MoHW): references/oac_funding_checklist.yaml.
Review articles function as hub nodes in knowledge graphs and accrue "lookup citations" when readers need a canonical reference for a taxonomy. For researchers building a portfolio in medical AI:
Empirically, review papers with these properties outperform original research on short-term FWCI while feeding traffic to the authors' original papers through reverse citation.
pip install or git clone && make demo. Should work in under 5 minutes.
Add a CITATION.cff file at repository root. GitHub renders it as a "Cite this repository" button, and AI agents treat it as the primary citation hint. Include authors with ORCID, title, version, DOI (post-Zenodo-archive), repository URL, and license.
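A CITATION.cff file with the fields listed above can be rendered from a few values. This is a minimal sketch, assuming Citation File Format 1.2.0 key names; the function name and the input dictionary keys (`family`, `given`, `orcid`) are hypothetical, and any DOI or ORCID passed in is the caller's own, not something the sketch supplies.

```python
import json


def render_citation_cff(title, version, doi, repo_url, license_id, authors):
    """Render a minimal CITATION.cff (CFF 1.2.0) as text.

    authors is a list of dicts with the hypothetical keys
    "family", "given" and "orcid" (bare ORCID iD, no URL prefix).
    """
    q = json.dumps  # JSON strings are valid YAML double-quoted scalars
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {q(title)}",
        f"version: {q(version)}",
        f"doi: {q(doi)}",
        f"repository-code: {q(repo_url)}",
        f"license: {q(license_id)}",
        "authors:",
    ]
    for a in authors:
        lines.append(f"  - family-names: {q(a['family'])}")
        lines.append(f"    given-names: {q(a['given'])}")
        lines.append(f"    orcid: {q('https://orcid.org/' + a['orcid'])}")
    return "\n".join(lines) + "\n"
```

Writing the returned text to `CITATION.cff` at the repository root is enough for a first version; the `doi` value should be updated once the Zenodo archive exists.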
Enable GitHub–Zenodo integration for each release. Cite the version-specific DOI in the paper's Data/Code Availability section. Zenodo deposits appear in Google Scholar and OpenAlex, creating an independent citable artifact.
Required keys: license, library_name, tags, datasets, base_model (when fine-tuning), pipeline_tag. Required prose sections: Intended use, Training data, Evaluation, Limitations, Ethical considerations, and a clinical-use disclaimer ("This model is not approved for clinical diagnostic use; it is provided for research purposes only").
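The required keys and prose sections can be audited with a small check. The sketch below assumes the card's YAML front matter has already been parsed into a dictionary and the body is available as a string; the function name and the returned report structure are hypothetical, and section detection is a plain case-insensitive substring match.

```python
REQUIRED_KEYS = ["license", "library_name", "tags", "datasets", "pipeline_tag"]
REQUIRED_SECTIONS = [
    "Intended use",
    "Training data",
    "Evaluation",
    "Limitations",
    "Ethical considerations",
]
DISCLAIMER_FRAGMENT = "not approved for clinical diagnostic use"


def audit_model_card(metadata: dict, body: str, fine_tuned: bool = False) -> dict:
    """Report missing metadata keys, missing prose sections, and the disclaimer."""
    # base_model is only required when the model is a fine-tune.
    keys = list(REQUIRED_KEYS) + (["base_model"] if fine_tuned else [])
    lowered = body.lower()
    return {
        "missing_keys": [k for k in keys if not metadata.get(k)],
        "missing_sections": [s for s in REQUIRED_SECTIONS if s.lower() not in lowered],
        "has_disclaimer": DISCLAIMER_FRAGMENT in lowered,
    }
```

An empty `missing_keys`, an empty `missing_sections`, and `has_disclaimer` set to `True` together correspond to a PASS on this item.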
Required prose: license, PHI and re-identification risk, task, language, splits, annotation process, known biases, ethical review status.
ScholarlyArticle / SoftwareSourceCode / Dataset / Person markup in repository pages and author landing pages; templates in references/schema_markup_templates/, validated with python scripts/validate_schema.py path/to/file.jsonld.
Given Agarwal et al. Nat Commun 2025 (doi:10.1038/s41467-025-58551-6) findings that up to 78–90 % of LLM medical citations can be fabricated, take the following defensive steps:
DOI: 10.xxxx/yyyy • PMID: 12345678).
When invoked, run in this order:
pre-draft / drafting / pre-submission / post-acceptance / post-publication.
applies_to_phase field in references/checklists/AIO_GENERAL.md. Out-of-phase rules become NA rather than FAIL (e.g., do not surface §11.5 multi-disciplinary roster or §12 launch sequencing as FAIL on a pre-submission audit). Produce a PASS / PARTIAL / FAIL table sorted by expected_lift (high → medium → low). Render via templates/aio_audit_checklist.md.j2 when programmatic.
defers_to annotations to avoid duplicate audits. Items annotated with a defers_to field record only present/absent status here; item-level detail belongs to the linked skill or reference (§1.6 → /check-reporting; §3.4 / §11.3 → references/oac_funding_checklist.yaml). Cross-check reporting-guideline anchor (§1.6) by invoking /check-reporting first when the manuscript has not been audited; the AIO ↔ guideline-item mapping is in references/reporting_guideline_mapping.md.
pre-draft rules: applies_to_phase filter auto-NAs them once drafting is complete. Section 12 launch sequencing fires only at post-acceptance / post-publication.
post-acceptance time. For multi-repo or Hugging-Face-card team audits, run scripts/batch_metadata_audit.py.
defers_to rule.
expected_lift (high first, then medium, then low). Edits whose underlying rule is low-lift should not appear in the Top 5 unless no high / medium items remain open.
write-paper
## Academic AIO Checklist: [Artifact type]
| # | Item | Status | Note |
|---|------|--------|------|
| 1.1 | Title three-slot | PASS/PARTIAL/FAIL | … |
| 1.2 | Structured abstract | PASS/PARTIAL/FAIL | … |
| ... | ... | ... | ... |
## Top 5 suggested edits
1. …
2. …
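The table ordering and the Top 5 selection rule can be sketched in a few lines. This is an illustration only, assuming each audited rule is a dictionary with the hypothetical keys `id`, `item`, `status`, `note`, and `expected_lift`; it is not the logic of templates/aio_audit_checklist.md.j2, which is not shown here.

```python
LIFT_RANK = {"high": 0, "medium": 1, "low": 2}


def render_table(rows):
    """Render the checklist table, sorted by expected_lift (high, medium, low)."""
    ordered = sorted(rows, key=lambda r: LIFT_RANK[r["expected_lift"]])
    lines = ["| # | Item | Status | Note |", "|---|------|--------|------|"]
    for r in ordered:
        lines.append(f"| {r['id']} | {r['item']} | {r['status']} | {r['note']} |")
    return "\n".join(lines)


def top_edits(rows, limit=5):
    """Return ids of open items for the Top 5 list.

    Low-lift items are included only when no high or medium item is open.
    """
    open_rows = [r for r in rows if r["status"] in ("PARTIAL", "FAIL")]
    preferred = [r for r in open_rows if r["expected_lift"] != "low"]
    pool = preferred if preferred else open_rows
    pool = sorted(pool, key=lambda r: LIFT_RANK[r["expected_lift"]])
    return [r["id"] for r in pool[:limit]]
```

Because Python's sort is stable, items with the same lift keep their original checklist order, so the table stays readable by section number within each lift tier.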
Modern RAG indexes parse Q&A blocks more reliably than free-form prose; LLM citation engines preferentially extract claim-restatement pairs. Section 10 augments retrievability by structuring how claims are restated and how entities are linked.
Add a labeled Q&A block — either as the closing subsection of Discussion, or as a Supplementary Box. Pattern:
This block is the canonical fragment that AI-overview systems extract and cite. Lancet Digital Health "Research in context" already encodes the first two questions; the Q&A block extends them and is parseable by LLM web-search agents.
Define each domain-specific acronym inline on first use AND list them in a Glossary subsection at end of Methods or Supplementary. Attach the canonical entity ID where possible:
Entity linkers in Elicit, Consensus, and SciSpace use this metadata to connect a paper to knowledge graphs.
Avoid bare reference numbers. Use semantic anchor patterns so LLM extractors bind the citation to the specific claim:
When citing one's own prior work, name the cohort or dataset explicitly to enable cross-paper retrieval.
Beyond Section 2.5 (limitations enumeration), include a single-paragraph "Why this is hard" challenge statement near the start of Discussion. Pattern:
"Building accurate [task] for [modality/anatomy] is constrained by [data scarcity / label noise / dataset shift / regulatory uncertainty / interpretability]. Each of these has been documented [refs], and our results address [subset]."
LLM web-search systems quote challenge statements as authoritative summaries of field state. The 2025 KJR multimodal-LLM review used this pattern (e.g., "lack of large-scale high-quality multimodal datasets") and was preferentially extracted by Perplexity and ChatGPT web (see references/case_studies/kjr_mllm_2025.md).
Topic timing is the most under-discussed AIO lever. Reviews and original research published at the peak of a topic's hype curve accrue citations disproportionately; reviews that lag the peak by 6–12 months under-perform regardless of quality.
Signals that a topic is approaching peak (write now, publish in ~6 months):
Plan submission so publication lands at peak, not after.
If a corresponding author serves on the target journal's editorial board, review-process median time often drops noticeably (KJR: ~4–6 weeks faster; varies by journal). Editor's-pick or issue-highlight selection can also drive Google News indexing within 24 hours of publication.
When recruiting senior co-authors for a review paper, prefer those who hold an editorial role at the target venue. This is a legitimate editorial signal, not a conflict-of-interest issue, provided board members recuse themselves from review of their own submissions per ICMJE guidance.
Open-access journals that automatically deposit to PubMed Central (PMC) reach LLM crawlers within 4–6 weeks of publication; non-PMC OA journals can take 3–6 months. PMC-auto-deposit journals in radiology/medical-AI (verify per submission, policies change):
When all else is equal, prefer PMC-auto-deposit journals to compress the LLM-discoverability window.
Discussion sections should anchor the paper in 5–10 high-visibility prior works that LLM training corpora already index well. This raises co-citation probability and makes the paper retrievable when users query the seminal works.
Author-affiliation diversity multiplies indexing entry points. A 10–15 author team spanning 3+ institutions and 2+ disciplines (clinical + computational) creates more author-entity nodes in Google Scholar and Semantic Scholar, each acting as a discovery surface. The 2025 KJR MLLM review used a 15-author team spanning resident + engineer + medical student + faculty across 5 institutions and accrued 64 citations within 7 months (see case study).
Section 3.5 (post-acceptance channel checklist) is unordered; Section 12 prescribes the timing. The first 30 days after publication are the primary discoverability window for AI-search engines and LLM training-data harvesters.
[UNVERIFIED - NEEDS MANUAL CHECK].