Plugins listed here are tagged for this technology stack and auto-indexed from public GitHub repositories.
Claude Code plugins tagged for PyTorch development. Browse commands, agents, skills, and more.
Manage the full Hugging Face ML lifecycle from a single agent: search and select models, estimate GPU memory, train or fine-tune with TRL/Unsloth, evaluate locally, build and deploy Gradio demos on Spaces, publish datasets and research papers, and run models in-browser with Transformers.js.
Look up Python code examples and enforce Pythonic style — fetch syntax, concurrency, ML, and HPC references from pythonsheets.com while writing, debugging, or optimizing code, and get linting guidance for readable, idiomatic Python.
Work with Mooncake Python APIs to perform distributed storage operations, RDMA/TCP data transfers, and PyTorch tensor processing.
Autonomously optimize LLM serving infrastructure — profile torch traces, benchmark SGLang/vLLM/TensorRT-LLM, simulate capacity and compute, and run RLCR loops that patch code to match or beat competitor performance. Also includes human-like PR review and incident triage for production serving.
Build, train, and deploy AI models on Amazon SageMaker — validate datasets, select fine-tuning techniques, run SFT/DPO/RLVR training, diagnose HyperPod cluster issues (NCCL, GPU, Slurm), and deploy to endpoints or Bedrock, all from your coding assistant.
Run Claude Code workflows across software lifecycle phases: architect, review, test, deploy, and document with 70+ modular skills for code quality, security auditing, automated releases, manuscript preparation, system diagrams, and agent orchestration.
Provides 197 computational skills for scientific AI agents to perform life sciences research, covering genomics, proteomics, drug discovery, medical imaging, biostatistics, and scientific writing via integrations with databases, analysis tools, and ML frameworks.
Author, optimize, and deploy PyTorch models for on-device execution on Apple silicon using Core AI. Covers op compatibility rules, weight quantization/palettization for accuracy-size tradeoffs, and the full export-compile-run pipeline on Neural Engine and GPU.
Enforces feature-layer architecture for Claude Code projects with battle-tested skills: scaffold bounded layers and feature narratives, run multi-agent swarm reviews, freeze acceptance criteria, and generate pixel art. Includes iOS development, Remotion video production, and humanization tools for AI-generated text.
Migrates AI models and custom operators to Huawei Ascend NPUs, handles GPU-to-NPU code adaptation, profiling, performance optimization, distributed training deployment, and provides developer tooling for vLLM inference, Ascend C/Triton kernel development, and environment diagnostics.
Automates the full academic research workflow: literature search, data processing, idea validation, experimental design, paper drafting and polishing, publication-ready figure generation, peer review simulation, rebuttal crafting, compliance auditing, patent/software registration, and presentation creation, with integrity checks and multi-format typesetting.
Draft and analyze patent applications for USPTO, EPO, and PCT with automated prior art search via BigQuery (100M+ patents), compliance checks against MPEP/EPC rules, and generation of patent-style diagrams. Includes agents for autonomous drafting, claims analysis, and patentability assessment.
Run LLM post-training workflows including SFT, OSFT, LoRA fine-tuning, and GRPO reinforcement learning through a unified interface with automatic GPU memory estimation and environment setup.
Automate end-to-end academic research: write scientific papers with LaTeX/Markdown, search and cite literature, analyze data with Python libraries (pandas, PyTorch), run bioinformatics pipelines, generate publication-quality figures, create posters and presentations, and manage citations and references. Includes tools for grant writing, peer review, and clinical decision support.
Call NVIDIA BioNeMo NIM agents via API or local Docker to automate life science workflows: predict protein structures, dock small molecules, run generative chemistry, design proteins, and analyze genomics—with support for cluster job management, custom pretraining, and fine-tuning.
Automate end-to-end ML performance investigations: research SOTA papers and architectures, generate phased plans, judge experimental methodologies, profile bottlenecks, run metric-improvement campaigns with atomic git commits, auto-rollback on regressions, and leverage specialist agents for data lifecycle and deep paper analysis.
Orchestrate an AI-powered academic research workflow inside Claude Code: manage paper drafting, iterative revision, figure creation, literature search with citation verification, reviewer rebuttal assembly, and project state tracking through specialized agents and automated pipelines.
Enforce a rigorous empirical research pipeline for ML/AI claims: extract competitor baselines, preregister hypotheses, run adversarial falsification, execute locked experiments with statistical checks, and force kill-or-ship decisions based on repository evidence. Integrates with Hugging Face Hub for model training, evaluation, dataset inspection, and paper retrieval.
Guardrail your AI/ML research workflow with an AI collaborator that searches literature using query variations, analyzes codebases and logs, designs minimal falsification experiments, records predictions, and audits bugs.
Automate multi-chip GPU AI inference workflows on the FlagOS platform: kernel generation and review, model migration from upstream vLLM, containerized stack installation and environment verification, and end-to-end performance benchmarking across NVIDIA, AMD, Ascend, and other hardware backends.
Accelerate GPU kernel development with an integrated workflow: query a knowledge base of CUDA, Triton, and CUTLASS patterns, benchmark custom kernels against PyTorch baselines, profile with Nsight Compute, and run iterative optimization loops with correctness checks.
Bootstrap Claude Code with 17 specialized agents, skills, and hooks to audit/evolve .claude/ configs, engineer/refactor Python code via TDD, profile/optimize ML workloads, generate docs/tests, design systems, diagnose issues, and manage workflows professionally.
Installs LVSA and generates long videos with block-sparse attention, automatically selecting SDPA vs FlashInfer backend and configuring reference latent frames per model while verifying sparse path engagement.
Extend video diffusion models with LVSA (Long Video Sparse Attention) support by implementing a ModelAdapter for geometry, QKV extraction, RoPE, and output projection in single-stream, dual-stream, or joint-attention DiTs.
Diagnose NVIDIA Long Video Sparse Attention (LVSA) failures: identify silent dense fallback, out-of-memory at long sequences, missing MP4 outputs in Docker, quality regressions from training references, and environment variable misconfigurations.
Delegate expert-level AI/ML workflows to specialized agents: engineer optimized prompts with evaluation and A/B testing, architect scalable LLM systems with RAG/LoRA fine-tuning, build production NLP pipelines for NER/classification/QA, and deploy optimized models via vLLM/Triton/Docker/K8s for reliability, performance, and cost control.
Run an autonomous optimize-measure-keep/discard cycle on any optimization target: LLM training loss, test speed, bundle size, build time, and more.
Prefix terminal commands with 'gpu' to run ML training, LLM inference, ComfyUI workflows, and media processing on remote NVIDIA GPUs (A100, H100, RTX 4090) from your Mac. Automatically provisions pods, syncs files bidirectionally, streams logs, debugs interactively, selects optimal GPUs, and optimizes costs.
Train, evaluate, export, and deploy NVIDIA TAO computer vision models — covering classification, detection, segmentation, pose estimation, depth, video understanding, and 3D perception — with integrated AutoML, DEFT iterative improvement loops, synthetic data generation, and multi-platform GPU job submission (Docker, Kubernetes, SLURM, Brev, DGX Cloud).
Streamline end-to-end data science and ML workflows: frame business problems into ML tasks, preprocess and validate data with quality checks, perform EDA on diverse formats, design and execute experiments with hyperparameter tuning via Optuna and interpretability via SHAP, audit reproducibility and leakage, evaluate model performance and readiness for deployment, generate model cards, and extract structured learnings into docs.
Turn Claude Code into a professional media production workstation: transcode, stream, package, QC, color-grade, and deliver video/audio across broadcast, OTT, and AI-enhanced pipelines using FFmpeg, OBS, GStreamer, WebRTC, and 90+ open-source media tools.
Equip AI agents with 9 engineering skills to architect scalable backends and distributed systems, secure apps and pipelines, prototype MVPs, build mobile and ML apps, guide frontend development, automate DevOps infrastructure, and plan senior-level software delivery.
Train and run inference on machine learning models using Hugging Face Transformers and PEFT with PyTorch on cloud GPUs from Modal, Lambda Labs, or RunPod—no local GPU required.
Trace PyTorch operator implementations across Python, C++, and CUDA layers, analyze nn.Module-to-native binding chains, map code changes to affected tests, and query dispatch mechanisms.
Apply 97 structured reasoning patterns from history's greatest thinkers to any problem — debug, design, research, or write — using specialized agents that analyze, critique, and synthesize across domains.
Write idiomatic MLX code for machine learning on Apple Silicon, implementing arrays, neural networks, training loops, lazy evaluation, unified memory, Metal GPU acceleration, and PyTorch migrations.