Deploy vLLM OpenAI-compatible inference servers locally (with hardware detection), via Docker images, or with Kubernetes YAML manifests with GPU support; then benchmark throughput, TTFT, TPOT, inter-token latency, and prefix caching using synthetic data, ShareGPT, or fixed prompts.
A collection of skills for deploying and benchmarking vLLM. This project follows the anthropics/skills template format and is installable as a Claude Code plugin marketplace.
This repository provides modular, reusable agent skills for operating and benchmarking vLLM, following the Anthropic SKILL.md specification. Each skill is a self-contained directory containing the automation scripts and metadata for a specific operational task.
| Skill | Description |
|---|---|
| vllm-deploy-docker | Deploy vLLM using Docker (pre-built images or build-from-source) with NVIDIA GPU support and run the OpenAI-compatible server. |
| vllm-deploy-k8s | Deploy vLLM to Kubernetes with GPU support, health probes, and OpenAI-compatible API endpoint. |
| vllm-deploy-simple | Quick install and deploy vLLM, start serving with a simple LLM, and test OpenAI API. |
| vllm-prefix-cache-bench | Benchmark the efficiency of vLLM automatic prefix caching using fixed prompts, real datasets, or synthetic prefix/suffix patterns. |
| vllm-bench-random-synthetic | Run vLLM performance benchmark using synthetic random data to measure throughput, TTFT, TPOT, and other key performance metrics without downloading external datasets. |
| vllm-bench-serve | Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. |
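Several of the benchmark skills report TTFT, TPOT, and inter-token latency. How these metrics relate can be shown with a minimal sketch using made-up token arrival timestamps; this illustrates the common definitions, not the skills' actual implementation:

```python
def latency_metrics(request_start: float, token_times: list[float]) -> dict:
    """Compute per-request latency metrics from token arrival timestamps.

    request_start: time the request was sent (seconds).
    token_times:   arrival time of each generated token, in order (seconds).
    """
    ttft = token_times[0] - request_start  # Time To First Token
    # Inter-token latencies: gaps between consecutive tokens
    itls = [b - a for a, b in zip(token_times, token_times[1:])]
    # Time Per Output Token: mean gap over all tokens after the first
    tpot = sum(itls) / len(itls) if itls else 0.0
    e2e = token_times[-1] - request_start  # end-to-end latency
    return {"ttft": ttft, "tpot": tpot, "itl": itls, "e2e": e2e}

# Hypothetical timestamps: first token at 0.25 s, then one token every 50 ms
metrics = latency_metrics(0.0, [0.25, 0.30, 0.35, 0.40])
print(metrics["ttft"], round(metrics["tpot"], 3))
```

Throughput-oriented runs aggregate these per-request numbers (e.g., median and p99 TTFT) across many concurrent requests.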
Install directly from the plugin marketplace in Claude Code:
/plugin marketplace add vllm-project/vllm-skills
/plugin install vllm-skills@vllm-skills
Clone the repository and copy skills to your Claude Code skills directory:
git clone https://github.com/vllm-project/vllm-skills.git
cd vllm-skills
Copy to the global skills folder:
cp -r plugins/vllm-skills/skills/vllm-deploy-simple ~/.claude/skills/
Or copy to the project-local skills folder:
cp -r plugins/vllm-skills/skills/vllm-deploy-simple .claude/skills/
Once installed, use the skills with slash commands or natural language:
/vllm-deploy-simple
Deploy vLLM with Qwen2.5-1.5B-Instruct on port 8000
Install and start a vLLM server using the vllm-deploy-simple skill
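Once a server is running, the OpenAI-compatible endpoint can be exercised with a plain HTTP request. A minimal sketch, assuming the server from the example above is listening on localhost:8000; `build_chat_request` is an illustrative helper, not part of the skills:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style /v1/chat/completions request payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Qwen/Qwen2.5-1.5B-Instruct", "Say hello")
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment once a server is running on port 8000
```

The same payload works with any OpenAI-compatible client library by pointing its base URL at the vLLM server.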
See the vLLM documentation for the full list.
This project follows the anthropics/skills template. When adding new skills:
Create a skill directory under plugins/vllm-skills/skills/ (e.g., plugins/vllm-skills/skills/your-skill/) and add a SKILL.md file with YAML frontmatter:
---
name: your-skill
description: Brief description of what this skill does
---
Optionally include scripts/, references/, and assets/ directories.

Licensed under the Apache License 2.0. See LICENSE.