Skill

map

From smg

Maps SMG Rust codebase crates to roles, subsystems, key types, and dependencies. Use to understand structure and ownership before changes.

Rust

npx claudepluginhub lightseekorg/smg-dev-guide --plugin smg

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/smg:map

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

High-performance Rust gateway for LLM inference backends. Routes requests to workers running vLLM, SGLang, TensorRT-LLM with 8 routing policies, KV cache optimization, K8s service discovery, WASM plugins, MCP tool execution, and mesh HA.

SKILL.md

106 lines · ~1.4k tokens

Similar Skills

implement

Enforces step-by-step implementation workflow for features, bug fixes, and changes in SMG repository by detecting subsystem (config, routing, gRPC, bindings, K8s, storage) and loading specific recipes with verifications.

15 files

smg

rust-cargo-assistant

Assists with Cargo.toml configuration, crate dependency management, project initialization, builds, tests, benchmarks, docs, troubleshooting, and best practices for Rust projects.

devkit

rust-pro

38.6k

Provides expert guidance on Rust 1.75+ for building services, libraries, systems tooling with async patterns (Tokio/axum), advanced types, ownership, lifetimes, and performance optimization.

antigravity-awesome-skills

Stats

Stars11

Forks1

MaintenanceExcellent

Last CommitMar 9, 2026

SMG Codebase Map

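What Is SMG?

SMG is a high-performance Rust gateway for LLM inference backends. It routes requests to workers running vLLM, SGLang, or TensorRT-LLM, with 8 routing policies, KV cache optimization, K8s service discovery, WASM plugins, MCP tool execution, and mesh HA.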

Crate Map

| Crate | Role | Key Types |
|---|---|---|
| `model_gateway` | Main binary. HTTP/gRPC handlers, routing engine, service discovery, observability, CLI | `RouterConfig`, `ServerConfig`, `CliArgs` |
| `protocols` | OpenAI-compatible types shared by ALL consumers (config, bindings, API). Sacred: no impl-specific fields. | `WorkerSpec`, `ModelCard`, `WorkerModels`, `ChatCompletionRequest/Response` |
| `kv_index` | KV cache-aware routing. Radix trees (String for HTTP, Token for gRPC), positional indexer | `StringTree`, `TokenTree`, `RadixTree` trait, `PositionalIndexer` |
| `auth` | API key (SHA-256 hashed), JWT/OIDC, role-based access (Admin/User), audit logging | `JwtConfig`, `ApiKeyEntry`, `Principal`, `Role` |
| `mesh` | HA cluster via SWIM gossip. CRDT KV store, partition detection, consistent hashing | `ClusterState`, `WorkerState`, `NodeStatus` |
| `wasm` | WebAssembly plugin system. WIT interface, middleware hooks (OnRequest/OnResponse), LRU cache | `WasmModule`, `Action` (Continue/Reject/Modify) |
| `mcp` | MCP protocol client. Tool discovery, execution, approval workflows, response format translation | `McpConfig`, `McpOrchestrator`, `ToolAnnotations` |
| `grpc_client` | gRPC client for backends. Macros for dedup, streaming, trace injection | `SglangGrpcClient`, `VllmGrpcClient` |
| `data_connector` | Pluggable storage: PostgreSQL, Oracle, Redis, in-memory. Hook system for interception | `StorageBackend` trait, `StorageHook` |
| `tool_parser` | 13+ tool call parsers (JSON, Mistral, Qwen, DeepSeek, Pythonic, etc.). Streaming with incremental JSON | `ToolParser` trait, `ParserFactory`, `StreamingParseResult` |
| `reasoning_parser` | Reasoning extraction from 10+ model families (DeepSeek-R1, Qwen3, Kimi, Cohere). Streaming | `ReasoningParser` trait, `ParserFactory`, `ParserResult` |
| `tokenizer` | LLM tokenization, chat templates | `Tokenizer` |
| `multimodal` | Image/audio processing. Per-model vision specs (LLaVA, Qwen-VL, Llama4, Phi3-V), media fetching | `ImageFrame`, `ChatContentPart`, `MediaConnector` |
| `workflow` | Step-based async workflow engine (wfaas) | `StepExecutor`, `WorkflowContext` |
| `bindings/python` | PyO3 bindings. `Router` class with ~80 constructor params, enum mapping | `Router`, `PolicyType` |
| `bindings/golang` | Go SDK via FFI (cgo). OpenAI-style API, streaming, tool calling | `Client`, `ChatCompletionRequest` |
| `clients/rust` | Rust client library | |
| `grpc_servicer` | Python gRPC servicer wrapping vLLM/SGLang backends | |
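
To make the `kv_index` row concrete, here is a minimal sketch of the idea behind KV cache-aware routing: send a prompt to the worker whose cached prompts share the longest prefix with it. This is a naive linear scan, not the radix tree the crate uses, and `common_prefix_len` and `pick_worker` are hypothetical names invented for illustration.

```rust
/// Number of leading characters shared by `a` and `b`.
pub fn common_prefix_len(a: &str, b: &str) -> usize {
    a.chars().zip(b.chars()).take_while(|(x, y)| x == y).count()
}

/// Pick the worker whose cached prompts share the longest prefix with `prompt`.
/// `cached` pairs a worker name with the prompts assumed to be in its KV cache.
/// Returns `None` when there are no workers; on ties the last worker wins.
pub fn pick_worker<'a>(cached: &[(&'a str, Vec<&str>)], prompt: &str) -> Option<&'a str> {
    cached
        .iter()
        .map(|(worker, prompts)| {
            // Best overlap this worker can offer for the incoming prompt.
            let best = prompts
                .iter()
                .map(|p| common_prefix_len(p, prompt))
                .max()
                .unwrap_or(0);
            (*worker, best)
        })
        .max_by_key(|&(_, len)| len)
        .map(|(worker, _)| worker)
}
```

A radix tree answers the same question without rescanning every cached prompt, which is presumably why `StringTree` and `TokenTree` exist.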

Layering Rule

crates/protocols (shared types — ALL consumers)
    ↑
model_gateway (implementation — ONE consumer writes each field)
    ↑
bindings/* (language SDKs — wrap model_gateway + protocols)

Directory layout: Library crates live under crates/ (e.g. crates/mcp/, crates/mesh/). model_gateway/ and bindings/ remain at repo root.

Iron law: If only one crate writes a field, it doesn't belong in crates/protocols/. K8s-specific, runtime-specific, or gateway-specific fields stay in model_gateway.
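
As a rough illustration of the iron law, the sketch below keeps a K8s-specific field out of the shared type by wrapping it on the gateway side. The struct fields and the `GatewayWorker` type are assumptions made up for this example; the real `WorkerSpec` in `crates/protocols` will look different.

```rust
use std::collections::HashMap;

/// Stand-in for a shared type in crates/protocols (hypothetical fields):
/// only data that every consumer reads or writes.
#[derive(Debug, Clone, PartialEq)]
pub struct WorkerSpec {
    pub url: String,
    pub labels: HashMap<String, String>,
}

/// Hypothetical gateway-side wrapper living in model_gateway.
/// `pod_name` is written by one crate only, so it stays out of `WorkerSpec`.
#[derive(Debug, Clone, PartialEq)]
pub struct GatewayWorker {
    pub spec: WorkerSpec,
    pub pod_name: Option<String>, // K8s-specific, gateway-only
}

impl GatewayWorker {
    /// Wrap a shared spec without touching its definition.
    pub fn from_spec(spec: WorkerSpec) -> Self {
        Self { spec, pod_name: None }
    }

    /// Attach the gateway-only field.
    pub fn with_pod_name(mut self, name: &str) -> Self {
        self.pod_name = Some(name.to_string());
        self
    }
}
```

Bindings and config code keep depending on the shared type alone, so adding a gateway-only field never forces a change in every consumer.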

Config Propagation (3-Stage)

CLI args (main.rs CliArgs) + YAML file (RouterConfig)
    ↓ merge (CLI overrides file)
DiscoveryConfig / RouterConfig (config/types.rs) — serde-friendly, user-facing
    ↓ convert in main.rs (TWO paths: to_router_config + to_server_config)
ServiceDiscoveryConfig / ServerConfig — typed, runtime

Both conversion paths in main.rs must stay in sync. If a new field is wired into only one of them, the corresponding CLI flag or config-file setting is silently ignored.
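
The three stages can be sketched as follows. Only the function names `to_router_config` and `to_server_config` come from the diagram above; the struct names (`CliFlags`, `UserConfig`, `RouterRuntime`, `ServerRuntime`), their fields, and the default values are hypothetical stand-ins, not SMG's real types.

```rust
/// Hypothetical CLI flags: `None` means "not passed on the command line".
#[derive(Debug, Clone, Default)]
pub struct CliFlags {
    pub port: Option<u16>,
    pub policy: Option<String>,
}

/// Hypothetical user-facing config, as loaded from the YAML file.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct UserConfig {
    pub port: Option<u16>,
    pub policy: Option<String>,
}

#[derive(Debug, PartialEq)]
pub struct RouterRuntime {
    pub policy: String,
}

#[derive(Debug, PartialEq)]
pub struct ServerRuntime {
    pub port: u16,
    pub policy: String,
}

/// Merge stage: a CLI value overrides the file value when both are set.
pub fn merge(cli: &CliFlags, file: &UserConfig) -> UserConfig {
    UserConfig {
        port: cli.port.or(file.port),
        policy: cli.policy.clone().or_else(|| file.policy.clone()),
    }
}

/// Shared resolution so both conversion paths agree (the default is an assumption).
fn resolved_policy(cfg: &UserConfig) -> String {
    cfg.policy.clone().unwrap_or_else(|| "round_robin".to_string())
}

/// Conversion path one.
pub fn to_router_config(cfg: &UserConfig) -> RouterRuntime {
    RouterRuntime { policy: resolved_policy(cfg) }
}

/// Conversion path two. It must read every field path one reads.
pub fn to_server_config(cfg: &UserConfig) -> ServerRuntime {
    ServerRuntime {
        port: cfg.port.unwrap_or(8080), // assumed default port
        policy: resolved_policy(cfg),
    }
}
```

Routing both paths through one helper such as `resolved_policy` is one way to keep them from drifting apart; whether SMG does this is not stated in the source.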

Request Flow

Client → HTTP/gRPC handler → Auth middleware → WASM OnRequest
  → Routing policy selects worker → Proxy to backend
  → Stream response → Tool/reasoning parsing → WASM OnResponse → Client

Realtime (WebSocket):
Client → WS upgrade → Realtime session registry → Proxy to backend WS
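
The hook stages in the HTTP/gRPC flow above can be sketched as a small pipeline. The `Continue`/`Reject`/`Modify` variants mirror the `Action` type listed in the crate map, but the payloads, the `Hook` signature, and the example hooks are assumptions; auth and worker selection are left out for brevity.

```rust
/// Outcome of a middleware hook (variant payloads are hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Continue,
    Reject(u16),    // HTTP-style status code
    Modify(String), // replacement body
}

/// A hook inspects the current body and decides what happens next.
pub type Hook = fn(&str) -> Action;

/// Run hooks in order; the first `Reject` short-circuits with its status.
pub fn run_hooks(hooks: &[Hook], body: &str) -> Result<String, u16> {
    let mut current = body.to_string();
    for hook in hooks {
        match hook(&current) {
            Action::Continue => {}
            Action::Modify(new_body) => current = new_body,
            Action::Reject(status) => return Err(status),
        }
    }
    Ok(current)
}

/// OnRequest hooks, then the backend call, then OnResponse hooks.
pub fn handle(
    on_request: &[Hook],
    on_response: &[Hook],
    body: &str,
    backend: fn(&str) -> String,
) -> Result<String, u16> {
    let request = run_hooks(on_request, body)?;
    let response = backend(&request);
    run_hooks(on_response, &response)
}

/// Example hook: reject blank requests.
pub fn reject_empty(body: &str) -> Action {
    if body.trim().is_empty() { Action::Reject(400) } else { Action::Continue }
}

/// Example hook: strip surrounding whitespace.
pub fn trim_body(body: &str) -> Action {
    Action::Modify(body.trim().to_string())
}

/// Stand-in for proxying to a real inference backend.
pub fn echo_backend(request: &str) -> String {
    format!("echo: {request}")
}
```

Because a rejected request returns before `backend` is called, a failing OnRequest hook never reaches a worker.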

Worker Lifecycle (5-Step Workflow)

K8s Pod → PodInfo::from_pod() → handle_pod_event() → Job::AddWorker
  Step 1: Detect Runtime (sglang/vllm/trt)
  Step 2: Discover Connection Mode (HTTP/gRPC)
  Step 3: Discover DP Info (rank/size)
  Step 4: Discover Metadata → flattens into labels HashMap
  Step 5: Create Worker → merge labels, resolve model_id, build ModelCard
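
The five steps above can be sketched as one fail-fast function. Everything here is a simplified assumption: `Probe` holds canned answers in place of real network discovery, and `add_worker`, the `Worker` fields, and the `model_id` label key are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Runtime { Sglang, Vllm, Trt }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ConnectionMode { Http, Grpc }

/// Hypothetical probe results, standing in for calls to the backend.
#[derive(Debug, Clone)]
pub struct Probe {
    pub runtime_hint: String,
    pub grpc: bool,
    pub dp_rank: u32,
    pub dp_size: u32,
    pub metadata: Vec<(String, String)>,
}

#[derive(Debug, PartialEq)]
pub struct Worker {
    pub runtime: Runtime,
    pub mode: ConnectionMode,
    pub dp: (u32, u32), // (rank, size)
    pub model_id: String,
}

/// Run the steps in order; any failing step aborts worker creation.
pub fn add_worker(probe: &Probe) -> Result<Worker, String> {
    // Step 1: detect runtime.
    let runtime = match probe.runtime_hint.as_str() {
        "sglang" => Runtime::Sglang,
        "vllm" => Runtime::Vllm,
        "trt" => Runtime::Trt,
        other => return Err(format!("unknown runtime: {other}")),
    };
    // Step 2: discover connection mode.
    let mode = if probe.grpc { ConnectionMode::Grpc } else { ConnectionMode::Http };
    // Step 3: discover data-parallel info and sanity-check it.
    if probe.dp_size == 0 || probe.dp_rank >= probe.dp_size {
        return Err(format!("invalid DP info: rank {} of {}", probe.dp_rank, probe.dp_size));
    }
    // Step 4: flatten metadata into a labels map.
    let labels: HashMap<String, String> = probe.metadata.iter().cloned().collect();
    // Step 5: create the worker, resolving model_id from labels (assumed key).
    let model_id = labels
        .get("model_id")
        .cloned()
        .unwrap_or_else(|| "unknown".to_string());
    Ok(Worker { runtime, mode, dp: (probe.dp_rank, probe.dp_size), model_id })
}
```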

The Label Pipeline

Central integration pattern. All worker metadata flows as key-value labels:

Source: Backend HTTP endpoints (flattened JSON → HashMap)
Override: WorkerSpec.labels from config (takes precedence)
Consumed: create_worker.rs reads labels to build ModelCard
To inject metadata: add as label — pipeline handles merging
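
A minimal sketch of the merge-then-consume pattern described above, assuming labels are plain `HashMap<String, String>` values. The `ModelCardSketch` type and the `model_id` and `context_length` label keys are hypothetical; the real `ModelCard` and the keys read by create_worker.rs may differ.

```rust
use std::collections::HashMap;

/// Merge discovered labels with config overrides; overrides take precedence.
pub fn merge_labels(
    discovered: &HashMap<String, String>,
    overrides: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut merged = discovered.clone();
    // `extend` overwrites existing keys, which gives config the last word.
    merged.extend(overrides.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}

/// Hypothetical, trimmed-down stand-in for the real ModelCard.
#[derive(Debug, PartialEq)]
pub struct ModelCardSketch {
    pub id: String,
    pub context_length: Option<u32>,
}

/// Consume the merged labels (the key names are assumptions).
pub fn build_model_card(labels: &HashMap<String, String>) -> ModelCardSketch {
    ModelCardSketch {
        id: labels
            .get("model_id")
            .cloned()
            .unwrap_or_else(|| "unknown".to_string()),
        // Non-numeric or missing values simply yield `None`.
        context_length: labels.get("context_length").and_then(|v| v.parse().ok()),
    }
}
```

Under this scheme, injecting a new piece of metadata means adding one label at the source and reading it in the consumer, with no new plumbing in between.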

Essential Commands

cargo +nightly fmt --all                                      # Format
cargo clippy --all-targets --all-features -- -D warnings      # Lint
cargo test                                                     # Test
make python-dev                                                # Python bindings
make pre-commit                                                # All checks

Next Steps

Implementing? Use smg:implement — detects the subsystem and loads step-by-step recipes with verification.
Preparing to ship? Use smg:contribute — enforces quality gates before PR.
Reviewing a PR? Use smg:review-pr — systematic checklist mapped to changed subsystems.
