By langchain-ai
Instrument Python and JavaScript LLM apps with LangSmith tracing using LangChain auto-tracing, decorators, or OpenTelemetry; create, manage, and upload evaluation datasets; build custom evaluators like LLM-as-Judge; run evaluations locally via SDK or CLI, and query/export traces.
npx claudepluginhub langchain-ai/langsmith-skills
INVOKE THIS SKILL when creating evaluation datasets, uploading datasets to LangSmith, or managing existing datasets. Covers dataset types (final_response, single_step, trajectory, RAG), CLI management commands, SDK-based creation, and example management. Uses the langsmith CLI tool.
INVOKE THIS SKILL when building evaluation pipelines for LangSmith. Covers three core components: (1) Creating Evaluators - LLM-as-Judge, custom code; (2) Defining Run Functions - how to capture outputs and trajectories from your agent; (3) Running Evaluations - locally with evaluate() or auto-run via LangSmith. Uses the langsmith CLI tool.
INVOKE THIS SKILL when working with LangSmith tracing OR querying traces. Covers adding tracing to applications and querying/exporting trace data. Uses the langsmith CLI tool.
⚠️ This project is in early development. APIs and skill content may change.
Agent skills for observing and evaluating LLM applications with LangSmith. Query traces, build evaluation datasets, and create custom evaluators — all from your coding agent.
Looking for skills to build and improve agents with LangChain, LangGraph, or Deep Agents? See langchain-skills.
These skills can be installed for any agent supported by skills.sh, including Claude Code, Deep Agents CLI, Cursor, Windsurf, Goose, and many more.
Using npx skills:
Local (current project):
npx skills add langchain-ai/langsmith-skills --skill '*' --yes
Global (all projects):
npx skills add langchain-ai/langsmith-skills --skill '*' --yes --global
To link skills to a specific agent (e.g. Claude Code):
npx skills add langchain-ai/langsmith-skills --agent claude-code --skill '*' --yes --global
Install directly as a Claude Code plugin:
/plugin marketplace add langchain-ai/langsmith-skills
/plugin install langsmith-skills@langsmith-skills
Alternatively, clone the repo and use the install script:
# Install for Claude Code in current directory (default)
./install.sh
# Install for Claude Code in a specific project directory
./install.sh ~/my-project
# Install for Claude Code globally
./install.sh --global
# Install for DeepAgents CLI in a specific project directory
./install.sh --deepagents ~/my-project
# Install for DeepAgents CLI globally (includes agent persona)
./install.sh --deepagents --global
| Flag / Argument | Description |
|---|---|
| DIRECTORY | Target project directory (default: current directory, ignored with --global) |
| --claude | Install for Claude Code (default) |
| --deepagents | Install for DeepAgents CLI |
| --global, -g | Install globally instead of current directory |
| --force, -f | Overwrite skills with same names as this package |
| --yes, -y | Skip confirmation prompts |
After installation, set your API keys:
export LANGSMITH_API_KEY=<your-key>
export OPENAI_API_KEY=<your-key> # For OpenAI models
export ANTHROPIC_API_KEY=<your-key> # For Anthropic models
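Before launching the agent, it can help to confirm that the keys exported above are actually visible in the current shell. The following is a minimal sketch, not part of this package: `check_keys` is a hypothetical helper name, and it only tests that each variable is non-empty, not that the key is valid.

```shell
# Hypothetical helper (not shipped with langsmith-skills):
# prints one "missing: NAME" line per unset or empty variable
# and returns non-zero if any of them is missing.
check_keys() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_keys LANGSMITH_API_KEY OPENAI_API_KEY || echo "export the missing keys first"` reports any key that still needs to be set. Pass only the model provider keys you actually use.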
Then run your coding agent from the directory where you installed (for local installs) or from anywhere (for global installs).
Note: All skills include Python and TypeScript helper scripts for common operations.
Agent configuration lives in config/. To update an existing installation:
./install.sh --force