DeepEval

Name: DeepEval
Author: confident-ai

Evaluate and improve LLM applications by instrumenting agents, chatbots, and RAG pipelines with DeepEval tracing, generating test suites, running evaluations, and exporting traces to Confident AI for observability and iterative refinement.

npx claudepluginhub confident-ai/deepeval --plugin deepeval

Popularity

Stars: 16,733 (Top 1%; median 0, average 452)

Installs: Top 10% (median 0, average 2)

What's Inside

Skills (3)

deepeval-otel

/deepeval-otel

Export raw OpenTelemetry traces from an AI application to Confident AI's Observatory. TRIGGER when the user wants to send OpenTelemetry or OTLP traces/spans from an LLM app, agent, RAG pipeline, or chatbot to Confident AI; configure the Confident AI OTLP endpoint; set confident.span.* or confident.trace.* attributes; export AI-app traces to Confident AI without the deepeval Python package; wire an OTLPSpanExporter, OpenTelemetry Collector, or vendor-neutral OTel SDK to Confident AI; or pick the US vs EU Confident AI OTLP endpoint. Language-agnostic — the mechanism is OTLP attribute keys plus an exporter endpoint. DO NOT TRIGGER for building DeepEval pytest eval suites, datasets, goldens, metrics, or deepeval test run (use the `deepeval` skill); for instrumenting with the DeepEval SDK's @observe decorator or framework integrations (use the `deepeval-tracing` skill); or for instrumenting non-AI software such as web servers, CRUD backends, or infrastructure — the confident.* attributes describe AI components (agents, LLM calls, retrievers, tools) and apply to AI applications only.
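The namespaced attribute keys mentioned above can be sketched in plain Python. This is a minimal illustration, assuming only that keys follow the `confident.span.*` and `confident.trace.*` prefixes named in the description. The helper `confident_attributes` and the suffixes `type` and `name` are hypothetical placeholders, so the real key names should be taken from the skill's own reference.

```python
from typing import Dict, Union

AttrValue = Union[str, int, float, bool]

# Only these two scopes are named in the skill description.
_SCOPES = ("span", "trace")


def confident_attributes(scope: str, values: Dict[str, AttrValue]) -> Dict[str, AttrValue]:
    """Prefix each key with 'confident.<scope>.' (hypothetical helper)."""
    if scope not in _SCOPES:
        raise ValueError(f"scope must be one of {_SCOPES}, got {scope!r}")
    return {f"confident.{scope}.{key}": value for key, value in values.items()}


# The suffixes 'type' and 'name' are illustrative placeholders, not confirmed keys.
span_attrs = confident_attributes("span", {"type": "llm", "name": "answer_question"})
trace_attrs = confident_attributes("trace", {"name": "support_chat"})
print(span_attrs)
print(trace_attrs)
```

With an OpenTelemetry SDK, a dictionary like this would typically be handed to the span's attribute-setting call (for example `span.set_attributes(...)` in the Python SDK), which keeps the approach language-agnostic as the description states.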

deepeval-tracing

/deepeval-tracing

Instrument an AI application with DeepEval's native tracing so its behavior is visible in Confident AI. TRIGGER when the user wants to add DeepEval tracing or @observe to an LLM app, agent, RAG pipeline, or chatbot; wire a framework, model-provider, or vector-database integration (LangGraph, LangChain, OpenAI Agents, LlamaIndex, Pydantic AI, CrewAI, and others); choose between a native integration and manual instrumentation; set span types, tags, or metadata; or send DeepEval-SDK traces to Confident AI's Observatory. DO NOT TRIGGER for building DeepEval pytest eval suites, datasets, goldens, metrics, or deepeval test run (use the `deepeval` skill), or for raw OpenTelemetry / OTLP export without the deepeval package (use the `deepeval-otel` skill). This skill is purely DeepEval-SDK instrumentation — producing well-formed traces, not running evals.

deepeval

/deepeval

DeepEval evaluation workflow for AI agents and LLM applications. TRIGGER when the user wants to evaluate or improve an AI agent, tool-using workflow, multi-turn chatbot, RAG pipeline, or LLM app; add evals; generate datasets or goldens; use deepeval generate; use deepeval test run; send results to Confident AI; monitor production; run online evals; inspect traces; or iterate on prompts, tools, retrieval, or agent behavior from eval failures. AI agents are the primary use case. Covers Python SDK, pytest eval suites, CLI generation, traced evals, Confident AI reporting, and agent-driven improvement loops. DO NOT TRIGGER for unrelated generic pytest, non-AI test setup, or non-DeepEval observability work unless the user asks to compare or migrate to DeepEval; for instrumenting an app with DeepEval tracing, @observe, or framework integrations (use the `deepeval-tracing` skill); or for raw OpenTelemetry / OTLP export without the deepeval package (use the `deepeval-otel` skill).
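The TRIGGER / DO NOT TRIGGER rules across the three skills amount to a small routing decision. The sketch below is a toy keyword router that only illustrates those boundaries. It is not how the plugin selects a skill, and the function name `pick_skill` and the keyword lists are assumptions made for illustration.

```python
def pick_skill(request: str) -> str:
    """Toy router mirroring the skill boundaries described above (hypothetical)."""
    text = request.lower()
    # Raw OpenTelemetry / OTLP export without the deepeval package.
    if any(word in text for word in ("opentelemetry", "otlp", "otel")):
        return "deepeval-otel"
    # DeepEval-SDK instrumentation: @observe and framework integrations.
    if any(word in text for word in ("@observe", "tracing", "instrument")):
        return "deepeval-tracing"
    # Evals, datasets, goldens, metrics, and `deepeval test run`.
    return "deepeval"


for example in (
    "Send OTLP spans to Confident AI",
    "Add @observe to my LangGraph agent",
    "Generate goldens and run deepeval test run",
):
    print(example, "->", pick_skill(example))
```

The order of the checks matters: OTLP export is tested first because the other two skills explicitly defer to `deepeval-otel` for that case, and evaluation work is the fallback.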

Stats

Version: 1.0.0
Released: May 28, 2026
Language: Python
Stars: 16,733
Forks: 1,634
Copy clicks: 1
Maintenance: Excellent
License: Apache-2.0
Last Commit: Jul 9, 2026
Added: May 21, 2026

Available In

deepeval-plugins (16,704)

README

The LLM Evaluation Framework

Documentation | Metrics and Features | Getting Started | Integrations | Confident AI

🔥 Metrics and Features

Similar Plugins

evaluate-agent

evals-skills

promptfoo-evals

langsmith-skills

opik

langfuse-pack

More by confident-ai

Confident AI Client
