AI agent observability and shared memory plugins by Pentatonic
```
npx claudepluginhub pentatonic-ltd/ai-agent-sdk
```

AI agent memory, local or hosted. 7-layer hybrid retrieval (BM25 + vector + KG + reranker) backed by the Pentatonic memory engine.
Observability, memory, and analytics for LLM applications.
Provider-agnostic. JavaScript & Python.
The Pentatonic AI Agent SDK instruments your LLM applications with zero-config observability. Wrap any OpenAI, Anthropic, or Cloudflare Workers AI client and get automatic per-call event tracking.

To get started, run:

```bash
npx @pentatonic-ai/ai-agent-sdk init
```
This walks you through account creation, email verification, and API key generation. You'll get:
```
TES_ENDPOINT=https://your-company.api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx
```
```bash
# JavaScript
npm install @pentatonic-ai/ai-agent-sdk

# Python
pip install pentatonic-ai-agent-sdk
```
JavaScript

```js
import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

// Auto-instruments every create() call
const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });

const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
```
Python

```python
import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

ai = tes.wrap(OpenAI(), session_id="conv-123")

result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
That's it. Every call emits a CHAT_TURN event with token usage, tool calls, and model info.
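The event schema itself isn't documented in this README; purely as an illustration (every field name below is hypothetical, not the SDK's actual schema), a CHAT_TURN event carrying that information might look roughly like:

```json
{
  "type": "CHAT_TURN",
  "sessionId": "conv-123",
  "model": "gpt-4o",
  "usage": { "prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21 },
  "toolCalls": []
}
```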
| Provider | Detection | Intercepted Method |
|---|---|---|
| OpenAI | `client.chat.completions.create` | `chat.completions.create()` |
| Anthropic | `client.messages.create` | `messages.create()` |
| Workers AI | `client.run` (JS only) | `run()` |
All other methods pass through unchanged.
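To make the pass-through behavior concrete, here is a minimal sketch of the pattern (this is NOT the SDK's actual implementation, and `wrapOneMethod` is a hypothetical helper): a `Proxy` intercepts one named method and forwards every other property access to the underlying client unchanged.

```javascript
// Sketch of the pass-through pattern, not the SDK's real code:
// intercept one named method, forward everything else untouched.
function wrapOneMethod(client, methodName, onCall) {
  return new Proxy(client, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (prop === methodName && typeof value === "function") {
        return (...args) => {
          onCall(args); // record an event, then delegate to the real method
          return value.apply(target, args);
        };
      }
      // Any other method or property passes through unchanged
      return typeof value === "function" ? value.bind(target) : value;
    },
  });
}

// Demo with a fake client: `create` is intercepted, `listModels` is not.
const events = [];
const fake = {
  create(req) { return { model: req.model }; },
  listModels() { return ["gpt-4o"]; },
};
const wrapped = wrapOneMethod(fake, "create", (args) => events.push(args));

console.log(wrapped.create({ model: "gpt-4o" }).model); // "gpt-4o"
console.log(events.length); // 1
console.log(wrapped.listModels()); // [ 'gpt-4o' ]
```

The same idea scales to nested paths like `chat.completions.create` by proxying each level; the key property is that unrecognized calls never touch the instrumentation layer.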
For multi-round tool loops, just keep calling the wrapped client. Each call emits its own event, linked by sessionId:
```js
const ai = tes.wrap(new OpenAI(), { sessionId: "conv-101" });

const messages = [{ role: "user", content: "Find me running shoes" }];

// Round 1: the model requests a tool call
const r1 = await ai.chat.completions.create({
  model: "gpt-4o",
  messages,
  tools: [searchTool],
});

// Execute the tool, then feed its result back
const toolCall = r1.choices[0].message.tool_calls[0];
messages.push(r1.choices[0].message);
messages.push({ role: "tool", tool_call_id: toolCall.id, content: toolResult });

// Round 2: the model responds with the final answer
const r2 = await ai.chat.completions.create({
  model: "gpt-4o",
  messages,
});

// No manual emit needed. Both events share sessionId "conv-101".
```
If you need full control over when events are emitted, open a session and record calls made with an unwrapped client:

```js
const openai = new OpenAI(); // raw, unwrapped client

const session = tes.session({ sessionId: "conv-123" });

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What is 2+2?" }],
});

session.record(response);

await session.emitChatTurn({
  userMessage: "What is 2+2?",
  assistantResponse: response.choices[0].message.content,
});
```
Track every Claude Code conversation automatically with shared team memory.
```
/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
```