Integrates the Honcho memory library into Python or TypeScript codebases for stateful AI agents. Guides SDK setup, peer/session configuration, dialectic chat endpoints, and bot framework integration.
Install: npx claudepluginhub plastic-labs/claude-honcho --plugin honcho-dev
Honcho is an open source memory library for building stateful agents. It works with any model, framework, or architecture. You send Honcho the messages from your conversations, and custom reasoning models process them in the background — extracting premises, drawing conclusions, and building rich representations of each participant over time. Your agent can then query those representations on-demand ("What does this user care about?", "How technical is this person?") and get grounded, reasoned answers.
The key mental model: Peers are any participant — human or AI. Both are represented the same way. Observation settings (observe_me, observe_others) control which peers Honcho reasons about. Typically you want Honcho to model your users (observe_me=True) but not your AI assistant (observe_me=False). Sessions scope conversations between peers. Messages are the raw data you feed in — Honcho reasons about them asynchronously and stores the results as the peer's representation. No messages means no reasoning means no memory.
Your agent accesses this memory through peer.chat(query) (ask a natural language question, get a reasoned answer), session.context() (get formatted conversation history + representations), or both.
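Here is a minimal end-to-end sketch of that model in Python (the workspace, peer, and session IDs are illustrative):

from honcho import Honcho, PeerConfig
import os

honcho = Honcho(workspace_id="my-app", api_key=os.environ["HONCHO_API_KEY"])

# The human is modeled; the AI assistant is not
user = honcho.peer("user-123")
assistant = honcho.peer("assistant", configuration=PeerConfig(observe_me=False))

# A session scopes their conversation; messages feed Honcho's reasoning
session = honcho.session("conversation-123")
session.add_messages([
    user.message("I prefer short, direct answers."),
    assistant.message("Got it, I'll keep responses brief."),
])

# Later, query the user's representation on demand
print(user.chat("What communication style does this user prefer?"))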
Follow these phases in order:
Before asking the user anything, explore the codebase to understand how it's structured and where LLM calls happen. Use Glob and Grep to find:
**/*.py or **/*.ts files containing "openai", "anthropic", "llm", "chat", or "message"

Bot framework detected? If the codebase is built around an agent loop, tool registry, session manager, and message bus (e.g., nanobot, openclaw, picoclaw), read {baseDir}/references/bot-frameworks.md for framework-specific integration guidance and check {baseDir}/references/bot-frameworks/<framework>/ for concrete reference implementations.
After exploring the codebase, use the AskUserQuestion tool to clarify integration requirements. Ask these questions (adapt based on what you learned in Phase 1):
Ask about which entities should be Honcho peers.
Ask how they want to use Honcho context.
Ask about conversation structure.
If they chose pre-fetch, ask what context matters.
Based on interview responses, implement the integration:
Default AI peers to observe_me=False (unless the user specifically wants AI observation).

Check the latest SDK versions at https://docs.honcho.dev/changelog/introduction
The packages are honcho-ai (Python) and @honcho-ai/sdk (TypeScript). To get an API key, ask the user to create one at https://app.honcho.dev and add it to the environment.
uv add honcho-ai
bun add @honcho-ai/sdk
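Then export the key in the environment your app runs in (placeholder value shown):

export HONCHO_API_KEY=your-key-here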
TypeScript — The SDK is async by default. All methods return promises. No separate sync API.
Python — The SDK provides both sync and async interfaces:
from honcho import Honcho — use in sync frameworks (Flask, Django, CLI scripts)
from honcho import Honcho with the .aio namespace — use in async frameworks (FastAPI, Starlette, async workers)

# Sync usage (Flask, Django, scripts)
from honcho import Honcho
import os

honcho = Honcho(workspace_id="my-app", api_key=os.environ["HONCHO_API_KEY"])
peer = honcho.peer("user-123")
response = peer.chat("What does this user prefer?")
# Async usage (FastAPI, Starlette)
from honcho import Honcho
import os

honcho = Honcho(workspace_id="my-app", api_key=os.environ["HONCHO_API_KEY"])
peer = honcho.aio.peer("user-123")
response = await peer.chat("What does this user prefer?")
Match the client to the framework — check whether the codebase uses async def handlers or sync def handlers and choose accordingly. The rest of this skill shows sync Python examples; swap to .aio equivalents for async codebases.
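For instance, a FastAPI handler routes every call through .aio. A minimal sketch, assuming the async namespace mirrors the sync methods one-to-one as described above (route and query text are illustrative):

from fastapi import FastAPI
from honcho import Honcho
import os

app = FastAPI()
honcho = Honcho(workspace_id="my-app", api_key=os.environ["HONCHO_API_KEY"])

@app.get("/context/{user_id}")
async def user_context(user_id: str):
    # Same call shape as the sync API, awaited via the .aio namespace
    peer = honcho.aio.peer(user_id)
    answer = await peer.chat("What does this user care about?")
    return {"answer": answer}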
Use ONE workspace for your entire application. The workspace name should reflect your app/product.
Python:
from honcho import Honcho
import os
# Sync client (Flask, Django, scripts)
honcho = Honcho(
    workspace_id="your-app-name",
    api_key=os.environ["HONCHO_API_KEY"],
    environment="production"
)
# Async client (FastAPI, Starlette) — use honcho.aio for all operations
# honcho.aio.peer(), honcho.aio.session(), etc.
TypeScript:
import { Honcho } from '@honcho-ai/sdk';
// All methods are async by default
const honcho = new Honcho({
  workspaceId: "your-app-name",
  apiKey: process.env.HONCHO_API_KEY,
  environment: "production"
});
Create peers for every entity in your business logic - users AND AI assistants.
Python:
from honcho import PeerConfig
# Human users
user = honcho.peer("user-123")
# AI assistants - set observe_me=False so Honcho doesn't model the AI
assistant = honcho.peer("assistant", configuration=PeerConfig(observe_me=False))
support_bot = honcho.peer("support-bot", configuration=PeerConfig(observe_me=False))
TypeScript:
// Human users
const user = await honcho.peer("user-123");
// AI assistants - set observeMe=false so Honcho doesn't model the AI
const assistant = await honcho.peer("assistant", { configuration: { observeMe: false } });
const supportBot = await honcho.peer("support-bot", { configuration: { observeMe: false } });
Sessions can have multiple participants. Configure observation settings per-peer.
Python:
from honcho.api_types import SessionPeerConfig
session = honcho.session("conversation-123")
# User is observed (Honcho builds a model of them)
user_config = SessionPeerConfig(observe_me=True, observe_others=True)
# AI is NOT observed (no model built of the AI)
ai_config = SessionPeerConfig(observe_me=False, observe_others=True)
session.add_peers([
    (user, user_config),
    (assistant, ai_config)
])
TypeScript:
const session = await honcho.session("conversation-123");
await session.addPeers([
  [user, { observeMe: true, observeOthers: true }],
  [assistant, { observeMe: false, observeOthers: true }]
]);
Feed messages into the session as conversations happen; Honcho reasons about them asynchronously.
Python:
session.add_messages([
    user.message("I'm having trouble with my account"),
    assistant.message("I'd be happy to help. What seems to be the issue?"),
    user.message("I can't reset my password")
])
TypeScript:
await session.addMessages([
  user.message("I'm having trouble with my account"),
  assistant.message("I'd be happy to help. What seems to be the issue?"),
  user.message("I can't reset my password")
]);
Make Honcho's chat endpoint available as a tool for your AI agent. This lets the agent query user context on-demand.
Python (OpenAI function calling):
import json
import openai
import os
from honcho import Honcho

honcho = Honcho(workspace_id="my-app", api_key=os.environ["HONCHO_API_KEY"])
# Define the tool for your agent
honcho_tool = {
    "type": "function",
    "function": {
        "name": "query_user_context",
        "description": "Query Honcho to retrieve relevant context about the user based on their history and preferences. Use this when you need to understand the user's background, preferences, past interactions, or goals.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "A natural language question about the user, e.g. 'What are this user's main goals?' or 'What communication style does this user prefer?'"
                }
            },
            "required": ["query"]
        }
    }
}
def handle_honcho_tool_call(user_id: str, query: str) -> str:
    """Execute the Honcho chat tool call."""
    peer = honcho.peer(user_id)
    return peer.chat(query)
# Use in your agent loop
def run_agent(user_id: str, user_message: str):
    messages = [{"role": "user", "content": user_message}]
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=messages,
        tools=[honcho_tool]
    )
    # Handle tool calls
    message = response.choices[0].message
    if message.tool_calls:
        messages.append(message)
        for tool_call in message.tool_calls:
            if tool_call.function.name == "query_user_context":
                args = json.loads(tool_call.function.arguments)
                result = handle_honcho_tool_call(user_id, args["query"])
                # Continue the conversation with the tool result
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })
        response = openai.chat.completions.create(
            model="gpt-4",
            messages=messages,
            tools=[honcho_tool]
        )
    return response.choices[0].message.content
TypeScript (OpenAI function calling):
import OpenAI from 'openai';
import { Honcho } from '@honcho-ai/sdk';
const honcho = new Honcho({
  workspaceId: "my-app",
  apiKey: process.env.HONCHO_API_KEY
});
const honchoTool: OpenAI.ChatCompletionTool = {
  type: "function",
  function: {
    name: "query_user_context",
    description: "Query Honcho to retrieve relevant context about the user based on their history and preferences.",
    parameters: {
      type: "object",
      properties: {
        query: {
          type: "string",
          description: "A natural language question about the user"
        }
      },
      required: ["query"]
    }
  }
};
async function handleHonchoToolCall(userId: string, query: string): Promise<string> {
  const peer = await honcho.peer(userId);
  return await peer.chat(query);
}
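A dispatch loop mirroring the Python example might look like this (a sketch; the runAgent name, model choice, and return handling are illustrative):

async function runAgent(userId: string, userMessage: string): Promise<string | null> {
  const openai = new OpenAI();
  const messages: OpenAI.ChatCompletionMessageParam[] = [
    { role: "user", content: userMessage }
  ];
  let response = await openai.chat.completions.create({
    model: "gpt-4",
    messages,
    tools: [honchoTool]
  });
  const message = response.choices[0].message;
  if (message.tool_calls) {
    messages.push(message);
    for (const toolCall of message.tool_calls) {
      if (toolCall.function.name === "query_user_context") {
        const args = JSON.parse(toolCall.function.arguments);
        // Feed the tool result back so the model can finish its answer
        messages.push({
          role: "tool",
          tool_call_id: toolCall.id,
          content: await handleHonchoToolCall(userId, args.query)
        });
      }
    }
    response = await openai.chat.completions.create({
      model: "gpt-4",
      messages,
      tools: [honchoTool]
    });
  }
  return response.choices[0].message.content;
}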
For simpler integrations, fetch user context before the LLM call using pre-defined queries.
Python:
def get_user_context_for_prompt(user_id: str) -> dict:
    """Fetch key user attributes via targeted Honcho queries."""
    peer = honcho.peer(user_id)
    return {
        "communication_style": peer.chat("What communication style does this user prefer? Be concise."),
        "expertise_level": peer.chat("What is this user's technical expertise level? Be concise."),
        "current_goals": peer.chat("What are this user's current goals or priorities? Be concise."),
        "preferences": peer.chat("What key preferences should I know about this user? Be concise.")
    }
def build_system_prompt(user_context: dict) -> str:
    return f"""You are a helpful assistant. Here's what you know about this user:

Communication style: {user_context['communication_style']}
Expertise level: {user_context['expertise_level']}
Current goals: {user_context['current_goals']}
Key preferences: {user_context['preferences']}

Tailor your responses accordingly."""
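Wiring the two together before each LLM call (a sketch reusing the openai import from earlier; the respond name and model choice are illustrative):

def respond(user_id: str, user_message: str) -> str:
    # Pre-fetch Honcho context, then inject it into the system prompt
    context = get_user_context_for_prompt(user_id)
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": build_system_prompt(context)},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content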
TypeScript:
async function getUserContextForPrompt(userId: string): Promise<Record<string, string>> {
  const peer = await honcho.peer(userId);
  const [style, expertise, goals, preferences] = await Promise.all([
    peer.chat("What communication style does this user prefer? Be concise."),
    peer.chat("What is this user's technical expertise level? Be concise."),
    peer.chat("What are this user's current goals or priorities? Be concise."),
    peer.chat("What key preferences should I know about this user? Be concise.")
  ]);
  return {
    communicationStyle: style,
    expertiseLevel: expertise,
    currentGoals: goals,
    preferences
  };
}
Use context() for conversation history with built-in LLM formatting.
Python:
import openai
from honcho import PeerConfig
session = honcho.session("conversation-123")
user = honcho.peer("user-123")
assistant = honcho.peer("assistant", configuration=PeerConfig(observe_me=False))
# Get context formatted for your LLM
context = session.context(
    tokens=2000,
    peer_target=user.id,  # Include representation of this user
    summary=True  # Include conversation summaries
)
# Convert to OpenAI format
messages = context.to_openai(assistant=assistant)
# Or Anthropic format
# messages = context.to_anthropic(assistant=assistant)
# Add the new user message
messages.append({"role": "user", "content": "What should I focus on today?"})
response = openai.chat.completions.create(
    model="gpt-4",
    messages=messages
)
# Store the exchange
session.add_messages([
    user.message("What should I focus on today?"),
    assistant.message(response.choices[0].message.content)
])
TypeScript:
import OpenAI from 'openai';
const session = await honcho.session("conversation-123");
const user = await honcho.peer("user-123");
const assistant = await honcho.peer("assistant", { configuration: { observeMe: false } });
// Get context formatted for your LLM
const context = await session.context({
  tokens: 2000,
  peerTarget: user.id, // Include representation of this user
  summary: true // Include conversation summaries
});
// Convert to OpenAI format
const messages = context.toOpenAI(assistant);
// Or Anthropic format
// const messages = context.toAnthropic(assistant);
// Add the new user message
messages.push({ role: "user", content: "What should I focus on today?" });
const openai = new OpenAI();
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages
});
// Store the exchange
await session.addMessages([
  user.message("What should I focus on today?"),
  assistant.message(response.choices[0].message.content!)
]);
To stream chat responses chunk by chunk:
Python:
stream = peer.chat_stream("What do we know about this user?")
for chunk in stream:
    print(chunk, end="", flush=True)
TypeScript:
const stream = await peer.chatStream("What do we know about this user?");
for await (const chunk of stream) {
  process.stdout.write(chunk);
}
When integrating Honcho into an existing codebase:
Install the SDK: uv add honcho-ai (Python) or bun add @honcho-ai/sdk (TypeScript).
Set the HONCHO_API_KEY environment variable.
Set observe_me=False for AI peers unless you specifically want Honcho to model your AI's behavior.
Call add_messages() to feed Honcho's reasoning engine.