Skill

langfuse-cost-tuning

Tracks and optimizes LLM costs using Langfuse analytics, Metrics API, model routing, and budget alerts for spending analysis and anomaly detection in AI apps.

TypeScript

Popularity

Parent stars

2,199

Parent forks

296

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/langfuse-pack:langfuse-cost-tuning

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Read, Write, Edit

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Track, analyze, and optimize LLM costs using Langfuse's built-in token/cost tracking, the Metrics API for programmatic cost analysis, model routing for cost reduction, and automated budget alerts.

Supporting Files

references/implementation.md

SKILL.md

262 lines · ~2.2k tokens

Stats

Language: Python

Parent stars: 2,199

Parent forks: 296

Maintenance: Excellent

Last Commit: Apr 3, 2026


Cost Optimization Strategies

Strategy	Savings	Effort	How
Model downgrade	50-95%	Low	Route simple tasks to `gpt-4o-mini`
Prompt optimization	10-30%	Low	Remove filler words, use structured prompts
Response caching	20-80%	Medium	Cache identical prompts with TTL
Batch processing	50%	Medium	Use OpenAI Batch API for offline tasks
Token limits	10-40%	Low	Set `max_tokens` on all calls
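
Three of the strategies above (model downgrade, token limits, response caching) can be combined in a few lines. The following is a minimal sketch, not the skill's own implementation: the `Task` type, the 2,000-character threshold, the `max_tokens` values, the `gpt-4o` fallback, and the `TtlCache` class are all assumptions made for illustration.

```typescript
// Hypothetical task descriptor; these fields are illustrative only.
type Task = { prompt: string; needsReasoning: boolean };

// The two request parameters this sketch decides on.
type CallPlan = { model: string; max_tokens: number };

// Assumed cutoff: short prompts that need no reasoning count as "simple".
const SIMPLE_PROMPT_CHARS = 2000;

// Route simple tasks to the cheaper model and always set a token cap.
function planCall(task: Task): CallPlan {
  const simple =
    !task.needsReasoning && task.prompt.length < SIMPLE_PROMPT_CHARS;
  return simple
    ? { model: "gpt-4o-mini", max_tokens: 256 }
    : { model: "gpt-4o", max_tokens: 1024 };
}

// In-memory cache for identical (model, prompt) pairs with a TTL.
class TtlCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private key(model: string, prompt: string): string {
    return `${model}\u0000${prompt}`;
  }

  get(model: string, prompt: string): string | undefined {
    const k = this.key(model, prompt);
    const hit = this.entries.get(k);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      // Entry is stale: drop it and report a miss.
      this.entries.delete(k);
      return undefined;
    }
    return hit.value;
  }

  set(model: string, prompt: string, value: string): void {
    this.entries.set(this.key(model, prompt), {
      value,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}
```

A caller would check `cache.get(plan.model, task.prompt)` before sending the request and call `cache.set` with the response afterwards. The cache only helps when prompts repeat exactly, so the savings depend on the workload.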

Error Handling

Issue	Cause	Solution
Missing cost data	No `usage` in generation	Ensure `usage` is included with `promptTokens`/`completionTokens`
Wrong cost calculation	Model name mismatch	Use exact model ID (e.g., `gpt-4o-2024-08-06`)
No cost for custom model	No pricing configured	Add model pricing in Langfuse Settings > Model Definitions
Stale pricing	Model prices changed	Update model definitions periodically
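
The failure modes in this table can be reproduced locally to see why a generation ends up without a cost. The sketch below is an illustration only and does not call Langfuse: the `Generation`, `Price`, and `CostResult` types and the `estimateCost` function are hypothetical, and any prices passed in are placeholders rather than real rates.

```typescript
// Token counts as named in the table above.
type Usage = { promptTokens: number; completionTokens: number };

// Hypothetical minimal view of a logged generation.
type Generation = { model: string; usage?: Usage };

// Placeholder pricing shape: USD per one million tokens.
type Price = { inputPerMillion: number; outputPerMillion: number };

type CostResult =
  | { ok: true; costUsd: number }
  | { ok: false; issue: string };

// Mirrors the table: no usage means no cost; an unknown or mismatched
// model ID means no price lookup, hence no cost either.
function estimateCost(gen: Generation, prices: Map<string, Price>): CostResult {
  if (!gen.usage) {
    return { ok: false, issue: "Missing cost data: no usage on generation" };
  }
  const price = prices.get(gen.model); // exact model ID match only
  if (!price) {
    return { ok: false, issue: `No pricing configured for model "${gen.model}"` };
  }
  const costUsd =
    (gen.usage.promptTokens * price.inputPerMillion +
      gen.usage.completionTokens * price.outputPerMillion) /
    1_000_000;
  return { ok: true, costUsd };
}
```

Because the lookup is an exact string match, a generation logged with an ID that differs from the configured one (for example an alias where a dated ID is expected) returns the "No pricing configured" issue, which is the mismatch case in the second row.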

Langfuse Cost Tuning

Overview

Prerequisites

How Langfuse Tracks Costs

Instructions

Step 1: Ensure Token Usage is Captured

Step 2: Query Costs via Metrics API

Step 3: Implement Smart Model Routing

Step 4: Budget Alerts

Langfuse Dashboard Features

Cost Optimization Strategies

Error Handling

Resources
