🎣 Hook
“What gets measured gets improved.” Yet most AI-assisted dev workflows remain opaque. One day you’re pair-programming with Claude; the next, your token bill looks suspiciously high and nobody can explain why.
The Claude Code Observability Stack solves that by turning hidden costs and performance quirks into clear Grafana dashboards—vendor-neutral, open source, and deployable in minutes.
🏗️ Why we built it
In our consulting work with Seed-to-Series B teams, two questions surface again and again:
- Are we actually faster with Claude, or just busier?
- Where did last night’s $412 in token spend originate?
Answering requires real telemetry—sessions, tokens, cost, tool usage, latency—made visible to both engineers and finance. The OSS we needed didn’t exist, so we created it.
🛠️ What’s inside the repo
| Layer | Tech | Purpose |
|---|---|---|
| Telemetry ingest | OpenTelemetry Collector | Receives metrics and logs from Claude Code |
| Metrics store | Prometheus | Time-series storage and queries |
| Log store | Loki | Structured event search |
| Visualization | Grafana | Pre-wired dashboard |
| DX helpers | Makefile + Docker Compose | `make up` → full stack |
MIT-licensed, ~50 MB container images—the stack is live in under 90 seconds on a laptop.
🚀 Quick start
git clone https://github.com/ColeMurray/claude-code-otel.git
cd claude-code-otel
make up # Stack online: Grafana → :3000, Prometheus → :9090
Point Claude Code to the collector:
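export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp        # export metrics over OTLP
export OTEL_LOGS_EXPORTER=otlp           # export events/logs over OTLP
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc  # 4317 is the collector's gRPC port
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
claude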
Open Grafana and watch data flow; the dashboard refreshes every 30 seconds by default.
📊 Dashboard tour
- 💰 Cost & Usage – spend by model, token type, and time window.
- 🔧 Tool Performance – frequency and success of each Claude tool.
- ⚡ Latency & Errors – surface slowdowns and HTTP issues quickly.
- 📝 Productivity Metrics – commits, PRs, lines added/removed.
- 🔍 Event Logs – jump from a metric spike straight to the log entry in Loki.
Queries follow Prometheus and OTel best practices (low-cardinality labels, rate calculations over bounded windows), so the stack scales to hundreds of active sessions.
Key extras
- Cardinality toggles – drop session IDs or account UUIDs with env vars.
- Multi-exporter support – ship metrics to Prometheus and Datadog if needed.
- Privacy guardrails – prompt text is redacted by default; enable only when audits demand it.
- MDM-friendly – organization-wide settings via JSON, perfect for larger enterprises.
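The cardinality and privacy toggles above can be sketched as plain environment variables set before launching `claude`. The variable names below are taken from Claude Code's monitoring documentation as we understand it; treat them as assumptions and confirm them against the version you run.

```shell
# Drop the two highest-cardinality attributes from exported metrics.
# (Assumed variable names; verify against your Claude Code version.)
export OTEL_METRICS_INCLUDE_SESSION_ID=false
export OTEL_METRICS_INCLUDE_ACCOUNT_UUID=false

# Prompt text stays redacted unless OTEL_LOG_USER_PROMPTS is set to 1.
# Leaving it unset keeps the default, privacy-preserving behavior.
unset OTEL_LOG_USER_PROMPTS
```

Dropping session IDs trades per-session drill-down for a smaller Prometheus series count, which is usually the right call once many developers share one stack.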
Early-stage outcomes
| Day | Result |
|---|---|
| 1 | Finance sees spend by model; cost conversations become data-driven. |
| 3 | Alerting on token spikes halts runaway jobs in minutes. |
| 5 | Product demonstrates that Claude sessions correlate with a 23% uptick in merged PRs. |
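A token-spike check like the one behind the day-3 result can be approximated with a small threshold script run from cron, before anyone wires up full Grafana or Alertmanager rules. This is a minimal sketch under stated assumptions: the metric name `claude_code_token_usage_tokens_total` is our guess at how the token counter appears in Prometheus, and `exceeds` and `hourly_tokens` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: flag when token usage over the last hour exceeds a limit.
PROM_URL="${PROM_URL:-http://localhost:9090}"
# Assumed metric name; confirm it in the Prometheus UI before relying on it.
TOKEN_QUERY='sum(increase(claude_code_token_usage_tokens_total[1h]))'

# Return 0 (true) when $1 is numerically greater than $2.
# An empty or non-numeric $1 is treated as 0.
exceeds() {
  awk -v value="$1" -v limit="$2" 'BEGIN { exit !(value + 0 > limit + 0) }'
}

# Print the last hour's token count as a bare number,
# or nothing if the query fails or returns no series.
hourly_tokens() {
  curl -sfG --max-time 5 --data-urlencode "query=${TOKEN_QUERY}" \
    "${PROM_URL}/api/v1/query" |
    sed -n 's/.*"value":\[[^,]*,"\([^"]*\)"].*/\1/p'
}

# Usage:
#   exceeds "$(hourly_tokens)" 2000000 && echo "Token spike in the last hour"
```

Because a failed query yields an empty value that compares as 0, the check stays quiet rather than raising a false alarm when Prometheus is unreachable; pair it with a separate health check if silence is a concern.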
About us
We’re a fractional-CTO and AI product studio that prefers observability over guesswork. Rather than billing by the hour, we deliver outcomes: faster releases, predictable costs, happier engineers. Tools like this stack make those outcomes measurable—and repeatable.
Need help integrating it with Kubernetes or mapping cost to cost centers? Let’s talk.
Get the code
github.com/ColeMurray/claude-code-otel
Because if you can’t see your AI workflows, you can’t scale them.