@voightxyz/openai or @voightxyz/anthropic.
The section is separate from the coding-agent dashboard on purpose. Your team’s Claude Code / Cursor / Codex telemetry and your production LLM app’s user-facing telemetry are different operational questions — different cadences, different SLOs, different people read them. Keeping them in their own surface lets each one stay opinionated about what it shows.
What you get out of the box
Wrap your client once, deploy, and the dashboard populates within seconds. Five sub-tabs aggregate the same event stream from different angles:
| Sub-tab | The question it answers |
|---|---|
| Overview | “How is this app doing right now?” Cost, traffic, error rate, latency, all with prior-window deltas |
| Traces | “What happened in this specific request?” Every withTrace block as a card, drillable to individual events |
| Models | “Which model is eating my budget?” Cost & token mix by model, ranked |
| Tools | “Which functions does the model call most?” Tool-use frequency, success rate, p50/p95 duration |
| Users | “Which end-user costs me money?” Per-user spend, traces, tokens (powered by per-user tags) |
The data model behind the section
Every event in AI Apps comes from one of two sources:
- Wrapped LLM calls: captured automatically by @voightxyz/openai or @voightxyz/anthropic when you wrap your client. One event per chat.completions.create, responses.create, or messages.create call. Carries model, tokens (input/output/cache reads/cache creations), tool calls, latency, finish reason, and (if you call withTrace) your tags map.
- log() events inside withTrace: emitted when you call log() from inside a withTrace block to mark a domain event ("user submitted form", "retrieval returned 0 results"). Stamped with the same tags as the surrounding LLM calls, but with type: 'log' so they are distinguishable.
Only agents whose framework is openai or anthropic are included, so your coding-agent traffic never bleeds in.
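The two event sources described above can be modelled as a discriminated union. This is a minimal sketch for reasoning about the data, not the SDK's actual types: apart from tags and type: 'log', every field name and the "llm_call" discriminator are assumptions.

```typescript
// Hypothetical event shapes. Only `tags` and `type: 'log'` come from the text;
// all other names are illustrative assumptions.
type Tags = Record<string, string>;

interface LlmCallEvent {
  type: "llm_call"; // assumed discriminator value
  provider: "openai" | "anthropic";
  model: string;
  tokens: { input: number; output: number; cacheRead: number; cacheCreation: number };
  latencyMs: number;
  finishReason: string;
  toolCalls: string[];
  tags?: Tags; // present only when the call ran inside withTrace
}

interface LogEvent {
  type: "log";
  message: string;
  tags: Tags; // same tags as the surrounding LLM calls
}

type AiAppEvent = LlmCallEvent | LogEvent;

// Split a mixed stream into wrapped LLM calls and domain log events.
function partitionEvents(events: AiAppEvent[]): { calls: LlmCallEvent[]; logs: LogEvent[] } {
  const calls: LlmCallEvent[] = [];
  const logs: LogEvent[] = [];
  for (const e of events) {
    if (e.type === "log") logs.push(e);
    else calls.push(e);
  }
  return { calls, logs };
}
```

Keeping the discriminator on type means any consumer can separate domain logs from model calls without inspecting the payload.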
Filter pills
Three filters live in the top bar, all composable:
- Agent: narrow to a single wrapped agent (e.g. production-chat-api vs internal-summarizer-bot). All agents using @voightxyz/openai or @voightxyz/anthropic appear here.
- Provider: narrow to OpenAI or Anthropic only. Useful when one wrapper agent calls both providers (as multi-provider routers and fallback chains do) and you want to isolate one.
- User: narrow to a single tags.userId value. Drives the Users sub-tab but is also available globally so you can see Models / Tools / Traces scoped to one customer.
Active filters are carried in the URL, so you can share a link such as /dashboard/ai-apps?provider=anthropic&tag.plan=enterprise and the recipient lands on the same view.
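To illustrate how such a link carries filter state, here is a small sketch that reads filters back out of the query string. The provider parameter and the tag. prefix come from the example link; the agent parameter name and the Filters shape are assumptions for illustration.

```typescript
// Hypothetical filter shape; `agent` as a query parameter name is an assumption.
interface Filters {
  agent?: string;
  provider?: string;
  tags: Record<string, string>;
}

function parseFilters(link: string): Filters {
  // A base URL is required because the link is a path, not an absolute URL.
  const url = new URL(link, "https://example.invalid");
  const filters: Filters = { tags: {} };
  url.searchParams.forEach((value, key) => {
    if (key.startsWith("tag.")) filters.tags[key.slice("tag.".length)] = value;
    else if (key === "agent") filters.agent = value;
    else if (key === "provider") filters.provider = value;
  });
  return filters;
}

const shared = parseFilters("/dashboard/ai-apps?provider=anthropic&tag.plan=enterprise");
console.log(shared.provider, shared.tags.plan);
```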
Provider filter: per-event, not per-agent
A subtle bug we hit early: a single agent that calls both OpenAI and Anthropic (failover, A/B routing) gets agent.framework pinned to whichever wrapper initialized first. If you filtered by agent.framework, choosing “OpenAI” would still show Claude models from that same agent.
The fix shipped in May 2026: provider filtering uses metadata.source (per-event, set by each wrapper) instead of agent.framework (per-agent, first-seen). Multi-provider agents now filter cleanly — choose OpenAI and you see only events where the wrapper that emitted them was @voightxyz/openai.
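The difference between the two filtering strategies can be shown with a toy dataset. This is a sketch under assumptions: the WrappedEvent shape is hypothetical, and metadata.source is simplified to a provider name, whereas the real field may hold a different value such as the package name.

```typescript
type Provider = "openai" | "anthropic";

// Hypothetical event shape for illustration only.
interface WrappedEvent {
  agent: { name: string; framework: Provider }; // per-agent, pinned at first wrapper init
  metadata: { source: Provider }; // per-event, set by the emitting wrapper (value format assumed)
  model: string;
}

// Old behaviour: filter on the agent-level field.
const byAgentFramework = (events: WrappedEvent[], p: Provider): WrappedEvent[] =>
  events.filter((e) => e.agent.framework === p);

// Current behaviour: filter on the event-level field.
const byEventSource = (events: WrappedEvent[], p: Provider): WrappedEvent[] =>
  events.filter((e) => e.metadata.source === p);

// One multi-provider agent whose framework was pinned to "openai".
const router = { name: "fallback-router", framework: "openai" as Provider };
const routedEvents: WrappedEvent[] = [
  { agent: router, metadata: { source: "openai" }, model: "gpt-4o" },
  { agent: router, metadata: { source: "anthropic" }, model: "claude-3-5-sonnet" },
];

console.log(byAgentFramework(routedEvents, "openai").map((e) => e.model)); // includes the Claude model
console.log(byEventSource(routedEvents, "openai").map((e) => e.model)); // OpenAI events only
```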
Overview sub-tab
The default landing tab. Three regions:
- KPI strip: cost USD, total calls, error rate, p50 latency, p95 latency. Each KPI carries a delta vs the prior identical window (28d vs prior 28d by default; toggleable to 7d / 1d).
- Activity pulse: stacked-area chart of traffic over time, coloured by provider. The pulse refreshes every 15s so you can watch live traffic during a deploy.
- Top models / Top tools / Top users: three compact leaderboards as entry points to the deeper sub-tabs.
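For readers who want to sanity-check the latency KPIs against their own logs, the sketch below computes p50 and p95 with the nearest-rank method. The choice of nearest-rank is an assumption; the dashboard may interpolate or use an approximate sketch structure, so small differences are expected.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it. The method is assumed, not documented.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile of empty sample set");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Ten latency samples in milliseconds (made-up data).
const latenciesMs = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000];
console.log(percentile(latenciesMs, 50)); // 500
console.log(percentile(latenciesMs, 95)); // 1000
```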
Traces sub-tab
A withTrace block becomes one trace card. The card shows:
- Route tag (POST /api/chat)
- Total duration of the block
- Cost USD
- Number of LLM calls, tool calls, errors
- User tag (if set), clickable to filter the whole section to that user
Expanding a card lists every chat.completions.create call, every tool call, and every log() event, in order, with prompts and responses revealed by an eye toggle (masked by default, same UX as the coding-agent traces).
Derived child spans: if a model response includes tool_calls, the dashboard renders each one as a synthetic child span under the parent LLM call, even though they are not separate events on the wire. This mirrors what most engineers expect from APM tools.
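One way to picture derived child spans is as a pure function from a single LLM event to a parent span plus one synthetic child per tool call. The sketch below is illustrative only: the LlmEvent and Span shapes, the synthetic flag, and the child id format are all assumptions.

```typescript
// Hypothetical shapes for illustration.
interface ToolCall { id: string; name: string }
interface LlmEvent { id: string; model: string; toolCalls?: ToolCall[] }
interface Span { id: string; parentId: string | null; label: string; synthetic: boolean }

// Expand one LLM event into a parent span and one synthetic child per tool call.
function toSpans(event: LlmEvent): Span[] {
  const parent: Span = { id: event.id, parentId: null, label: event.model, synthetic: false };
  const children = (event.toolCalls ?? []).map((call): Span => ({
    id: `${event.id}:${call.id}`, // assumed id scheme; keeps child ids unique per parent
    parentId: event.id,
    label: call.name,
    synthetic: true, // derived at render time, not a separate event on the wire
  }));
  return [parent, ...children];
}

const spans = toSpans({
  id: "evt_1",
  model: "gpt-4o",
  toolCalls: [
    { id: "call_a", name: "search_docs" },
    { id: "call_b", name: "send_email" },
  ],
});
console.log(spans.map((s) => `${s.parentId ?? "root"} -> ${s.label}`));
```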
Models sub-tab
One row per (provider, model) pair active in the time window:
- Cost USD (sum)
- Calls (count)
- Tokens (input + output + cache reads + cache creations, decomposed)
- p50 / p95 latency
- Share of overall cost as a sparkline
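The per-model rows amount to a group-by over (provider, model). The following sketch shows that aggregation for cost, call count, and cost share only; the CallEvent and ModelRow shapes are assumptions, and the real tab also decomposes tokens and latency.

```typescript
// Hypothetical input and output shapes.
interface CallEvent { provider: string; model: string; costUsd: number }
interface ModelRow { provider: string; model: string; costUsd: number; calls: number; costShare: number }

// Group events by (provider, model), then rank rows by total cost.
function aggregateByModel(events: CallEvent[]): ModelRow[] {
  const rows = new Map<string, ModelRow>();
  let totalCost = 0;
  for (const e of events) {
    const key = `${e.provider}/${e.model}`;
    const row = rows.get(key) ?? { provider: e.provider, model: e.model, costUsd: 0, calls: 0, costShare: 0 };
    row.costUsd += e.costUsd;
    row.calls += 1;
    rows.set(key, row);
    totalCost += e.costUsd;
  }
  return Array.from(rows.values())
    .map((r) => ({ ...r, costShare: totalCost > 0 ? r.costUsd / totalCost : 0 }))
    .sort((a, b) => b.costUsd - a.costUsd);
}

const modelRows = aggregateByModel([
  { provider: "openai", model: "gpt-4o", costUsd: 1 },
  { provider: "anthropic", model: "claude-3-5-sonnet", costUsd: 1 },
  { provider: "openai", model: "gpt-4o", costUsd: 2 },
]);
console.log(modelRows);
```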
Tools sub-tab
One row per unique tool.name the model invoked:
- Invocations (count)
- Success rate (outcome === 'success' / total)
- p50 / p95 duration of the tool execution itself
- Last seen timestamp
- Top users for this tool (mini-leaderboard inline)
tool_use) and any explicit log({ type: 'tool', toolExecuted }) calls inside withTrace.
Users sub-tab
Powered by per-user tags. See that page for the 1-line code pattern that makes this tab populate. Key columns:
- User (the tags.userId value, verbatim)
- Spend (USD)
- Traces (withTrace block count)
- Tokens (total)
- Last seen
- Top model
The tab shows an empty state when no events carry tags.userId yet, with a deep link to the per-user-spend setup guide.
Privacy in AI Apps
The wrapper SDKs ship with the same three privacy levels as the rest of Voight: Minimal, Standard, Full. Set on wrap:
| Level | Prompts & responses | Tool arguments | Tokens & cost | Tags |
|---|---|---|---|---|
| Minimal | dropped | dropped | kept | kept |
| Standard ★ | scrubbed via scrubPii() | scrubbed | kept | kept |
| Full | verbatim | verbatim | kept | kept |
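The table above can be read as a transformation applied to each captured payload before it leaves the process. The sketch below models that behaviour under assumptions: the Payload shape and lowercase level names are hypothetical, and scrubStandIn is a deliberately naive placeholder that masks only email-like strings, since the actual rules of scrubPii() are not described here.

```typescript
type PrivacyLevel = "minimal" | "standard" | "full"; // assumed spelling of the three levels

// Hypothetical payload shape.
interface Payload {
  prompt?: string;
  response?: string;
  toolArgs?: string;
  tokens: number;
  costUsd: number;
  tags: Record<string, string>;
}

// Stand-in for scrubPii(): masks email-like substrings only. Not the real scrubber.
const scrubStandIn = (text: string): string =>
  text.replace(/[^\s@]+@[^\s@]+\.[^\s@]+/g, "[redacted]");

function applyPrivacy(p: Payload, level: PrivacyLevel): Payload {
  // Tokens, cost, and tags are kept at every level.
  const kept = { tokens: p.tokens, costUsd: p.costUsd, tags: p.tags };
  if (level === "minimal") return kept; // content dropped
  if (level === "full") return { ...p }; // content verbatim
  const scrub = (s?: string) => (s === undefined ? undefined : scrubStandIn(s));
  return { ...kept, prompt: scrub(p.prompt), response: scrub(p.response), toolArgs: scrub(p.toolArgs) };
}

const rawPayload: Payload = {
  prompt: "mail bob@example.com now",
  response: "done",
  toolArgs: '{"to":"bob@example.com"}',
  tokens: 42,
  costUsd: 0.01,
  tags: { userId: "u1" },
};
console.log(applyPrivacy(rawPayload, "standard").prompt);
```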
Time window
The window selector in the top-right controls every panel in the section. Defaults to 28 days. Available windows: 1 day, 7 days, 14 days, 28 days, 90 days. All deltas (the green/red percentage chips on KPIs) compare to the prior identical window: 28d vs prior 28d, 7d vs prior 7d, etc.
Retention cap is governed by your pricing tier — events older than your retention window are purged.
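The "prior identical window" comparison can be sketched as two back-to-back ranges of equal length plus a percentage change. This assumes rolling windows measured back from the current instant; whether the dashboard aligns windows to calendar days, and how it displays a delta with no baseline, is not stated here.

```typescript
interface TimeWindow { start: Date; end: Date }

const DAY_MS = 24 * 60 * 60 * 1000;

// Current window ends at `now`; the prior window ends where the current one starts.
function currentAndPrior(now: Date, days: number): { current: TimeWindow; prior: TimeWindow } {
  const end = now.getTime();
  const start = end - days * DAY_MS;
  return {
    current: { start: new Date(start), end: new Date(end) },
    prior: { start: new Date(start - days * DAY_MS), end: new Date(start) },
  };
}

// Percentage change vs the prior window; null when there is no baseline to compare against.
function deltaPercent(current: number, prior: number): number | null {
  if (prior === 0) return null;
  return ((current - prior) / prior) * 100;
}

const windows = currentAndPrior(new Date("2026-05-29T00:00:00Z"), 28);
console.log(windows.current.start.toISOString(), windows.prior.start.toISOString());
console.log(deltaPercent(150, 100)); // 50
```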
How AI Apps relates to the rest of the dashboard
| Dashboard surface | What it shows | When you reach for it |
|---|---|---|
| AI Apps (this page) | Production LLM apps using the wrapper SDKs | Customer-facing copilots, agentic features, B2B AI products |
| Overview (/dashboard) | Coding-agent sessions (Claude Code, Cursor, Codex) + library-mode bots | Your team’s dev workflow telemetry |
| Audit log | Every event across every surface, filterable | “Show me what happened at 3:42pm” forensics |
| Sessions | Per-process timelines | Long-running autonomous agents, multi-hour Claude Code runs |
| Traces | Per-prompt timelines | One agent turn from prompt to final response |
| Explorer (/explore) | Public Solana agent registry | On-chain identity, reputation, x402 |
API access
Every panel is backed by a public endpoint under /v1/me/ai-apps/*.
Next
- Per-user spend: the killer feature; one line of code to populate the Users tab
- Tracing: withTrace + log API in full
- OpenAI SDK: wrap your OpenAI client
- Anthropic SDK: wrap your Anthropic client