
# OpenAI SDK

> Wrap your OpenAI client and capture every model call without changing app code.

`@voightxyz/openai` instruments the official OpenAI Node SDK. Wrap your client once and every `chat.completions.create` or `responses.create` call — non-streaming or streaming — lands in Voight with prompts, tokens, cache reads, tool calls, latency, and errors.

This is the same SDK-instrumentation model that Sentry, Vercel AI, and LangChain use. The package shares its backend and dashboard with [`@voightxyz/anthropic`](/ai-apps/anthropic) and [library mode](/sdk/library-mode), so events from all three land side by side under the same agent.

## Quick setup (recommended)

From the root of your app:

```bash theme={null}
npx -y @voightxyz/sdk init
```

The [wizard](/ai-apps/wizard) detects `openai` in your `package.json`, prompts for your Voight key and privacy level, validates the key, and writes a ready-to-import `src/lib/voight.ts` with the wrapped client. The whole flow takes about 30 seconds and needs no copy-paste.

Continue below if you'd rather wire it manually.

## Install

```bash theme={null}
npm install openai @voightxyz/openai
```

Requirements:

* Node.js 18+ (uses global `fetch`)
* `openai` SDK 4.0.0+

## Quick start

```ts theme={null}
import OpenAI from 'openai'
import { wrapOpenAI } from '@voightxyz/openai'

const client = wrapOpenAI(new OpenAI(), {
  voightApiKey: process.env.VOIGHT_KEY,
  agent: 'my-prod-agent',
})

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello' }],
})
```

That's it. Every call is captured automatically. Visit your [dashboard](https://voight.xyz/dashboard/ai-apps) to see them in the **AI Apps** section.

## Tracing & per-user tags

For production apps, wrap each request boundary with `withTrace` to group every LLM call inside one request into one trace, and to attribute cost per end-user with one line of code:

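```ts theme={null}
import express from 'express'
import OpenAI from 'openai'
import { wrapOpenAI, withTrace, log } from '@voightxyz/openai'

// Assumes an Express app; an auth middleware (not shown) must populate req.user.
const app = express()
app.use(express.json())

const openai = wrapOpenAI(new OpenAI(), { agent: 'production-chat-api' })

app.post('/api/chat', async (req, res) => {
  await withTrace(
    async () => {
      log('chat request received')

      // Captured automatically and grouped into this request's trace.
      const reply = await openai.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: req.body.prompt }],
      })

      res.json({ reply })
    },
    {
      routeTag: 'POST /api/chat',
      // Tags drive per-user attribution in the dashboard.
      tags: { userId: req.user.id, plan: req.user.plan },
    },
  )
})
```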

Every wrapped LLM call inside the `withTrace` block gets stamped with `metadata.tags = { userId, plan, ... }` automatically. The dashboard's [AI Apps section](/ai-apps/overview) then surfaces:

* **Users sub-tab** — per-user spend, traces, tokens (driven by `tags.userId`)
* **Trace cards** — every `withTrace` block as one drillable card with cost, latency, and event timeline
* **User filter pill** — narrow Overview / Models / Tools to one user

Full surface in [Tracing](/ai-apps/tracing). The per-user pattern in depth — with examples for Auth0, Clerk, NextAuth, custom JWT, and anonymous flows — lives at [per-user spend](/concepts/per-user-spend).
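
To make the "stamped automatically" behaviour concrete, here is a minimal sketch of how request-scoped tags can be propagated in Node with `AsyncLocalStorage`. It is an illustration under assumptions, not the package's actual implementation: `runInTrace`, `stampEvent`, and the counter-based trace ID are hypothetical names invented for this example.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical context shape, modelled on the metadata fields named above.
interface TraceContext {
  traceId: string
  routeTag?: string
  tags: Record<string, string>
}

interface StampedEvent {
  model: string
  metadata: { traceId?: string; routeTag?: string; tags?: Record<string, string> }
}

const traceStore = new AsyncLocalStorage<TraceContext>()
let nextTraceNumber = 0

// Runs `fn` with a trace context visible to everything it calls,
// including code that runs later in the same async chain.
function runInTrace<T>(
  fn: () => T,
  opts: { routeTag?: string; tags?: Record<string, string> },
): T {
  const ctx: TraceContext = {
    traceId: `trace-${++nextTraceNumber}`, // placeholder ID scheme
    routeTag: opts.routeTag,
    tags: opts.tags ?? {},
  }
  return traceStore.run(ctx, fn)
}

// What a wrapper could do at capture time: read the ambient context, if any.
function stampEvent(event: { model: string }): StampedEvent {
  const ctx = traceStore.getStore()
  if (!ctx) return { ...event, metadata: {} }
  return {
    ...event,
    metadata: { traceId: ctx.traceId, routeTag: ctx.routeTag, tags: ctx.tags },
  }
}
```

Because the context lives in async-local storage, code inside the block never has to pass tags around explicitly, which is why a single `withTrace` call at the request boundary is enough.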

## Options

| Option         | Type                                | Default                                             | Purpose                                                                                                                                      |
| -------------- | ----------------------------------- | --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `voightApiKey` | string                              | env `VOIGHT_KEY`                                    | Your Voight key from the dashboard                                                                                                           |
| `agent`        | string                              | env `VOIGHT_AGENT` → `HOSTNAME` → `'unknown-agent'` | Stable identifier surfaced in the dashboard                                                                                                  |
| `apiBase`      | string                              | `https://api.voight.xyz`                            | Override for self-hosted deployments                                                                                                         |
| `privacy`      | `'minimal' \| 'standard' \| 'full'` | `'standard'`                                        | Capture aggressiveness                                                                                                                       |
| `sessionId`    | string                              | auto UUID v4                                        | Trace grouping. Stable across calls of one wrapper instance — events sharing a `sessionId` render as a single trace in the dashboard         |
| `enabled`      | boolean                             | `true`                                              | Kill switch — when false, returns the original client untouched (zero overhead)                                                              |
| `otel`         | boolean                             | `false`                                             | Emit captured calls as OpenTelemetry spans alongside the direct ingest. See [OpenTelemetry side-channel](#opentelemetry-side-channel) below. |

A missing or empty API key is non-fatal: the wrapper prints a one-line warning and returns the original client. Production keeps running.
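
The fallback rules in the table can be read as a small resolution function. The sketch below is only an illustration of the documented defaults: `resolveConfig` and its return shape are hypothetical, and the environment is passed in as a plain object so the logic is easy to test.

```typescript
// Hypothetical helper; only the fallback order comes from the Options table.
type Env = Record<string, string | undefined>

interface ResolvedConfig {
  apiKey: string | undefined
  agent: string
  active: boolean // false means "hand back the original client untouched"
}

function resolveConfig(
  opts: { voightApiKey?: string; agent?: string; enabled?: boolean },
  env: Env,
): ResolvedConfig {
  // An explicit option wins, then the VOIGHT_KEY environment variable.
  const apiKey = opts.voightApiKey || env.VOIGHT_KEY || undefined
  // Agent fallback chain: option -> VOIGHT_AGENT -> HOSTNAME -> 'unknown-agent'.
  const agent = opts.agent || env.VOIGHT_AGENT || env.HOSTNAME || 'unknown-agent'
  // A missing or empty key, or enabled: false, leaves the client unwrapped.
  const active = opts.enabled !== false && Boolean(apiKey)
  return { apiKey, agent, active }
}
```

For example, `resolveConfig({}, { HOSTNAME: 'web-1' })` yields the agent `'web-1'` with `active: false`, matching the non-fatal behaviour described above when no key is present.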

## What's captured

| Signal                                                              | Field on the event                                                  |
| ------------------------------------------------------------------- | ------------------------------------------------------------------- |
| Model id (with version suffix)                                      | `model`                                                             |
| API surface used (chat completions vs responses)                    | `metadata.api` (`'responses'` on responses events; omitted on chat) |
| Prompt messages (chat)                                              | `input.messages`                                                    |
| Input payload (responses)                                           | `input.input`                                                       |
| Response text                                                       | `metadata.responseText`                                             |
| Token counts (input / output / total)                               | `metadata.tokens`                                                   |
| Cache reads (`prompt_tokens_details.cached_tokens`)                 | `metadata.tokens.cache_read`                                        |
| Reasoning tokens (Responses API, o1 / o3 / future reasoning models) | `metadata.tokens.reasoning`                                         |
| Tool / function calls (full array)                                  | `metadata.toolCalls`                                                |
| First tool's name (audit-log compat)                                | `toolExecuted`                                                      |
| Streaming flag                                                      | `metadata.streaming`                                                |
| Finish reason / response status                                     | `metadata.finishReason`                                             |
| Trace grouping                                                      | `metadata.sessionId`                                                |
| Trace ID (when inside `withTrace`)                                  | `metadata.traceId`                                                  |
| Route tag (when inside `withTrace`)                                 | `metadata.routeTag`                                                 |
| User / plan / org tags (when inside `withTrace`)                    | `metadata.tags`                                                     |
| Capture level used                                                  | `metadata.privacyLevel`                                             |
| Latency in milliseconds                                             | `durationMs`                                                        |
| Errors (re-thrown to the caller, recorded with `outcome: 'failed'`) | `errorMessage`, `outcome`                                           |
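
If you consume these events downstream, it can help to pin the table down as a type. The interface below is a reading aid reconstructed from the field paths above, not a published type from the package: which fields are optional, and the token sub-field names `input`, `output`, and `total`, are assumptions.

```typescript
// Reconstructed from the table; not an official type export.
interface CapturedToolCall {
  id: string
  name: string
  arguments: string // raw JSON string, unparsed
}

interface CapturedEvent {
  model: string
  input?: { messages?: unknown[]; input?: unknown }
  toolExecuted?: string
  durationMs: number
  outcome?: string
  errorMessage?: string
  metadata: {
    api?: 'responses' // omitted on chat completions events
    responseText?: string
    // `input`, `output`, `total` are assumed names for the three counts.
    tokens?: { input: number; output: number; total: number; cache_read?: number; reasoning?: number }
    toolCalls?: CapturedToolCall[]
    streaming?: boolean
    finishReason?: string
    sessionId?: string
    traceId?: string
    routeTag?: string
    tags?: Record<string, string>
    privacyLevel?: 'minimal' | 'standard' | 'full'
  }
}

// Example consumer: a one-line summary for logs.
function summarize(e: CapturedEvent): string {
  const api = e.metadata.api ?? 'chat'
  const status = e.outcome === 'failed' ? `failed: ${e.errorMessage ?? 'unknown error'}` : 'ok'
  return `${e.model} [${api}] ${e.durationMs}ms ${status}`
}
```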

## Supported endpoints

The wrapper intercepts two paths:

* `client.chat.completions.create` — Chat Completions (non-streaming + streaming, tool calling)
* `client.responses.create` — Responses API (non-streaming + streaming, function calling, reasoning models)

Everything else on the OpenAI client passes through untouched. Embeddings, images, audio, and the Azure OpenAI client are on the 0.2.0 roadmap.

## Chat Completions

```ts theme={null}
const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hi' }],
})
```

Streaming works without setup. The wrapper auto-injects `stream_options.include_usage: true` so the final chunk carries the token count, and a per-index aggregator reassembles tool-call argument fragments.

```ts theme={null}
const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  stream: true,
  messages: [{ role: 'user', content: 'count to five' }],
})

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '')
}
```

An explicit `stream_options: { include_usage: false }` from the caller is preserved — you opt out of token capture for streaming events but everything else is still captured.
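
The injection rule described above (add `include_usage: true` unless the caller already chose a value) can be sketched as a pure function. `withUsageOption` is a hypothetical name used for illustration; the wrapper's internals may differ.

```typescript
interface StreamParams {
  stream?: boolean
  stream_options?: { include_usage?: boolean }
  [key: string]: unknown
}

function withUsageOption(params: StreamParams): StreamParams {
  // Non-streaming calls already return usage on the response body.
  if (!params.stream) return params
  // Respect an explicit choice from the caller, including `false`.
  if (params.stream_options?.include_usage !== undefined) return params
  // Otherwise ask the API to attach usage to the final chunk.
  return {
    ...params,
    stream_options: { ...params.stream_options, include_usage: true },
  }
}
```

Returning a new object rather than mutating `params` keeps the caller's request object unchanged, which matters if the same params are reused across calls.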

## Responses API

The Responses API, which OpenAI introduced in March 2025, is its newer surface for building apps: typed output items, stateful conversations via `previous_response_id`, built-in tools, and explicit `reasoning_tokens` for o1 / o3 models.

```ts theme={null}
const response = await client.responses.create({
  model: 'gpt-4o-mini',
  input: 'Reply with: pong',
})
console.log(response.output_text)
```

Events from this surface carry `metadata.api: 'responses'` so dashboards can distinguish call sites from Chat Completions. Streaming is a typed event sequence (`response.created`, `response.output_text.delta`, `response.output_item.added`, `response.function_call_arguments.delta`, `response.completed`); the wrapper's state machine reacts to the critical events and passes the rest through unchanged.

```ts theme={null}
const stream = await client.responses.create({
  model: 'gpt-4o-mini',
  input: 'Count to five',
  stream: true,
})
for await (const event of stream) {
  if (event.type === 'response.output_text.delta') {
    process.stdout.write(event.delta)
  }
}
```
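
A state machine over the typed event sequence can be quite small. The sketch below is an assumption about the general shape, not the wrapper's code: `ResponsesAccumulator` is a hypothetical class, and the `StreamEvent` type keeps only the fields this example reads (real SDK events carry more).

```typescript
// Trimmed-down event shape for illustration.
interface StreamEvent {
  type: string
  delta?: string
  output_index?: number
}

class ResponsesAccumulator {
  text = ''
  toolArgs = new Map<number, string>() // argument fragments keyed by output index
  done = false

  push(event: StreamEvent): void {
    switch (event.type) {
      case 'response.output_text.delta':
        this.text += event.delta ?? ''
        break
      case 'response.function_call_arguments.delta': {
        const index = event.output_index ?? 0
        this.toolArgs.set(index, (this.toolArgs.get(index) ?? '') + (event.delta ?? ''))
        break
      }
      case 'response.completed':
        this.done = true
        break
      default:
        // Every other event type passes through without changing state.
        break
    }
  }
}
```

The accumulator only observes events; in a wrapper, each event would still be yielded to the caller unchanged so the `for await` loop above behaves exactly as it does without instrumentation.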

For reasoning models, `metadata.tokens.reasoning` separates the "thinking" overhead from the visible answer so cost analysis stays accurate.

## Tool / function calling

Capture works the same for both endpoints. The example below uses Chat Completions:

```ts theme={null}
const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: "what's the weather in Tokyo?" }],
  tools: [{
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get weather for a city',
      parameters: {
        type: 'object',
        properties: { location: { type: 'string' } },
        required: ['location'],
      },
    },
  }],
})
```

On the captured event:

* `toolExecuted: 'get_weather'` — first tool's name. Renders in the audit-log DETAIL column with the same shape as Bash/Edit hook events.
* `metadata.toolCalls: [{ id, name, arguments }]` — full array. `arguments` is the raw JSON string the model produced (we don't parse it — invalid JSON is a real failure mode you need to debug).
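
Because `arguments` arrives as a raw string, any consumer has to decide what to do when it is not valid JSON. One defensive pattern is a result type that keeps the raw string on failure; `parseToolArguments` below is a hypothetical helper for your own code, not part of the package.

```typescript
type ParsedArgs =
  | { ok: true; value: unknown }
  | { ok: false; raw: string; error: string }

function parseToolArguments(raw: string): ParsedArgs {
  try {
    return { ok: true, value: JSON.parse(raw) }
  } catch (err) {
    // Keep the raw string so the malformed output can be inspected later.
    return { ok: false, raw, error: err instanceof Error ? err.message : String(err) }
  }
}
```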

Streaming function calls work the same — fragment deltas across chunks (Chat Completions) or `function_call_arguments.delta` events (Responses) are concatenated per tool index.
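
For Chat Completions, per-index concatenation looks roughly like the sketch below. The `ToolCallDelta` shape mirrors the fragments found in `chunk.choices[0].delta.tool_calls`; `assembleToolCalls` is a hypothetical name, and the wrapper's own aggregator may be organised differently.

```typescript
// Subset of the streamed tool-call fragment shape.
interface ToolCallDelta {
  index: number
  id?: string
  function?: { name?: string; arguments?: string }
}

interface AssembledToolCall {
  id: string
  name: string
  arguments: string
}

function assembleToolCalls(deltas: ToolCallDelta[]): AssembledToolCall[] {
  const byIndex = new Map<number, AssembledToolCall>()
  for (const d of deltas) {
    const call = byIndex.get(d.index) ?? { id: '', name: '', arguments: '' }
    if (d.id) call.id = d.id // id normally arrives on the first fragment
    if (d.function?.name) call.name = d.function.name
    if (d.function?.arguments) call.arguments += d.function.arguments
    byIndex.set(d.index, call)
  }
  // Return calls in index order regardless of arrival order.
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([, call]) => call)
}
```

Keying on `index` rather than arrival order is what lets fragments for parallel tool calls interleave safely.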

## OpenTelemetry side-channel

By default the wrapper POSTs each captured call directly to `api.voight.xyz`. Set `otel: true` to **additionally** emit each call as an OpenTelemetry span — useful when the host process already runs an OTel pipeline (Langfuse, Phoenix, Datadog, Sentry, or [`@voightxyz/vercel-ai`](/ai-apps/vercel-ai)) and you want Voight events to appear there too.

```ts theme={null}
const client = wrapOpenAI(new OpenAI(), { agent: 'my-app', otel: true })
```

Each span is named `voight.openai.chat` (or `voight.openai.responses`) and carries the standard `gen_ai.*` semantic-convention attributes (`gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.usage.cache_read_input_tokens`, `gen_ai.response.finish_reasons`) plus the parallel Vercel-style `ai.*` namespace. The direct ingest path is unchanged — `otel: true` is purely additive.

### Dedup marker

Every emitted span carries `voight.source: 'wrapper'`. If you also use [`@voightxyz/vercel-ai`](/ai-apps/vercel-ai) ≥ 0.1.1 in the same process, that exporter skips wrapper-emitted spans automatically — no duplicate events in your dashboard.
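
If you run a custom exporter of your own and want the same behaviour, the check reduces to a filter on that attribute. This is a sketch under assumptions: `SpanLike` is a simplified stand-in for a real OpenTelemetry span, and `dropWrapperSpans` is a hypothetical helper, not the `@voightxyz/vercel-ai` implementation.

```typescript
// Simplified stand-in for an OpenTelemetry span.
interface SpanLike {
  name: string
  attributes: Record<string, unknown>
}

// Keep only spans that did not originate from the wrapper.
function dropWrapperSpans(spans: SpanLike[]): SpanLike[] {
  return spans.filter((span) => span.attributes['voight.source'] !== 'wrapper')
}
```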

### Optional peer dependency

`@opentelemetry/api` is an optional peer dependency. If you never set `otel: true`, nothing changes. If you set `otel: true` but the package isn't installed, the wrapper logs a single warning and falls back to direct ingest only.

## Privacy

Three levels apply to prompts, response text, and tool-call arguments. The function name in `toolExecuted` is treated as a tag (not user content) and survives all levels.

| Level      | Prompts  | Response text | Tool arguments | `toolExecuted` (name) |
| ---------- | -------- | ------------- | -------------- | --------------------- |
| `minimal`  | dropped  | dropped       | dropped        | kept                  |
| `standard` | scrubbed | scrubbed      | scrubbed       | kept                  |
| `full`     | verbatim | verbatim      | verbatim       | kept                  |

Standard scrubs 12 patterns: PEM private keys, JWTs, Anthropic / OpenAI / Stripe live / GitHub / AWS / Slack / Voight API keys, emails, E.164 phones, and Luhn-validated credit cards. Token counts, model ids, and timing are never scrubbed: they are numeric values or tags and carry no PII risk.

See [PII patterns](/privacy/pii-patterns) for the full catalogue.
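
"Luhn-validated" means a digit run is only treated as a card number if it passes the Luhn checksum, which cuts false positives on order numbers and similar IDs. The sketch below shows that technique in isolation; the regex, the `[REDACTED_CARD]` placeholder, and the function names are assumptions for illustration, not the package's actual patterns.

```typescript
// Standard Luhn checksum over a string of ASCII digits.
function luhnValid(digits: string): boolean {
  let sum = 0
  let doubleNext = false
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48
    if (d < 0 || d > 9) return false
    if (doubleNext) {
      d *= 2
      if (d > 9) d -= 9
    }
    sum += d
    doubleNext = !doubleNext
  }
  return digits.length > 0 && sum % 10 === 0
}

// Replace 13 to 19 digit runs (optionally space or dash separated)
// only when they pass the checksum.
function scrubCards(text: string): string {
  return text.replace(/\b(?:\d[ -]?){12,18}\d\b/g, (match) => {
    const digits = match.replace(/[ -]/g, '')
    return luhnValid(digits) ? '[REDACTED_CARD]' : match
  })
}
```

With this sketch, the well-known test number `4242 4242 4242 4242` is redacted, while a 16-digit run that fails the checksum is left untouched.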

## How it compares

| Use case                                                              | Reach for                                    |
| --------------------------------------------------------------------- | -------------------------------------------- |
| Coding agent (Claude Code, Cursor, Codex) capturing your dev sessions | [Hooks-based SDK](/quickstart)               |
| Autonomous TS/JS bot you wrote yourself emitting custom events        | [Library mode](/sdk/library-mode)            |
| Production app calling OpenAI in user-facing flows                    | This package                                 |
| Production app calling Anthropic                                      | [`@voightxyz/anthropic`](/ai-apps/anthropic) |
| Per-user / per-tenant cost attribution in any of the above            | [Per-user spend](/concepts/per-user-spend)   |
| Anything else (Python, Go, Rust)                                      | [HTTP API](/sdk/http-api)                    |

The packages coexist — wrap your OpenAI client AND call `voight.log()` for your own domain events under the same agent. Adding [`withTrace`](/ai-apps/tracing) on top groups them all per-request.

## Source

* [github.com/Voightxyz/voight-openai](https://github.com/Voightxyz/voight-openai)
* [npmjs.com/package/@voightxyz/openai](https://www.npmjs.com/package/@voightxyz/openai)

## Roadmap

* Embeddings (`embeddings.create`)
* Image generation (`images.generate`)
* Audio (Whisper / TTS)
* Azure OpenAI client

See the [changelog](/changelog) for shipped releases.