Z.ai (rebranded from Zhipu AI in 2025) publishes the GLM model family: GLM-5, GLM-4.7, GLM-4.5, plus vision variants and OCR. Voight ships pricing entries for every model in the official Z.ai catalogue, so events from GLM calls land in your dashboard with the same cost, token, and latency capture as any other provider. The Z.ai API at https://api.z.ai/api/paas/v4 is officially OpenAI-compatible, so the same Voight SDK family that wraps OpenAI and Anthropic also covers GLM, with no separate package needed.

Three setup paths

Pick the one that matches how your app calls GLM:
| Your stack | Path |
| --- | --- |
| Express / vanilla Node calling Z.ai via the OpenAI SDK | Direct wrapper |
| Next.js / any app using the Vercel AI SDK + zhipu-ai-provider | Vercel AI SDK exporter |
| Autonomous bot / library code emitting events directly | Library mode |

Direct wrapper (@voightxyz/openai)

Because Z.ai is OpenAI-compatible, @voightxyz/openai needs no GLM-specific code. Point the OpenAI client at Z.ai’s base URL and wrap it as usual:
import OpenAI from 'openai'
import { wrapOpenAI } from '@voightxyz/openai'

const glm = wrapOpenAI(
  new OpenAI({
    baseURL: 'https://api.z.ai/api/paas/v4',
    apiKey: process.env.ZHIPU_API_KEY,
  }),
  {
    voightApiKey: process.env.VOIGHT_KEY,
    agent: 'production-chat-api',
    privacy: 'standard',
  },
)

const result = await glm.chat.completions.create({
  model: 'glm-4.5',
  messages: [{ role: 'user', content: 'Hello' }],
})
Every call captures prompts, tokens, response text, tool calls, finish reason, and latency. The cost is computed from the Voight backend’s MODEL_PRICING table using the GLM rates listed below.

Vercel AI SDK (@voightxyz/vercel-ai)

If your app uses the Vercel AI SDK with the community zhipu-ai-provider, @voightxyz/vercel-ai captures every streamText / generateText / streamObject / generateObject call automatically via OpenTelemetry — same as for OpenAI or Anthropic targets.
import { registerOTel } from '@vercel/otel'
import { VoightExporter } from '@voightxyz/vercel-ai'

export function register() {
  registerOTel({
    serviceName: 'my-app',
    traceExporter: new VoightExporter({
      agent: 'production-chat-api',
      privacy: 'standard',
    }),
  })
}

// Route handler, in a separate file from the registration above
import { streamText } from 'ai'
import { zhipu } from 'zhipu-ai-provider'

export async function POST(req: Request) {
  const result = streamText({
    model: zhipu('glm-4.7'),
    prompt: (await req.json()).prompt,
    experimental_telemetry: {
      isEnabled: true,
      metadata: { userId: 'user_123', plan: 'pro' },
    },
  })
  return result.toAIStreamResponse()
}
Voight lifts metadata.userId onto metadata.tags.userId, which powers the dashboard’s per-user spend sub-tab.

Library mode (@voightxyz/sdk)

For autonomous agents or library callers that don’t use either of the above SDKs, the library-mode client is provider-agnostic — call GLM however you like and emit a Voight event manually:
import { Voight } from '@voightxyz/sdk'

const voight = new Voight({
  apiKey: process.env.VOIGHT_KEY,
  agentId: 'my-glm-bot',
})

const t0 = Date.now()
const res = await fetch('https://api.z.ai/api/paas/v4/chat/completions', {
  method: 'POST',
  headers: {
    'authorization': `Bearer ${process.env.ZHIPU_API_KEY}`,
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    model: 'glm-4.5',
    messages: [{ role: 'user', content: 'Hello' }],
  }),
})
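// Fail fast on HTTP errors so a failed call is never logged as a success
if (!res.ok) throw new Error(`Z.ai request failed with status ${res.status}`)
const data = await res.json()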

voight.log({
  type: 'reasoning',
  model: 'glm-4.5',
  durationMs: Date.now() - t0,
  outcome: 'success',
  metadata: {
    source: 'custom',
    provider: 'zhipu',
    tokens: {
      input: data.usage.prompt_tokens,
      output: data.usage.completion_tokens,
    },
    responseText: data.choices[0].message.content,
  },
})
The cost matcher resolves glm-4.5 to its entry in the pricing table below; the provider tag zhipu keeps the event grouped with other GLM traffic in the dashboard.
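
If several call sites emit the same event, the payload can be assembled in one place. The following is a minimal sketch, not part of the Voight SDK: `ChatCompletionLike`, `GlmEvent`, and `buildGlmEvent` are hypothetical names, and the field layout simply mirrors the `voight.log` call shown above. Falling back to 0 tokens and an empty string when the response omits those fields is an assumption made for illustration.

```typescript
// Fields read from an OpenAI-compatible chat completion response.
interface ChatCompletionLike {
  usage?: { prompt_tokens?: number; completion_tokens?: number }
  choices?: Array<{ message?: { content?: string | null } }>
}

// Hypothetical event shape, mirroring the voight.log payload in this section.
interface GlmEvent {
  type: 'reasoning'
  model: string
  durationMs: number
  outcome: 'success'
  metadata: {
    source: 'custom'
    provider: 'zhipu'
    tokens: { input: number; output: number }
    responseText: string
  }
}

// Build the event from a parsed response body.
// Missing usage or content falls back to 0 / '' (an illustrative choice).
function buildGlmEvent(
  model: string,
  data: ChatCompletionLike,
  durationMs: number,
): GlmEvent {
  return {
    type: 'reasoning',
    model,
    durationMs,
    outcome: 'success',
    metadata: {
      source: 'custom',
      provider: 'zhipu',
      tokens: {
        input: data.usage?.prompt_tokens ?? 0,
        output: data.usage?.completion_tokens ?? 0,
      },
      responseText: data.choices?.[0]?.message?.content ?? '',
    },
  }
}
```

With this helper, the logging step above would reduce to `voight.log(buildGlmEvent('glm-4.5', data, Date.now() - t0))`.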

Pricing coverage

Verified against the official Z.ai pricing page (USD per 1M tokens).

Text models

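| Model | Input | Output |
| --- | --- | --- |
| GLM-5.1 | $1.40 | $4.40 |
| GLM-5 | $1.00 | $3.20 |
| GLM-5-Turbo | $1.20 | $4.00 |
| GLM-4.7 | $0.60 | $2.20 |
| GLM-4.7-FlashX | $0.07 | $0.40 |
| GLM-4.7-Flash | Free | Free |
| GLM-4.6 | $0.60 | $2.20 |
| GLM-4.5 | $0.60 | $2.20 |
| GLM-4.5-X | $2.20 | $8.90 |
| GLM-4.5-Air | $0.20 | $1.10 |
| GLM-4.5-AirX | $1.10 | $4.50 |
| GLM-4.5-Flash | Free | Free |
| GLM-4-32B-0414-128K | $0.10 | $0.10 |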
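
To make the per-1M-token rates concrete, here is a small worked example using the GLM-4.5 rates from the table. It assumes cost is simply tokens times rate, summed over input and output; the Voight backend's actual calculation is not shown in this page, and `Rate` and `estimateCostUsd` are hypothetical names.

```typescript
// Rates in USD per 1M tokens.
interface Rate {
  inputPerMTok: number
  outputPerMTok: number
}

// GLM-4.5 rates as listed in the text models table.
const GLM_4_5: Rate = { inputPerMTok: 0.6, outputPerMTok: 2.2 }

// Estimate the USD cost of one call, assuming linear per-token pricing.
function estimateCostUsd(
  rate: Rate,
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens * rate.inputPerMTok + outputTokens * rate.outputPerMTok) /
    1_000_000
  )
}

// 1,200 input tokens and 350 output tokens on GLM-4.5:
// (1200 * 0.60 + 350 * 2.20) / 1,000,000 = (720 + 770) / 1,000,000, about 0.00149 USD
const cost = estimateCostUsd(GLM_4_5, 1200, 350)
console.log(cost.toFixed(5))
```

At these rates a call of that size costs roughly 0.15 cents, so about 670 such calls cost one dollar.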

Vision models

| Model | Input | Output |
| --- | --- | --- |
| GLM-5V-Turbo | $1.20 | $4.00 |
| GLM-4.5V | $0.60 | $1.80 |
| GLM-4.6V | $0.30 | $0.90 |
| GLM-4.6V-FlashX | $0.04 | $0.40 |
| GLM-4.6V-Flash | Free | Free |
| GLM-OCR | $0.03 | $0.03 |

Longest-prefix lookup handles overlapping family names — e.g. an event tagged model: 'glm-4.5-airx-20251130' matches glm-4.5-airx (not glm-4.5-air or glm-4.5) and bills at the correct rate.
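
The longest-prefix rule can be sketched in a few lines. This is an illustration of the technique only, not the Voight backend's code: `PRICING_KEYS` is a hypothetical subset of the table, `resolvePricingKey` is a made-up name, and lowercasing the model name before matching is an assumption.

```typescript
// Hypothetical subset of pricing keys with overlapping prefixes.
const PRICING_KEYS: readonly string[] = [
  'glm-4.5',
  'glm-4.5-air',
  'glm-4.5-airx',
  'glm-4.5-x',
]

// Return the longest key that is a prefix of the model name,
// or undefined when no key matches.
function resolvePricingKey(
  model: string,
  keys: readonly string[],
): string | undefined {
  const normalized = model.toLowerCase()
  let best: string | undefined
  for (const key of keys) {
    const isLonger = best === undefined || key.length > best.length
    if (normalized.startsWith(key) && isLonger) {
      best = key
    }
  }
  return best
}

// 'glm-4.5', 'glm-4.5-air', and 'glm-4.5-airx' are all prefixes here;
// the longest one is selected.
console.log(resolvePricingKey('glm-4.5-airx-20251130', PRICING_KEYS)) // glm-4.5-airx
```

Picking the longest match, rather than the first match, is what keeps the result independent of the order in which keys are stored.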

Resources