
# PII patterns

> The 12 patterns and Luhn-validated card detection used by Standard mode.

When **Standard** mode is active, every string field in every event is run through [`scrubPii()`](https://github.com/Voightxyz/voight-sdk/blob/main/src/privacy.ts) before the SDK ships it. Patterns run in a deliberate order (most specific first, multi-line first). Matched substrings are replaced with a tagged token.
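The ordered, replace-with-token approach described above can be pictured with a small sketch. The `Rule` type, the `scrubSketch` name, and the three regexes are assumptions made for illustration; they are not the expressions in `privacy.ts`.

```ts
// Illustrative only: these rules are assumptions, not the regexes in privacy.ts.
type Rule = { pattern: RegExp; token: string }

const rules: Rule[] = [
  // Multi-line pattern first, so a whole PEM block is replaced in one piece.
  {
    pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
    token: '[REDACTED-PRIVATE-KEY]',
  },
  // More specific prefix before the generic one.
  { pattern: /\bsk-ant-[A-Za-z0-9_-]{8,}/g, token: '[REDACTED-API-KEY]' },
  { pattern: /\bsk-[A-Za-z0-9_-]{8,}/g, token: '[REDACTED-API-KEY]' },
]

function scrubSketch(text: string): string {
  // Apply each rule in array order to the output of the previous one.
  return rules.reduce((acc, rule) => acc.replace(rule.pattern, rule.token), text)
}
```

Because each rule sees the output of the previous one, a rule placed later never gets a chance to partially match text that an earlier rule has already replaced with a token.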

## Patterns

| Pattern                                          | Replacement              |
| ------------------------------------------------ | ------------------------ |
| PEM private key block                            | `[REDACTED-PRIVATE-KEY]` |
| JWT                                              | `[REDACTED-JWT]`         |
| Anthropic key (`sk-ant-...`)                     | `[REDACTED-API-KEY]`     |
| OpenAI key (`sk-...` / `sk-proj-...`)            | `[REDACTED-API-KEY]`     |
| Stripe live keys (`sk_live_...` / `pk_live_...`) | `[REDACTED-API-KEY]`     |
| GitHub fine-grained PAT (`github_pat_...`)       | `[REDACTED-API-KEY]`     |
| GitHub classic PAT (`ghp_...`)                   | `[REDACTED-API-KEY]`     |
| AWS access key (`AKIA...`)                       | `[REDACTED-API-KEY]`     |
| Slack token (`xoxb-` / `xoxp-` / `xoxa-`)        | `[REDACTED-API-KEY]`     |
| Voight key (`vk_...`)                            | `[REDACTED-API-KEY]`     |
| Email (`name@host.tld`)                          | `[REDACTED-EMAIL]`       |
| Phone, E.164 (`+1234567890`)                     | `[REDACTED-PHONE]`       |
| Credit card (13–19 digits + Luhn)                | `[REDACTED-CARD]`        |

Regex source: [`voight-sdk/src/privacy.ts`](https://github.com/Voightxyz/voight-sdk/blob/main/src/privacy.ts).
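For readers unfamiliar with the Luhn check named in the card row, here is a minimal sketch of the standard mod-10 algorithm. The `passesLuhn` name and the exact length guard are assumptions for illustration; the SDK's own implementation may be structured differently.

```ts
// Sketch of the standard Luhn (mod-10) checksum. `passesLuhn` is a
// hypothetical helper, not an SDK export.
function passesLuhn(digits: string): boolean {
  // Mirror the table above: only 13 to 19 digit strings are candidates.
  if (!/^\d{13,19}$/.test(digits)) return false

  let sum = 0
  let shouldDouble = false
  // Walk from the rightmost digit, doubling every second digit.
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48
    if (shouldDouble) {
      d *= 2
      if (d > 9) d -= 9 // same as summing the two digits of the product
    }
    sum += d
    shouldDouble = !shouldDouble
  }
  return sum % 10 === 0
}

passesLuhn('4242424242424242') // true: well-known test card number
passesLuhn('1234567890123456') // false: checksum does not validate
```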

## Examples

```ts
import { scrubPii } from '@voightxyz/sdk'

scrubPii(`curl -H "Authorization: Bearer sk-ant-api03-AbCd...X" https://api.example.com`)
// → curl -H "Authorization: Bearer [REDACTED-API-KEY]" https://api.example.com

scrubPii(`paid with 4242424242424242 — confirmation @ alice@example.com`)
// → paid with [REDACTED-CARD] — confirmation @ [REDACTED-EMAIL]

scrubPii(`auth=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NSJ9.signature`)
// → auth=[REDACTED-JWT]

scrubPii(`call +14155552671 to confirm`)
// → call [REDACTED-PHONE] to confirm
```

## Deliberate non-matches

To avoid false positives that would disrupt development workflows, the following inputs are deliberately left untouched:

| Input                        | Treated as     | Reason                                     |
| ---------------------------- | -------------- | ------------------------------------------ |
| `email_template`             | not an email   | no `@` separating a local part from a host |
| `support@app`                | not an email   | no TLD                                     |
| `sk_test_xxx`                | not a live key | Stripe test key, no real money involved    |
| `AKIAfoo`                    | not an AWS key | real keys are uppercase                    |
| 16-digit order numbers       | not a card     | left alone when the Luhn checksum fails    |
| 10-digit numbers without `+` | not a phone    | E.164 requires `+` prefix                  |

The test suite at [`tests/unit/privacy.test.ts`](https://github.com/Voightxyz/voight-sdk/blob/main/tests/unit/privacy.test.ts) verifies positive matches and adversarial negatives.
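Several of the non-matches above follow directly from how strict a regex is. The three expressions below are assumptions written for illustration, not the ones shipped in `privacy.ts`, but they show how requiring a TLD, a `+` prefix, or a fixed uppercase suffix rules out the inputs in the table.

```ts
// Illustrative regexes only; the real ones live in privacy.ts and may differ.
const emailSketch = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}\b/
const phoneSketch = /\+[1-9]\d{7,14}\b/ // E.164-style: "+" then up to 15 digits
const awsKeySketch = /\bAKIA[0-9A-Z]{16}\b/ // "AKIA" plus 16 uppercase alphanumerics

// Each pair is [input, whether the sketch regex matches it].
const checks: Array<[string, boolean]> = [
  ['email_template', emailSketch.test('email_template')], // no "@"
  ['support@app', emailSketch.test('support@app')], // no TLD
  ['alice@example.com', emailSketch.test('alice@example.com')],
  ['4155552671', phoneSketch.test('4155552671')], // no "+" prefix
  ['+14155552671', phoneSketch.test('+14155552671')],
  ['AKIAfoo', awsKeySketch.test('AKIAfoo')], // lowercase and too short
]

for (const [input, matched] of checks) {
  console.log(`${input}: ${matched ? 'would be redacted' : 'left as-is'}`)
}
```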

## Properties

* **Idempotent** — re-running `scrubPii()` on already-scrubbed text is stable.
* **Local** — runs in the SDK subprocess on your machine; no network or filesystem I/O.
* **Bounded** — designed for a 2KB event payload in under 10ms on a modern laptop. Regexes are anchored with word boundaries to avoid catastrophic backtracking.
* **Conservative** — the pattern set is small (\~13 regexes). Industry libraries like [gitleaks](https://github.com/gitleaks/gitleaks), [detect-secrets](https://github.com/Yelp/detect-secrets), and [trufflehog](https://github.com/trufflesecurity/trufflehog) ship 400+; we don't.
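The idempotence property is easy to check against your own sample strings. The helper below is a hypothetical sketch, not part of the SDK: it accepts any scrubbing function and returns the samples for which a second pass changes the output. The `standIn` scrubber exists only to make the example self-contained.

```ts
// Hypothetical helper: reports inputs where scrubbing twice differs from scrubbing once.
type Scrubber = (text: string) => string

function findUnstableInputs(scrub: Scrubber, samples: string[]): string[] {
  return samples.filter((sample) => {
    const once = scrub(sample)
    return scrub(once) !== once
  })
}

// Stand-in scrubber for illustration only (assumed regex, not the SDK's).
const standIn: Scrubber = (text) =>
  text.replace(/\+[1-9]\d{7,14}\b/g, '[REDACTED-PHONE]')

const unstable = findUnstableInputs(standIn, ['call +14155552671', 'no pii here'])
console.log(unstable.length) // 0 when every sample is stable under re-scrubbing
```

In a real project you would pass `scrubPii` from `@voightxyz/sdk` in place of `standIn`.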

## Stricter than Standard

If Standard's scrubbing isn't strict enough for your case, **Minimal** drops content fields entirely: `reasoning`, `errorMessage`, `input`, `metadata.detail`, `metadata.response_preview`, `metadata.responseText`, `metadata.cwd`, `metadata.git`. What's left is tool names, timings, token counts, and identifiers.
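As a rough picture of what dropping those fields means, the sketch below removes the listed keys from a plain object. The `EventSketch` shape, the `dropContentFields` name, and the sample field names `tool`, `durationMs`, and `tokens` are assumptions for illustration; the SDK's real event type and internals are not shown here.

```ts
// Hypothetical event shape; the SDK's actual type may differ.
type EventSketch = { [key: string]: unknown; metadata?: Record<string, unknown> }

// Field names taken from the list above.
const TOP_LEVEL_CONTENT = ['reasoning', 'errorMessage', 'input']
const METADATA_CONTENT = ['detail', 'response_preview', 'responseText', 'cwd', 'git']

function dropContentFields(event: EventSketch): EventSketch {
  // Copy first so the caller's object is not mutated.
  const out: EventSketch = { ...event }
  for (const key of TOP_LEVEL_CONTENT) delete out[key]
  if (event.metadata) {
    const meta: Record<string, unknown> = { ...event.metadata }
    for (const key of METADATA_CONTENT) delete meta[key]
    out.metadata = meta
  }
  return out
}

const minimal = dropContentFields({
  tool: 'bash',
  durationMs: 12,
  input: 'cat ~/.ssh/id_rsa',
  metadata: { cwd: '/home/alice', tokens: 5 },
})
console.log(JSON.stringify(minimal))
// {"tool":"bash","durationMs":12,"metadata":{"tokens":5}}
```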

See the capture-level table in the [privacy overview](/privacy/overview).

## Requesting new patterns

Open an issue at [voightxyz/voight-sdk](https://github.com/Voightxyz/voight-sdk) if a credential format you care about isn't covered (crypto wallet private keys, mnemonic phrases, API tokens for AI providers we haven't added, and so on). We err on the conservative side: a false positive is worse than a near-miss.
