PII Shield

Anonymize documents before Claude, or any other LLM, sees them. Restore real data after analysis.

Built for Claude Desktop. No coding required.

surfaces

MCP plugin for Claude Desktop. CLI for any other LLM. Same engine, same session, same disk. Anonymize on one surface, deanonymize on the other — sessions are portable.

v2.1.0 · Live · MCP server · CLI · Node.js 22+ · Windows · macOS · Linux · MIT
$ npm install -g pii-shield
01 how it flows

Your data stays on your computer. Claude only sees placeholders.

Pipeline: local → API → local

The document is anonymized locally; only placeholder text reaches the LLM (Claude via the MCP plugin, or any other model via the CLI: ChatGPT, Gemini, local Llama, an internal gateway). When the model is done, the PII is stitched back in locally, so the real values never leave your disk.

Document (pdf · docx · txt) · has PII
→ PII Shield: anonymize · local
→ LLM API (claude · gpt · gemini · llama): analyze placeholders · remote
→ PII Shield: restore · local
→ Result: analysis with real names · has PII
PII never leaves your machine. Anonymization and restoration both happen locally.
Original on disk
John Smith signed the NDA.
Acme Corp. v. Widget Inc.
Contact: john@acme.com
Amount: €120,000 via IBAN CY17…
UK NIN: AB123456C · DOB 1984-06-12
What the LLM sees over the API
<PERSON_1> signed the NDA.
<ORG_1> v. <ORG_2>.
Contact: <EMAIL_1>
Amount: <MONEY_1> via IBAN <IBAN_1>
UK NIN: <UK_NIN_1> · DOB <DATE_1>
MCP call sequence

Document → Shield: anonymize_file({ file_path, session_id? }) → { session_id, entity_count, output_path, … }

Shield → Claude API: Claude reads output_path from the connected folder. Only placeholders cross the wire.

Claude API → Shield: model output flows back as text — still placeholder-only.

Shield → Result: deanonymize_docx({ file_path, session_id }) or deanonymize_text({ text, session_id }) stitches the real PII back in, locally.
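The restore step at the end of this sequence can be pictured as a plain lookup-and-replace over the model's output. The following is a minimal sketch, not PII Shield's actual code: the `SessionMapping` type, the `restorePlaceholders` function, and the placeholder regex are assumptions made for illustration, and the real on-disk session format is not described on this page.

```typescript
// Hypothetical in-memory session mapping: placeholder -> original value.
// The real session format on disk is not documented here.
type SessionMapping = Map<string, string>;

// Matches tokens shaped like <PERSON_1>, <UK_NIN_1>, or <ORG_1a>.
const PLACEHOLDER = /<[A-Z][A-Z_]*_\d+[a-z]?>/g;

// Swaps every known placeholder back to its original value.
// Placeholders missing from the mapping are left untouched, not guessed at.
function restorePlaceholders(text: string, mapping: SessionMapping): string {
  return text.replace(PLACEHOLDER, (token) => mapping.get(token) ?? token);
}

const mapping: SessionMapping = new Map([
  ["<PERSON_1>", "John Smith"],
  ["<ORG_1>", "Acme Corp."],
  ["<EMAIL_1>", "john@acme.com"],
]);

const modelOutput =
  "<PERSON_1> of <ORG_1> can be reached at <EMAIL_1>; <ORG_9> is unknown.";
const restored = restorePlaceholders(modelOutput, mapping);
console.log(restored);
// John Smith of Acme Corp. can be reached at john@acme.com; <ORG_9> is unknown.
```

Leaving unknown tokens in place is a deliberate choice in this sketch: a placeholder the model invented stays visible in the output and is not silently dropped.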

02 why this exists

The problem with "just use a model"

Pasting client material into a consumer LLM is both a privacy and a privilege problem.

A federal court has already confirmed the privilege side directly: in United States v. Heppner (SDNY, February 2026), documents a defendant created with consumer Claude were ruled outside attorney-client privilege. An AI tool is not an attorney and owes no duty of confidentiality. The same principle generalizes to ChatGPT, Gemini, and every other consumer LLM.

PII Shield runs the redaction step before any model sees anything. It has two surfaces: an MCP plugin for Claude Desktop (drop a file, Claude receives placeholders like <PERSON_1> and <ORG_2>, and the real values are stitched back into the output on your machine) and a CLI for any other LLM (anonymize → paste into ChatGPT, Gemini, a local Llama, or an internal gateway → deanonymize on return). Both use the same engine and the same on-disk session format. The original text never leaves your disk.

v1 → v2 · the rewrite

v1 was Python + Presidio + spaCy — every install fought Claude Desktop's bundled runtime. v2.1.0 is a complete Node.js rewrite over onnxruntime-node + @xenova/transformers: same detection coverage, no Python.

Plugin: 700 KB on Windows/Linux (host Node), or 83 MB on macOS (bundled Node 24.15.0). The 634 MB gliner-pii-base-v1.0 model installs on-demand via an in-chat panel — never bundled in the artefact. CI runs the matrix on Ubuntu/Windows/macOS × Node 22/24.

Both v1.0.0 and v2.1.0 release tags live in gregmos/PII-Shield — the v1 Python build remains downloadable for legacy installs.

03 skill modes

Six legal-document workflows out of the box

The companion pii-contract-analyze skill ships six task-shaped modes. Each runs the same anonymize → analyze → deanonymize pipeline, just optimized for a different deliverable.

MEMO
Risk analysis & legal memorandum.
"Review this NDA for red-flag clauses."
REDLINE
Tracked-changes contract markup with safer language.
"Redline clauses 4.2 and 7.1."
SUMMARY
Brief plain-English overview, key terms surfaced.
"Summarize this DPA in 200 words."
COMPARISON
Diff two documents — obligations and clauses, not whitespace.
"Compare v1 and v2 of this MSA."
BULK
Up to 5 files in one shared session — placeholders consistent across docs.
"Anonymize all 5 NDAs as one batch."
ANONYMIZE
Output a redacted file, no analysis. For external sharing.
"Just anonymize, don't analyze."
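BULK mode's promise that placeholders stay consistent across documents amounts to keeping one value-to-placeholder table for the whole session. Below is a rough sketch under stated assumptions: the `SharedSession` class and its methods are hypothetical names, and entity detection is passed in as a ready-made list where the real engine would run NER.

```typescript
// Hypothetical detected entity; in the real tool this would come from NER.
interface Entity {
  type: string;
  value: string;
}

// One session shared by every file, so equal values get equal placeholders.
class SharedSession {
  private readonly byValue = new Map<string, string>();
  private readonly counters = new Map<string, number>();

  // Returns the existing placeholder for a value, or mints the next one for its type.
  placeholderFor(type: string, value: string): string {
    const key = `${type}\u0000${value}`;
    const existing = this.byValue.get(key);
    if (existing !== undefined) return existing;
    const next = (this.counters.get(type) ?? 0) + 1;
    this.counters.set(type, next);
    const placeholder = `<${type}_${next}>`;
    this.byValue.set(key, placeholder);
    return placeholder;
  }

  // Replaces each listed entity in the text with its session-wide placeholder.
  anonymize(text: string, entities: Entity[]): string {
    // Assign placeholders in detection order so numbering is predictable.
    const pairs = entities.map((e) => ({
      value: e.value,
      placeholder: this.placeholderFor(e.type, e.value),
    }));
    // Replace longer values first so a short value cannot split a longer one.
    pairs.sort((a, b) => b.value.length - a.value.length);
    let out = text;
    for (const { value, placeholder } of pairs) {
      out = out.split(value).join(placeholder);
    }
    return out;
  }
}

const session = new SharedSession();
const outA = session.anonymize("John Smith signed the NDA for Acme Corp.", [
  { type: "PERSON", value: "John Smith" },
  { type: "ORG", value: "Acme Corp." },
]);
const outB = session.anonymize("Acme Corp. will pay Jane Doe; John Smith countersigns.", [
  { type: "ORG", value: "Acme Corp." },
  { type: "PERSON", value: "Jane Doe" },
  { type: "PERSON", value: "John Smith" },
]);
console.log(outA); // <PERSON_1> signed the NDA for <ORG_1>
console.log(outB); // <ORG_1> will pay <PERSON_2>; <PERSON_1> countersigns.
```

Because the second file reuses the same session, "John Smith" and "Acme Corp." keep their numbers, and only the new name gets a fresh one.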
04 quick start

Four steps. No terminal.

Download two files, drop them into Claude Desktop, connect a folder. The 634 MB GLiNER model installs itself in-chat on the first anonymization — no PowerShell, no bash, no scripts.

Three commands. Same engine.

For terminal users, scripting, CI gates, and round-trips through any LLM. Sessions, mappings, and the model live in ~/.pii_shield/ — shared with the MCP plugin.


Not sure? → MCP, CLI, or both?

2
Install the MCP extension
In Claude Desktop: Settings → Extensions → Advanced Settings → Install extension and select your .mcpb. On the first call, PII Shield runs npm ci --ignore-scripts to install pinned runtime deps (onnxruntime-node, @xenova/transformers, gliner) into ~/.pii_shield/deps/. This takes 2–3 minutes once per machine; after that, startup is instant.
3
Upload the skill
In Claude Desktop: Customize → Skills → + → Upload a skill and select pii-contract-analyze.skill. The skill orchestrates the full anonymize → review → analyze → deanonymize flow without you spelling out each step.
4
Use it from a chat
Start a new conversation, pick the pii-contract-analyze skill, connect a folder with your documents, then tell Claude what you need.
you > analyze risks for the purchaser in contract.pdf and prepare a short memo
[Skill: pii-contract-analyze · Folder: ~/Documents/contracts]
claude >
// calls anonymize_file → sees <PERSON_1>, <ORG_1>…
// HITL review: you confirm/edit detected entities
// runs MEMO mode, drafts the memo
// calls deanonymize_docx on the output
here's contract-risks.memo.docx
First-run model install — handled in-chat
The first time you ask Claude to anonymize, PII Shield notices the GLiNER model isn't on disk and opens an in-chat install panel with two buttons: Download model (~634 MB, fetched by your browser, so there are no SmartScreen or Gatekeeper issues with unsigned scripts) and Install downloaded ZIP (PII Shield finds the ZIP in Downloads, Desktop, or Documents, validates it, and extracts it atomically into ~/.pii_shield/models/). Subsequent runs skip the panel entirely.
Connect a folder. Don't drag-attach.
Use Claude Desktop → Settings → Connected folders to grant access. If you drag-attach a file directly into the chat, the raw document hits the API before PII Shield can intercept — and that defeats the entire design. Connected folders are read by the local plugin first.
1
Install the npm package
Node.js 22 or newer required (node -v).
npm install -g pii-shield
pii-shield --version
2
Download the GLiNER model
One-off, ~634 MB. Survives npm uninstall; lives at ~/.pii_shield/models/.
pii-shield install-model # add --yes for non-interactive (CI)
3
Health-check, then run
First anonymize or scan takes ~1–2 min while the engine deps install (deterministic, cached). Subsequent runs are instant.
pii-shield doctor # green checks across the board
pii-shield anonymize contract.pdf --no-review
# → contract_anonymized.txt with <PERSON_1>, <ORG_1>, …
# → Session: 2026-04-29_120000_ab12
First-run engine deps
On the first anonymize or scan call, the engine installs onnxruntime-node, @xenova/transformers, and gliner at pinned versions into ~/.pii_shield/deps/installs/<hash>/. About 300 MB, 1–2 min, deterministic. Everything (model, deps, mappings, audit logs) survives npm uninstall -g pii-shield.

Full reference for every flag, env var, and exit code: → CLI command reference

Build & verify locally (from source)
npm ci --ignore-scripts --legacy-peer-deps
npm run build # MCP server (esbuild bundle)
npm run build:cli # CLI binary
npm run smoke # node scripts/smoke-protocol.mjs
npm test # 8 suites, incl. multi-doc HITL & session archival
CI runs the same matrix on Ubuntu / Windows / macOS × Node 22 / 24.

→ Full install reference (configuration, paths, advanced flags)

05 mcp, cli, or both?

One engine. Two front-ends.

Same 33 entity types, same session format on disk, same ~/.pii_shield/. Pick the front-end that matches how you work — sessions exported from one open in the other.

Use MCP
You work in Claude Desktop and want a natural-language flow. The skill picks the right tool calls; you stay in the chat.
"Drop a contract. Ask Claude to find the indemnity clauses. Restore on save."
Use CLI
Scripting, CI/CD, non-Claude LLMs (ChatGPT / Gemini / local models), batch jobs across many files, headless servers.
pii-shield verify ./out as a compliance gate; round-trip via terminal.
Use both
Mixed teams; portable sessions across machines. Anonymize on a partner's laptop, review on yours — the same encrypted archive.
CLI sessions export → transfer → MCP import_session.

→ Full reference for both surfaces

06 features

What's in the box

01
Zero PII in API
anonymize_file reads the document on your machine and returns only a file path + session id. Claude reads the anonymized file from disk — PII never enters an API request.
02
GLiNER zero-shot NER
gliner-pii-base-v1.0 over onnxruntime-node. Handles ALL-CAPS, domain-specific names, multilingual text. No Python, no PyTorch.
03
Human-in-the-loop review
MCP Apps iframe UI rendered directly in Claude Desktop. Remove false positives, add missed entities — no localhost browser detour.
04
Entity deduplication
"Acme" → <ORG_1>, "Acme Corp." → <ORG_1a>, "Acme Corporation" → <ORG_1b>. One canonical form; every variant maps back correctly on restore.
05
Cross-session deanonymize
Each anonymized .docx carries its session_id in Word custom properties. Weeks later, in a brand-new chat, drop the file in — PII is restored from the embedded id.
06
Multi-file sessions
Anonymize N related documents under one session_id; identical entities share placeholders across files. One deanonymize call restores PII everywhere.
07
Encrypted team handoff
export_session(passphrase) packs the mapping + anonymized documents into an encrypted .pii-session archive (AES-GCM, with the key derived from the passphrase via scrypt). Colleague runs import_session. PII never transits in plaintext.
08
Audit logging
Every tool call + response logged locally to ~/.pii_shield/audit/. NER bootstrap, session lifecycle, dropped stderr — all on disk, appendable, off-network.
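The "AES-GCM via scrypt" scheme in the encrypted team handoff feature can be sketched with Node's built-in crypto module. This is an illustration under assumptions, not the real .pii-session layout: the `SealedArchive` shape, the function names, the 256-bit key size, and the 16-byte salt are all choices made for this example.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical container; the real .pii-session archive layout is not documented here.
interface SealedArchive {
  salt: Buffer; // scrypt salt, stored next to the ciphertext
  iv: Buffer; // 96-bit GCM nonce, unique per archive
  tag: Buffer; // GCM authentication tag
  ciphertext: Buffer;
}

// Derives a 256-bit key from the passphrase with scrypt, then encrypts with AES-256-GCM.
function sealSession(plaintext: Buffer, passphrase: string): SealedArchive {
  const salt = randomBytes(16);
  const key = scryptSync(passphrase, salt, 32);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { salt, iv, tag: cipher.getAuthTag(), ciphertext };
}

// Re-derives the key and decrypts; throws if the passphrase or the data is wrong.
function openSession(archive: SealedArchive, passphrase: string): Buffer {
  const key = scryptSync(passphrase, archive.salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, archive.iv);
  decipher.setAuthTag(archive.tag);
  return Buffer.concat([decipher.update(archive.ciphertext), decipher.final()]);
}

const mappingJson = Buffer.from(JSON.stringify({ "<PERSON_1>": "John Smith" }), "utf8");
const archive = sealSession(mappingJson, "correct horse battery staple");
const roundTrip = openSession(archive, "correct horse battery staple").toString("utf8");
console.log(roundTrip === mappingJson.toString("utf8")); // true
```

GCM authenticates as well as encrypts, so a wrong passphrase or a tampered archive fails at `decipher.final()` and never yields garbage plaintext.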
07 full documentation

Every flag. Every command. Every env var.

Configuration, full command reference for both surfaces, workflows, HITL walkthrough, Python integration, troubleshooting — on a separate docs page so this one stays readable.