§ 01 / How it flows

Your data stays on your computer. Claude only sees placeholders.

Pipeline: local → API → local

The document is anonymized locally; only placeholder text reaches Claude. When the model is done, PII is stitched back in — without ever leaving your disk.

Document (pdf · docx · txt) · Has PII
→ PII Shield: anonymize · Local
→ Claude API: analyze placeholders · Remote
→ PII Shield: restore · Local
→ Result: analysis with real names · Has PII

PII never leaves your machine. Anonymization and restoration both happen locally.

Original on disk

John Smith signed the NDA.

Acme Corp. v. Widget Inc.

Contact: john@acme.com

Amount: €120,000 via IBAN CY17…

UK NIN: AB123456C · DOB 1984-06-12

What Claude sees over API

<PERSON_1> signed the NDA.

<ORG_1> v. <ORG_2>.

Contact: <EMAIL_1>

Amount: <MONEY_1> via IBAN <IBAN_1>

UK NIN: <UK_NIN_1> · DOB <DATE_1>
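
The before/after pair above can be reproduced with a small substitution routine. This is a minimal sketch, not PII Shield's actual code: the `DetectedEntity` and `Session` shapes, and the functions `newSession` and `anonymize`, are hypothetical names chosen for illustration, and detection itself is assumed to have already produced character offsets.

```typescript
// Hypothetical shapes: the real detector output and session format are not shown in the source.
type EntityType = "PERSON" | "ORG" | "EMAIL" | "MONEY" | "IBAN" | "UK_NIN" | "DATE";

interface DetectedEntity {
  type: EntityType;
  start: number; // inclusive offset into the text
  end: number;   // exclusive offset into the text
}

interface Session {
  counters: Map<EntityType, number>;  // next index per entity type
  byValue: Map<string, string>;       // "TYPE\u0000value" -> placeholder
  byPlaceholder: Map<string, string>; // placeholder -> original value
}

function newSession(): Session {
  return { counters: new Map(), byValue: new Map(), byPlaceholder: new Map() };
}

function anonymize(text: string, entities: DetectedEntity[], session: Session): string {
  // Walk left to right so placeholder numbering follows reading order.
  const sorted = entities.slice().sort((a, b) => a.start - b.start);
  let out = "";
  let cursor = 0;
  for (const e of sorted) {
    if (e.start < cursor) continue; // skip spans that overlap one already replaced
    const value = text.slice(e.start, e.end);
    const key = `${e.type}\u0000${value}`;
    let placeholder = session.byValue.get(key);
    if (placeholder === undefined) {
      const n = (session.counters.get(e.type) ?? 0) + 1;
      session.counters.set(e.type, n);
      placeholder = `<${e.type}_${n}>`;
      session.byValue.set(key, placeholder);
      session.byPlaceholder.set(placeholder, value);
    }
    out += text.slice(cursor, e.start) + placeholder;
    cursor = e.end;
  }
  return out + text.slice(cursor);
}

// Usage: the same value always maps to the same placeholder within one session.
const session = newSession();
const doc = "John Smith signed the NDA. Contact: john@acme.com";
const emailAt = doc.indexOf("john@acme.com");
const anonymized = anonymize(
  doc,
  [
    { type: "PERSON", start: 0, end: 10 },
    { type: "EMAIL", start: emailAt, end: emailAt + "john@acme.com".length },
  ],
  session,
);
console.log(anonymized); // <PERSON_1> signed the NDA. Contact: <EMAIL_1>
```

Because the mapping lives in the session object, only the returned string needs to go anywhere near the API; the originals stay in `byPlaceholder` on the local side.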

MCP call sequence

Document → Shield: anonymize_file({ file_path, session_id? }) → { session_id, entity_count, output_path, … }

Shield → Claude API: Claude reads output_path from the connected folder. Only placeholders cross the wire.

Claude API → Shield: model output flows back as text — still placeholder-only.

Shield → Result: deanonymize_docx({ file_path, session_id }) or deanonymize_text({ text, session_id }) stitches the real PII back in, locally.
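
The final restore step is, conceptually, a lookup-and-replace over placeholder tokens. The sketch below assumes the session mapping is available as a plain `Map` from placeholder to original value; the `deanonymize` function and the token pattern are illustrative guesses based on the placeholder shapes shown on this page, not the tool's real implementation.

```typescript
// Hypothetical helper: restores originals for every known placeholder in one pass.
function deanonymize(text: string, mapping: ReadonlyMap<string, string>): string {
  // Matches tokens such as <PERSON_1>, <UK_NIN_1>, or a variant like <ORG_1a>.
  const token = /<[A-Z][A-Z_]*_\d+[a-z]?>/g;
  // Unknown tokens are left untouched rather than dropped.
  return text.replace(token, (t) => mapping.get(t) ?? t);
}

// Usage with a stand-in mapping.
const mapping = new Map<string, string>([
  ["<PERSON_1>", "John Smith"],
  ["<ORG_1>", "Acme Corp."],
]);
const restored = deanonymize("<PERSON_1> signed the NDA for <ORG_1>", mapping);
console.log(restored); // John Smith signed the NDA for Acme Corp.
```

Doing the replacement in a single pass means a restored value that happens to look like a placeholder is never substituted a second time.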

§ 02 / Why this exists

The problem with "just use a model"

Pasting client material into a consumer LLM is both a privacy and a privilege problem.

A federal court has already confirmed the privilege side directly: in United States v. Heppner (SDNY, February 2026), documents a defendant created with consumer Claude were ruled outside attorney-client privilege. An AI tool is not an attorney and owes no duty of confidentiality.

PII Shield runs the redaction step before the model sees anything. It's an MCP server for Claude Desktop: point Claude at a file in a connected folder, Claude receives placeholders like <PERSON_1> and <ORG_2>, and the real values are stitched back into the output on your machine. The original text never leaves your disk.

v1 → v2 · the rewrite

v1 was Python + Presidio + spaCy — every install fought Claude Desktop's bundled runtime. v2.0.2 is a complete Node.js rewrite over onnxruntime-node + @xenova/transformers: same detection coverage, no Python.

Plugin: 700 KB on Windows/Linux (host Node), or 83 MB on macOS (bundled Node 24.15.0). The 634 MB gliner-pii-base-v1.0 model installs on-demand via an in-chat panel — never bundled in the artefact. CI runs the matrix on Ubuntu/Windows/macOS × Node 18/20.

v1 is still public at gregmos/PII-Shield for the legacy crowd.

§ 03 / Skill modes

Six legal-document workflows out of the box

The companion pii-contract-analyze skill ships six task-shaped modes. Each runs the same anonymize → analyze → deanonymize pipeline, just optimized for a different deliverable.

MEMO

Risk analysis & legal memorandum.

"Review this NDA for red-flag clauses."

REDLINE

Tracked-changes contract markup with safer language.

"Redline clauses 4.2 and 7.1."

SUMMARY

Brief plain-English overview, key terms surfaced.

"Summarize this DPA in 200 words."

COMPARISON

Diff two documents — obligations and clauses, not whitespace.

"Compare v1 and v2 of this MSA."

BULK

Up to 5 files in one shared session — placeholders consistent across docs.

"Anonymize all 5 NDAs as one batch."

ANONYMIZE

Output a redacted file, no analysis. For external sharing.

"Just anonymize, don't analyze."

§ 04 / What it detects

National IDs, tax numbers, passports — and the obvious stuff too

17 country-specific patterns on top of generic entity detection. The fields a "find all emails" regex would never catch.

Jurisdictions: EU + UK patterns

UK (5): NIN · NHS · Passport · CRN · Driving licence
DE (2): Tax ID · Social security
FR (2): NIR · CNI
IT (2): Fiscal code · VAT
ES (2): DNI · NIE
CY (2): TIC · National ID
EU (2): VAT · Passport

Generic

person
org
location
email
phone
IBAN
credit card
crypto wallet
US SSN
US EIN
medical licence
money
date

Detection · NER

GLiNER zero-shot
person
org
location

Detection · pattern

all 17 jurisdiction patterns
email · phone
IBAN · credit card · crypto
SSN · EIN · licences
money · date
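
Pattern-based detection generally pairs a regular expression with a structural check, so that strings that merely look like an identifier are rejected. The sketch below illustrates that idea for two of the generic types. The regexes, the `PatternHit` shape, and the use of the standard IBAN mod-97 checksum as a filter are assumptions for illustration; the source does not say which validations PII Shield applies.

```typescript
interface PatternHit {
  type: "EMAIL" | "IBAN";
  start: number;
  end: number;
  value: string;
}

// Standard IBAN check: move the first four characters to the end,
// map letters to numbers (A=10 ... Z=35), and require the result mod 97 to be 1.
function isValidIban(candidate: string): boolean {
  const iban = candidate.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const code = rearranged.charCodeAt(i);
    const digits = code >= 65 ? String(code - 55) : rearranged[i];
    for (let j = 0; j < digits.length; j++) {
      remainder = (remainder * 10 + Number(digits[j])) % 97;
    }
  }
  return remainder === 1;
}

function detectPatterns(text: string): PatternHit[] {
  const hits: PatternHit[] = [];
  // Illustrative regexes only; production patterns would be stricter.
  const emailRe = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  const ibanRe = /\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b/g;
  let m: RegExpExecArray | null;
  while ((m = emailRe.exec(text)) !== null) {
    hits.push({ type: "EMAIL", start: m.index, end: m.index + m[0].length, value: m[0] });
  }
  while ((m = ibanRe.exec(text)) !== null) {
    // Keep only candidates whose checksum holds.
    if (isValidIban(m[0])) {
      hits.push({ type: "IBAN", start: m.index, end: m.index + m[0].length, value: m[0] });
    }
  }
  return hits.sort((a, b) => a.start - b.start);
}

// Usage: the second IBAN-shaped string has a wrong check digit and is ignored.
const sample = "Contact: john@acme.com, pay via GB82 WEST 1234 5698 7654 32 or GB83 WEST 1234 5698 7654 32.";
const found = detectPatterns(sample);
console.log(found.map((h) => `${h.type}:${h.value}`));
```

The offsets returned here are the kind of spans a substitution step would consume; NER-detected entities (person, org, location) would come from the model instead of from patterns.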

§ 05 / Quick start

Four steps. No terminal.

Download two files, drop them into Claude Desktop, connect a folder. The 634 MB GLiNER model installs itself in-chat on the first anonymization — no PowerShell, no bash, no scripts.

1

Download the artefacts

Pick the .mcpb for your OS plus the pii-contract-analyze skill. All three files are linked from the v2.0.2 release.

WIN · LINUX: pii-shield-v2.0.2-windows-linux.mcpb (700 KB, host Node)

MACOS: pii-shield-v2.0.2-macos.mcpb (83 MB, bundled Node 24)

SKILL · ANY OS: pii-contract-analyze.skill (25 KB, contract analysis)

2

Install the MCP extension

In Claude Desktop: Settings → Extensions → Advanced Settings → Install extension and select your .mcpb. On first call, PII Shield runs npm ci --ignore-scripts to install pinned runtime deps (onnxruntime-node, @xenova/transformers, gliner) into ~/.pii_shield/deps/ — 2–3 minutes once per machine, instant thereafter.

3

Upload the skill

In Claude Desktop: Customize → Skills → + → Upload a skill and select pii-contract-analyze.skill. The skill orchestrates the full anonymize → review → analyze → deanonymize flow without you spelling out each step.

4

Use it from a chat

Start a new conversation, pick the pii-contract-analyze skill, connect a folder with your documents, then tell Claude what you need.

you    > analyze risks for the purchaser in contract.pdf
         and prepare a short memo
         [Skill: pii-contract-analyze · Folder: ~/Documents/contracts]
claude > // calls anonymize_file → sees <PERSON_1>, <ORG_1>…
         // HITL review — you confirm/edit detected entities
         // runs MEMO mode, drafts the memo
         // calls deanonymize_docx on the output
         here's contract-risks.memo.docx

First-run model install — handled in-chat

The first time you ask Claude to anonymize, PII Shield notices the GLiNER model isn't on disk and opens an in-chat install panel with two buttons: Download model (~634 MB, fetched by your browser — no SmartScreen or Gatekeeper issues with unsigned scripts) and Install downloaded ZIP (PII Shield finds it in Downloads / Desktop / Documents, validates, atomic-extracts into ~/.pii_shield/models/). Subsequent runs skip the panel entirely.
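
One common way to get the "validate, then atomic-extract" behaviour described above is to unpack into a staging directory next to the destination and rename it into place only after validation passes. The sketch below shows that pattern under stated assumptions: `installAtomically`, the `requiredFiles` check, and the `populate` callback (standing in for ZIP extraction, which would need a third-party library) are all hypothetical, and the real installer may work differently.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: either targetDir appears fully populated, or not at all.
function installAtomically(
  targetDir: string,
  requiredFiles: string[],
  populate: (stagingDir: string) => void, // stand-in for "extract the ZIP here"
): void {
  const parent = path.dirname(targetDir);
  fs.mkdirSync(parent, { recursive: true });
  // Stage beside the target so the final rename stays on one filesystem.
  const staging = fs.mkdtempSync(path.join(parent, ".staging-"));
  try {
    populate(staging);
    for (const rel of requiredFiles) {
      if (!fs.existsSync(path.join(staging, rel))) {
        throw new Error(`validation failed: missing ${rel}`);
      }
    }
    fs.renameSync(staging, targetDir); // the single step that publishes the install
  } catch (err) {
    fs.rmSync(staging, { recursive: true, force: true }); // leave no partial install behind
    throw err;
  }
}

// Usage against a throwaway directory, with a stand-in "model" file.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "pii-shield-demo-"));
const modelDir = path.join(root, "models", "demo-model");
installAtomically(modelDir, ["model.onnx"], (staging) => {
  fs.writeFileSync(path.join(staging, "model.onnx"), "stand-in bytes");
});
```

An interrupted or invalid download then never leaves a half-written model directory that later runs would mistake for a complete one.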

Connect a folder. Don't drag-attach.

Use Claude Desktop → Settings → Connected folders to grant access. If you drag-attach a file directly into the chat, the raw document hits the API before PII Shield can intercept — and that defeats the entire design. Connected folders are read by the local plugin first.

Build & verify locally

npm ci --ignore-scripts --legacy-peer-deps
npm run build              # esbuild bundle
npm run smoke              # node scripts/smoke-protocol.mjs
npm test                   # 8 suites, incl. multi-doc HITL & session archival

CI runs the same matrix on Ubuntu / Windows / macOS × Node 18 / 20.

§ 06 / Compliance & audit

Three logs, all on-disk

Everything PII Shield does is recorded locally. No telemetry, no remote logging endpoint. If your DPO asks "what did the tool do with that file at 14:02 UTC," the answer is on your filesystem.

mcp_audit.log

Every tool call and response. Session lifecycle (create / extend / expire). The compliance trail your auditor will read first.

~/.pii_shield/audit/mcp_audit.log

ner_init.log

NER bootstrap and model-load events. Useful for first-run debugging and confirming which checkpoint version is active.

~/.pii_shield/logs/ner_init.log

pii_shield_server.log

MCP server lifecycle plus dropped stderr from the Node runtime. The catch-all when something feels off.

~/.pii_shield/logs/pii_shield_server.log

Encrypted handoff: export_session(passphrase) → .pii-session archive (AES-GCM + scrypt-derived key). Mappings live at ~/.pii_shield/mappings/, models at ~/.pii_shield/models/, audit trail appendable at ~/.pii_shield/audit/mcp_audit.log. Session TTL 7 days, configurable. All inputs validated with Zod (TypeScript strict). Off-network by design — no telemetry endpoint to disable.
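
For readers unfamiliar with the AES-GCM + scrypt combination, here is a minimal sketch using Node's built-in `crypto` module. The byte layout (salt, IV, tag, ciphertext), the 256-bit key size, and the function names `encryptArchive` / `decryptArchive` are assumptions made for this example; the actual .pii-session format is not documented here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Assumed layout: [16-byte salt][12-byte IV][16-byte auth tag][ciphertext]
function encryptArchive(plaintext: Buffer, passphrase: string): Buffer {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const key = scryptSync(passphrase, salt, 32); // derive a 256-bit key from the passphrase
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([salt, iv, tag, ciphertext]);
}

function decryptArchive(archive: Buffer, passphrase: string): Buffer {
  const salt = archive.subarray(0, 16);
  const iv = archive.subarray(16, 28);
  const tag = archive.subarray(28, 44);
  const ciphertext = archive.subarray(44);
  const key = scryptSync(passphrase, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the passphrase is wrong or the archive was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Usage: round-trip a stand-in mapping.
const payload = Buffer.from(JSON.stringify({ "<PERSON_1>": "John Smith" }), "utf8");
const archive = encryptArchive(payload, "correct horse battery staple");
const roundTrip = decryptArchive(archive, "correct horse battery staple").toString("utf8");
console.log(roundTrip === payload.toString("utf8")); // true
```

GCM authenticates as well as encrypts, so a colleague importing a corrupted or modified archive gets an error instead of a silently wrong mapping.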

§ 07 / Features

What's in the box

01

Zero PII in API

anonymize_file reads the document on your machine and returns only a file path + session id. Claude reads the anonymized file from disk — PII never enters an API request.

02

GLiNER zero-shot NER

gliner-pii-base-v1.0 over onnxruntime-node. Handles ALL-CAPS, domain-specific names, multilingual text. No Python, no PyTorch.

03

Human-in-the-loop review

MCP Apps iframe UI rendered directly in Claude Desktop. Remove false positives, add missed entities — no localhost browser detour.

04

Entity deduplication

"Acme" → <ORG_1>, "Acme Corp." → <ORG_1a>, "Acme Corporation" → <ORG_1b>. One canonical form; every variant maps back correctly on restore.

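The variant scheme above can be sketched as grouping surface forms under one canonical key, then giving the first form the bare index and later forms a letter suffix. How PII Shield actually decides that two strings are the same entity is not described on this page; the suffix-stripping heuristic, `canonicalKey`, and `assignOrgPlaceholders` below are hypothetical.

```typescript
// Assumed heuristic: names that differ only by a trailing corporate suffix are the same entity.
const CORPORATE_SUFFIX = /[\s,]+(corporation|corp|incorporated|inc|limited|ltd|llc|gmbh)\.?$/i;

function canonicalKey(name: string): string {
  return name.trim().replace(CORPORATE_SUFFIX, "").toLowerCase();
}

function assignOrgPlaceholders(names: string[]): Map<string, string> {
  // canonical key -> distinct variants in first-seen order
  const groups = new Map<string, string[]>();
  for (const name of names) {
    const key = canonicalKey(name);
    const variants = groups.get(key) ?? [];
    if (variants.indexOf(name) === -1) variants.push(name);
    groups.set(key, variants);
  }
  const result = new Map<string, string>();
  let index = 0;
  groups.forEach((variants) => {
    index += 1;
    variants.forEach((variant, i) => {
      // First variant gets <ORG_n>; later ones get <ORG_na>, <ORG_nb>, ...
      // (this sketch supports at most 26 extra variants per entity).
      const suffix = i === 0 ? "" : String.fromCharCode(96 + i);
      result.set(variant, `<ORG_${index}${suffix}>`);
    });
  });
  return result;
}

// Usage with the variants from the example above, plus an unrelated organisation.
const placeholders = assignOrgPlaceholders([
  "Acme",
  "Acme Corp.",
  "Widget Inc.",
  "Acme Corporation",
  "Acme",
]);
console.log(placeholders.get("Acme Corporation")); // <ORG_1b>
```

Keeping a distinct placeholder per variant is what lets the restore step put back "Acme Corp." where the original said "Acme Corp." rather than normalising every mention.
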
05

Cross-session deanonymize

Each anonymized .docx carries its session_id in Word custom properties. Weeks later, in a brand-new chat, drop the file in — PII is restored from the embedded id.

06

Multi-file sessions

Anonymize N related documents under one session_id; identical entities share placeholders across files. One deanonymize call restores PII everywhere.

07

Encrypted team handoff

export_session(passphrase) packs the mapping + anonymized documents into an encrypted .pii-session archive (AES-GCM with a scrypt-derived key). A colleague runs import_session with the passphrase. PII never transits unencrypted.

08

Audit logging

Every tool call + response logged locally to ~/.pii_shield/audit/. NER bootstrap, session lifecycle, dropped stderr — all on disk, appendable, off-network.

§ 08 / MCP tools reference

17 tools, four groups

Grouped by purpose; each tool is a single MCP function callable from Claude Desktop.

Anonymize (4): anonymize_file (core) · anonymize_text · anonymize_docx · anonymize_next_chunk

Review (5): start_review · apply_review_overrides · get_full_anonymized_text · deanonymize_text · deanonymize_docx

Session (3): export_session · import_session · get_mapping

Utility (5): scan_text (preview) · list_entities · find_file · resolve_path · install_model_from_download

Round-trip example

// 1. anonymize a file — original stays on disk; only placeholders cross the wire
→ anonymize_file({ file_path: "~/contracts/nda.docx" })
← { status: "success",
    session_id: "m9q8x4-7b5a91",
    entity_count: 14,
    output_path: ".../nda.anon.txt",
    docx_output_path: ".../nda.anon.docx",
    by_type: { PERSON: 4, ORG: 3, MONEY: 2, ... } }

// 2. Claude reads output_path and drafts a memo using placeholders only

// 3. restore PII into the draft, locally
→ deanonymize_text({ text: draft, session_id: "m9q8x4-7b5a91" })
← { status: "success",
    deanonymized_text: "...John Smith signed the NDA..." }

All inputs validated with Zod. Errors return MCP-protocol-shaped { isError: true, content: [...] } instead of throwing.
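
The "return an error result instead of throwing" convention can be sketched as a thin wrapper around each tool handler. To stay dependency-free, this example swaps Zod for hand-written type checks; `callTool`, `errorResult`, and the handler signature are hypothetical names, while the `{ isError, content }` result shape follows the one quoted above.

```typescript
interface ToolResult {
  isError?: boolean;
  content: { type: "text"; text: string }[];
}

function errorResult(message: string): ToolResult {
  return { isError: true, content: [{ type: "text", text: message }] };
}

type DeanonymizeHandler = (args: { text: string; session_id: string }) => string;

// Validates arguments, runs the handler, and converts any failure into an error result.
function callTool(args: unknown, handler: DeanonymizeHandler): ToolResult {
  if (typeof args !== "object" || args === null) {
    return errorResult("arguments must be an object");
  }
  const { text, session_id } = args as Record<string, unknown>;
  if (typeof text !== "string") {
    return errorResult("text: expected string");
  }
  if (typeof session_id !== "string" || session_id.length === 0) {
    return errorResult("session_id: expected non-empty string");
  }
  try {
    return { content: [{ type: "text", text: handler({ text, session_id }) }] };
  } catch (err) {
    return errorResult(err instanceof Error ? err.message : String(err));
  }
}

// Usage: a bad argument yields a structured error rather than an exception.
const bad = callTool({ text: 42, session_id: "m9q8x4-7b5a91" }, (a) => a.text);
console.log(bad.isError, bad.content[0].text); // true text: expected string
```

With this shape the model sees a readable failure message in the tool result and can correct the call, instead of the whole request failing at the protocol level.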

§ 09 / Stack

Built with

Node.js 18+ · MCP (Model Context Protocol) · GLiNER · onnxruntime-node · @xenova/transformers · DOCX (pure JS) · pdf-parse · AES-GCM / scrypt · MIT license

Tested on

Ubuntu · Windows · macOS · Node 18 · Node 20 · 8 test suites · MCP protocol smoke · Zod schema validation