
PII Shield

Anonymize documents before Claude sees them. Restore real data after analysis.

Built for Claude Desktop. No coding required.

v2.0.2 · Live · MCP server · Node.js 18+ · Windows · macOS · Linux · MIT
01 · how it flows

Your data stays on your computer. Claude only sees placeholders.

Pipeline: local → API → local

The document is anonymized locally; only placeholder text reaches Claude. When the model is done, PII is stitched back in — without ever leaving your disk.

Document (pdf · docx · txt; has PII) → PII Shield: anonymize (local) → Claude API: analyze placeholders (remote) → PII Shield: restore (local) → Result: analysis with real names (has PII)
PII never leaves your machine. Anonymization and restoration both happen locally.
Original on disk
John Smith signed the NDA.
Acme Corp. v. Widget Inc.
Contact: john@acme.com
Amount: €120,000 via IBAN CY17…
UK NIN: AB123456C · DOB 1984-06-12
What Claude sees over API
<PERSON_1> signed the NDA.
<ORG_1> v. <ORG_2>.
Contact: <EMAIL_1>
Amount: <MONEY_1> via IBAN <IBAN_1>
UK NIN: <UK_NIN_1> · DOB <DATE_1>
MCP call sequence

Document → Shield: anonymize_file({ file_path, session_id? }) returns { session_id, entity_count, output_path, … }

Shield → Claude API: Claude reads output_path from the connected folder. Only placeholders cross the wire.

Claude API → Shield: model output flows back as text — still placeholder-only.

Shield → Result: deanonymize_docx({ file_path, session_id }) or deanonymize_text({ text, session_id }) stitches the real PII back in, locally.
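The round trip above can be illustrated with a toy session map. This is a minimal sketch, not PII Shield's implementation: createSession, anonymize and deanonymize are hypothetical helpers, and the entity list is assumed to come from an earlier detection step.

```javascript
// Toy session: maps real values to placeholders and back (hypothetical structure).
function createSession() {
  return { forward: new Map(), reverse: new Map(), counters: {} };
}

// `entities` is an assumed, pre-detected list of { text, type } items.
function anonymize(text, entities, session) {
  // Replace longer values first so a short value never clobbers part of a longer one.
  const sorted = [...entities].sort((a, b) => b.text.length - a.text.length);
  let out = text;
  for (const { text: value, type } of sorted) {
    let placeholder = session.forward.get(value);
    if (!placeholder) {
      session.counters[type] = (session.counters[type] || 0) + 1;
      placeholder = `<${type}_${session.counters[type]}>`;
      session.forward.set(value, placeholder);
      session.reverse.set(placeholder, value);
    }
    out = out.split(value).join(placeholder);
  }
  return out;
}

// Swap every known placeholder back; unknown placeholders are left untouched.
function deanonymize(text, session) {
  return text.replace(/<[A-Z_]+_\d+>/g, (p) => session.reverse.get(p) ?? p);
}

const session = createSession();
const original = "John Smith signed the NDA. Contact: john@acme.com";
const safe = anonymize(
  original,
  [
    { text: "John Smith", type: "PERSON" },
    { text: "john@acme.com", type: "EMAIL" },
  ],
  session
);
console.log(safe);
console.log(deanonymize(safe, session));
```

Only the string held in `safe` would need to reach the model; the session map stays in local memory, which is the property the pipeline relies on.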

02 · why this exists

The problem with "just use a model"

Pasting client material into a consumer LLM is both a privacy and a privilege problem.

A federal court has already confirmed the privilege side directly: in United States v. Heppner (SDNY, February 2026), documents a defendant created with consumer Claude were ruled outside attorney-client privilege. An AI tool is not an attorney and owes no duty of confidentiality.

PII Shield runs the redaction step before the model sees anything. It's an MCP server for Claude Desktop — drop a file, Claude receives placeholders like <PERSON_1> and <ORG_2>, and the real values are stitched back into the output on your machine. The original text never leaves your disk.

v1 → v2 · the rewrite

v1 was Python + Presidio + spaCy — every install fought Claude Desktop's bundled runtime. v2.0.2 is a complete Node.js rewrite over onnxruntime-node + @xenova/transformers: same detection coverage, no Python.

Plugin: 700 KB on Windows/Linux (host Node), or 83 MB on macOS (bundled Node 24.15.0). The 634 MB gliner-pii-base-v1.0 model installs on-demand via an in-chat panel — never bundled in the artefact. CI runs the matrix on Ubuntu/Windows/macOS × Node 18/20.

v1 is still public at gregmos/PII-Shield for the legacy crowd.

03 · skill modes

Six legal-document workflows out of the box

The companion pii-contract-analyze skill ships six task-shaped modes. Each runs the same anonymize → analyze → deanonymize pipeline, just optimized for a different deliverable.

MEMO
Risk analysis & legal memorandum.
"Review this NDA for red-flag clauses."
REDLINE
Tracked-changes contract markup with safer language.
"Redline clauses 4.2 and 7.1."
SUMMARY
Brief plain-English overview, key terms surfaced.
"Summarize this DPA in 200 words."
COMPARISON
Diff two documents — obligations and clauses, not whitespace.
"Compare v1 and v2 of this MSA."
BULK
Up to 5 files in one shared session — placeholders consistent across docs.
"Anonymize all 5 NDAs as one batch."
ANONYMIZE
Output a redacted file, no analysis. For external sharing.
"Just anonymize, don't analyze."
04 · what it detects

National IDs, tax numbers, passports — and the obvious stuff too

17 country-specific patterns on top of generic entity detection. The fields a "find all emails" regex would never catch.

Jurisdictions · EU + UK patterns
  • UK (5): NIN · NHS · Passport · CRN · Driving licence
  • DE (2): Tax ID · Social security
  • FR (2): NIR · CNI
  • IT (2): Fiscal code · VAT
  • ES (2): DNI · NIE
  • CY (2): TIC · National ID
  • EU (2): VAT · Passport
Generic
  • person
  • org
  • location
  • email
  • phone
  • IBAN
  • credit card
  • crypto wallet
  • US SSN
  • US EIN
  • medical licence
  • money
  • date
Detection · NER
  • GLiNER zero-shot
  • person
  • org
  • location
Detection · pattern
  • all 17 jurisdiction patterns
  • email · phone
  • IBAN · credit card · crypto
  • SSN · EIN · licences
  • money · date
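Pattern-based detection of the kind listed above usually pairs a regular expression with a checksum, so that look-alike strings are rejected. The sketch below is illustrative only: the regexes and the findPatternEntities helper are assumptions, not PII Shield's actual pattern set. The IBAN check is the standard mod-97 test, shown on a widely used sample IBAN.

```javascript
// Illustrative patterns only; the real pattern set is not shown in this document.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const IBAN_RE = /\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b/g;

// Standard IBAN check: move the first four characters to the end,
// map letters to numbers (A=10 ... Z=35), and require remainder 1 modulo 97.
function isValidIban(candidate) {
  const compact = candidate.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(compact)) return false;
  const rearranged = compact.slice(4) + compact.slice(0, 4);
  const numeric = rearranged.replace(/[A-Z]/g, (ch) => String(ch.charCodeAt(0) - 55));
  return BigInt(numeric) % 97n === 1n;
}

// Hypothetical helper: collect regex hits, keeping IBANs only if the checksum passes.
function findPatternEntities(text) {
  const hits = [];
  for (const m of text.matchAll(EMAIL_RE)) hits.push({ type: "EMAIL", text: m[0] });
  for (const m of text.matchAll(IBAN_RE)) {
    if (isValidIban(m[0])) hits.push({ type: "IBAN", text: m[0] });
  }
  return hits;
}

console.log(findPatternEntities("Pay to GB82 WEST 1234 5698 7654 32, contact john@acme.com"));
```

The checksum step is what separates a pattern detector from a plain "find all emails" regex: a string with the right shape but wrong check digits is dropped instead of being masked.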
05 · quick start

Four steps. No terminal.

Download two files, drop them into Claude Desktop, connect a folder. The 634 MB GLiNER model installs itself in-chat on the first anonymization — no PowerShell, no bash, no scripts.

2
Install the MCP extension
In Claude Desktop: Settings → Extensions → Advanced Settings → Install extension and select your .mcpb. On first call, PII Shield runs npm ci --ignore-scripts to install pinned runtime deps (onnxruntime-node, @xenova/transformers, gliner) into ~/.pii_shield/deps/. This takes 2–3 minutes once per machine; after that, startup is instant.
3
Upload the skill
In Claude Desktop: Customize → Skills → + → Upload a skill and select pii-contract-analyze.skill. The skill orchestrates the full anonymize → review → analyze → deanonymize flow without you spelling out each step.
4
Use it from a chat
Start a new conversation, pick the pii-contract-analyze skill, connect a folder with your documents, then tell Claude what you need.
you > analyze risks for the purchaser in contract.pdf and prepare a short memo
      [Skill: pii-contract-analyze · Folder: ~/Documents/contracts]
claude > // calls anonymize_file → sees <PERSON_1>, <ORG_1>…
         // HITL review: you confirm/edit detected entities
         // runs MEMO mode, drafts the memo
         // calls deanonymize_docx on the output
         here's contract-risks.memo.docx
First-run model install — handled in-chat
The first time you ask Claude to anonymize, PII Shield notices the GLiNER model isn't on disk and opens an in-chat install panel with two buttons. Download model fetches the ~634 MB archive through your browser, so there are no SmartScreen or Gatekeeper issues with unsigned scripts. Install downloaded ZIP finds the archive in Downloads / Desktop / Documents, validates it, and atomically extracts it into ~/.pii_shield/models/. Subsequent runs skip the panel entirely.
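An atomic install of this kind is commonly done by staging files in a temporary sibling directory and renaming it into place in one step. The sketch below assumes that approach. installAtomically is a hypothetical helper, the file name is a stand-in, and nothing here is taken from PII Shield's source.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: write everything into a staging directory next to the
// target, then rename it, so readers never observe a half-written install.
function installAtomically(parentDir, finalName, files) {
  fs.mkdirSync(parentDir, { recursive: true });
  const staging = fs.mkdtempSync(path.join(parentDir, ".staging-"));
  try {
    for (const [name, contents] of Object.entries(files)) {
      fs.writeFileSync(path.join(staging, name), contents);
    }
    const finalDir = path.join(parentDir, finalName);
    // Staging and target share a parent, so this is a single rename on one filesystem.
    fs.renameSync(staging, finalDir);
    return finalDir;
  } catch (err) {
    // Leave no partial staging directory behind on failure.
    fs.rmSync(staging, { recursive: true, force: true });
    throw err;
  }
}
```

If the process dies mid-copy, only the staging directory is affected, so the next run sees either no model or a complete one.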
Connect a folder. Don't drag-attach.
Use Claude Desktop → Settings → Connected folders to grant access. If you drag-attach a file directly into the chat, the raw document hits the API before PII Shield can intercept — and that defeats the entire design. Connected folders are read by the local plugin first.
Build & verify locally
npm ci --ignore-scripts --legacy-peer-deps
npm run build   # esbuild bundle
npm run smoke   # node scripts/smoke-protocol.mjs
npm test        # 8 suites, incl. multi-doc HITL & session archival
CI runs the same matrix on Ubuntu / Windows / macOS × Node 18 / 20.
06 · compliance & audit

Three logs, all on-disk

Everything PII Shield does is recorded locally. No telemetry, no remote logging endpoint. If your DPO asks "what did the tool do with that file at 14:02 UTC," the answer is on your filesystem.

mcp_audit.log
Every tool call and response. Session lifecycle (create / extend / expire). The compliance trail your auditor will read first.
~/.pii_shield/audit/mcp_audit.log
ner_init.log
NER bootstrap and model-load events. Useful for first-run debugging and confirming which checkpoint version is active.
~/.pii_shield/logs/ner_init.log
pii_shield_server.log
MCP server lifecycle plus dropped stderr from the Node runtime. The catch-all when something feels off.
~/.pii_shield/logs/pii_shield_server.log
07 · features

What's in the box

01
Zero PII in API
anonymize_file reads the document on your machine and returns only a file path + session id. Claude reads the anonymized file from disk — PII never enters an API request.
02
GLiNER zero-shot NER
gliner-pii-base-v1.0 over onnxruntime-node. Handles ALL-CAPS, domain-specific names, multilingual text. No Python, no PyTorch.
03
Human-in-the-loop review
MCP Apps iframe UI rendered directly in Claude Desktop. Remove false positives, add missed entities — no localhost browser detour.
04
Entity deduplication
"Acme" → <ORG_1>, "Acme Corp." → <ORG_1a>, "Acme Corporation" → <ORG_1b>. One canonical form; every variant maps back correctly on restore.
05
Cross-session deanonymize
Each anonymized .docx carries its session_id in Word custom properties. Weeks later, in a brand-new chat, drop the file in — PII is restored from the embedded id.
06
Multi-file sessions
Anonymize N related documents under one session_id; identical entities share placeholders across files. One deanonymize call restores PII everywhere.
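The variant placeholders from the entity deduplication card (<ORG_1>, <ORG_1a>, <ORG_1b>) and the shared mapping behind multi-file sessions can be sketched with one registry reused across documents. This is a minimal sketch under stated assumptions: normalizeOrg and createOrgRegistry are hypothetical, and grouping variants by stripping corporate suffixes is a guess at one possible rule, not the tool's actual matching logic.

```javascript
// Assumption: two surface forms belong to the same entity when they share this key.
function normalizeOrg(name) {
  return name
    .toLowerCase()
    .replace(/[.,]/g, "")
    .replace(/\b(corp|corporation|inc|incorporated|ltd|limited|llc)\b/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

function createOrgRegistry() {
  const groups = new Map(); // normalized key -> { index, variants: [surface forms] }
  const reverse = new Map(); // placeholder -> exact surface form

  function placeholderFor(surface) {
    const key = normalizeOrg(surface);
    let group = groups.get(key);
    if (!group) {
      group = { index: groups.size + 1, variants: [] };
      groups.set(key, group);
    }
    let pos = group.variants.indexOf(surface);
    if (pos === -1) {
      group.variants.push(surface);
      pos = group.variants.length - 1;
    }
    // First-seen form gets the bare index; later variants get a, b, c, ...
    // (this toy version supports at most 26 extra variants per entity).
    const suffix = pos === 0 ? "" : String.fromCharCode(96 + pos);
    const placeholder = `<ORG_${group.index}${suffix}>`;
    reverse.set(placeholder, surface);
    return placeholder;
  }

  return { placeholderFor, restore: (placeholder) => reverse.get(placeholder) };
}

// One registry shared by two files keeps placeholders consistent across both.
const registry = createOrgRegistry();
const fileA = ["Acme", "Acme Corp."].map(registry.placeholderFor);
const fileB = ["Acme Corporation", "Acme", "Widget Inc."].map(registry.placeholderFor);
console.log(fileA, fileB);
```

Because each variant keeps its own placeholder, restoring a document puts back the exact surface form that was originally written, not just the canonical name.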
07
Encrypted team handoff
export_session(passphrase) packs the mapping + anonymized documents into an encrypted .pii-session archive (AES-GCM, with the key derived from the passphrase via scrypt). Colleague runs import_session. PII never transits.
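Passphrase-based AES-GCM encryption with a scrypt-derived key can be sketched with Node's built-in crypto module. The byte layout, the 256-bit key size, and the function names encryptMapping and decryptMapping are assumptions made for illustration; the actual .pii-session format is not described in this document.

```javascript
const crypto = require("node:crypto");

// Assumed layout: salt (16 bytes) | iv (12 bytes) | auth tag (16 bytes) | ciphertext.
function encryptMapping(mapping, passphrase) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12);
  const key = crypto.scryptSync(passphrase, salt, 32); // 256-bit key (assumed size)
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(mapping), "utf8"),
    cipher.final(),
  ]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]);
}

function decryptMapping(archive, passphrase) {
  const salt = archive.subarray(0, 16);
  const iv = archive.subarray(16, 28);
  const tag = archive.subarray(28, 44);
  const ciphertext = archive.subarray(44);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the passphrase is wrong or the archive was tampered with.
  const plain = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plain.toString("utf8"));
}

const archive = encryptMapping({ "<PERSON_1>": "John Smith" }, "correct horse");
console.log(decryptMapping(archive, "correct horse"));
```

GCM's authentication tag means a wrong passphrase fails loudly instead of producing a garbage mapping, which matters when the mapping is the only way back to the real names.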
08
Audit logging
Every tool call + response logged locally to ~/.pii_shield/audit/. NER bootstrap, session lifecycle, dropped stderr — all on disk, appendable, off-network.
08 · mcp tools

17 tools, four groups

Tools are grouped by purpose; each one is a single MCP function callable from Claude Desktop.

Anonymize (4)
  • anonymize_file (core)
  • anonymize_text
  • anonymize_docx
  • anonymize_next_chunk
Review (5)
  • start_review
  • apply_review_overrides
  • get_full_anonymized_text
  • deanonymize_text
  • deanonymize_docx
Session (3)
  • export_session
  • import_session
  • get_mapping
Utility (5)
  • scan_text (preview)
  • list_entities
  • find_file
  • resolve_path
  • install_model_from_download
Round-trip example
// 1. anonymize a file: original stays on disk; only placeholders cross the wire
anonymize_file({ file_path: "~/contracts/nda.docx" })
→ {
    status: "success",
    session_id: "m9q8x4-7b5a91",
    entity_count: 14,
    output_path: ".../nda.anon.txt",
    docx_output_path: ".../nda.anon.docx",
    by_type: { PERSON: 4, ORG: 3, MONEY: 2, ... }
  }

// 2. Claude reads output_path and drafts a memo using placeholders only

// 3. restore PII into the draft, locally
deanonymize_text({ text: draft, session_id: "m9q8x4-7b5a91" })
→ { status: "success", deanonymized_text: "...John Smith signed the NDA..." }

All inputs validated with Zod. Errors return MCP-protocol-shaped { isError: true, content: [...] } instead of throwing.
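The validate-then-return-an-error pattern can be sketched without any dependencies. The hand-rolled check below stands in for a Zod schema, and handleDeanonymizeText, validateDeanonymizeArgs and the in-memory sessions map are hypothetical. Only the { isError, content } result shape comes from the text above.

```javascript
// Hand-rolled validation standing in for a Zod schema (assumption for illustration).
function validateDeanonymizeArgs(args) {
  const problems = [];
  if (typeof args?.text !== "string") problems.push("text must be a string");
  if (typeof args?.session_id !== "string" || args.session_id.length === 0) {
    problems.push("session_id must be a non-empty string");
  }
  return problems;
}

// Hypothetical handler: `sessions` maps session_id -> Map(placeholder -> real value).
function handleDeanonymizeText(args, sessions) {
  const problems = validateDeanonymizeArgs(args);
  if (problems.length > 0) {
    // Report bad input as a tool result instead of throwing.
    return { isError: true, content: [{ type: "text", text: problems.join("; ") }] };
  }
  const mapping = sessions.get(args.session_id);
  if (!mapping) {
    return {
      isError: true,
      content: [{ type: "text", text: `unknown session ${args.session_id}` }],
    };
  }
  const restored = args.text.replace(/<[A-Z_]+_\d+[a-z]?>/g, (p) => mapping.get(p) ?? p);
  const payload = { status: "success", deanonymized_text: restored };
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}

const sessions = new Map([["m9q8x4-7b5a91", new Map([["<PERSON_1>", "John Smith"]])]]);
console.log(handleDeanonymizeText({ text: "<PERSON_1> signed." }, sessions));
```

Returning the error as data keeps the server process alive and lets the model read the message and retry with corrected arguments.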

09 · stack

Built with

Node.js 18+ · MCP (Model Context Protocol) · GLiNER · onnxruntime-node · @xenova/transformers · DOCX (pure JS) · pdf-parse · AES-GCM / scrypt · MIT license
Tested on
Ubuntu · Windows · macOS · Node 18 · Node 20 · 8 test suites · MCP protocol smoke · Zod schema validation