PII Detection Reference

RADAR detects 50+ PII patterns across 7 categories with configurable confidence thresholds. This reference covers what is detected, how scoring works, and how to configure detection for your environment.

Detection Model

PII detection runs on every agent prompt and LLM response that passes through RADAR. Detection combines regex pattern matching, format validation (Luhn checksum, IBAN structure), ML-based named entity recognition (NER), and cross-reference validation to keep false-positive rates low. Each detection produces a finding with a confidence score and the matched context.
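As a rough illustration of how pattern matching and format validation can be layered, the Python sketch below pairs a regex candidate match with a Luhn checksum. The regex, the function names, and the overall structure are assumptions made for this example; they are not RADAR's actual implementation.

```python
import re

# Hypothetical candidate pattern: 13-19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return regex candidates that also pass the checksum."""
    return [
        match.group(0)
        for match in CARD_CANDIDATE.finditer(text)
        if luhn_valid(match.group(0))
    ]
```

In this sketch a 16-digit order number that happens to match the regex is dropped by the checksum step, which is the kind of filtering that format validation contributes.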

Confidence scoring

Each detection includes a confidence score (0.0–1.0) based on pattern match quality, context analysis, and cross-validation. Teams set per-category thresholds. Findings below the threshold are recorded as info-level; findings at or above it generate actionable alerts.
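The threshold rule can be sketched in a few lines of Python. The `Finding` type, the `classify` function, and the threshold values here are hypothetical, and the sketch assumes the threshold is an inclusive minimum.

```python
from dataclasses import dataclass

# Illustrative per-category thresholds (assumed values, not RADAR defaults).
THRESHOLDS = {"financial": 0.95, "health": 0.90, "contact": 0.85}


@dataclass
class Finding:
    category: str
    confidence: float  # 0.0-1.0
    context: str


def classify(finding: Finding, thresholds: dict[str, float] = THRESHOLDS) -> str:
    """Return "alert" when confidence meets the category threshold, else "info".

    Assumes the threshold is inclusive (a minimum confidence).
    """
    threshold = thresholds[finding.category]
    return "alert" if finding.confidence >= threshold else "info"
```

Under these assumed values, a financial finding scored 0.97 is classified as an alert, while one scored 0.93 is kept as info-level.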

PII Categories

Financial

12 patterns · confidence 0.92–0.99

Credit card numbers (Visa, Mastercard, AmEx, Discover), bank account numbers, routing numbers, IBAN, SWIFT/BIC codes, CUSIP/SEDOL identifiers, tax IDs (EIN, VAT), invoice numbers
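For structured identifiers such as IBANs, format validation usually means checking the overall shape and then the standard mod-97 check digits. The Python sketch below shows that generic check; the function name and the shape regex are assumptions, and it does not enforce country-specific lengths.

```python
import re

# Basic shape: two-letter country code, two check digits,
# then 11 to 30 alphanumeric characters.
IBAN_SHAPE = re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$")


def iban_valid(iban: str) -> bool:
    """Return True if `iban` has a plausible shape and passes the mod-97 check."""
    compact = iban.replace(" ", "").upper()
    if not IBAN_SHAPE.match(compact):
        return False
    # Move the first four characters to the end, then map A-Z to 10-35.
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A check like this rejects most random alphanumeric strings that merely look like an IBAN, which helps explain the high confidence range quoted for this category.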

Personal identifiers

10 patterns · confidence 0.90–0.99

SSN, passport numbers, driver license numbers (US, EU, UK), national ID numbers (INSEE, CPF, NINO), voter IDs, biometric data references, citizenship numbers, residence permit numbers

Contact information

8 patterns · confidence 0.85–0.98

Email addresses, phone numbers (international), physical addresses (structured extraction), IP addresses, social media handles, URLs with personal context, mailing addresses

Health information

8 patterns · confidence 0.88–0.97

Medical record numbers (MRN), health plan IDs, diagnosis codes (ICD-10-CM), procedure codes (CPT, ICD-10-PCS), prescription details, lab results, clinical notes with patient context

Employment

6 patterns · confidence 0.85–0.95

Employee IDs, salary information, performance review data, disciplinary records, job offer details, background check data, termination records

Digital identity

8 patterns · confidence 0.93–0.99

API keys (AWS, OpenAI, Stripe, GitHub), access tokens, session cookies, OAuth tokens, encryption keys (PEM, SSH), database connection strings, cloud provider credentials, JWTs

Demographic & protected

6 patterns · confidence 0.80–0.93

Age with DOB, gender with clinical context, racial/ethnic data, religious affiliation, trade union membership, political opinions, sexual orientation

Configuration

Configure confidence thresholds per category and set redaction behavior. Findings below a category's threshold are informational; findings at or above it generate actionable alerts.

bash

# Set confidence threshold per category
docker exec radar radar pii threshold --category financial --min-confidence 0.95
docker exec radar radar pii threshold --category health --min-confidence 0.90
docker exec radar radar pii threshold --category digital_identity --min-confidence 0.90

# View current thresholds
docker exec radar radar pii thresholds
# Category         | Threshold | Alerts (24h)
# financial        | 0.95      | 12
# personal_id      | 0.90      | 45
# contact          | 0.85      | 89
# health           | 0.90      | 7
# employment       | 0.85      | 23
# digital_identity | 0.90      | 156
# demographic      | 0.80      | 3

Strict (0.95+)

Production environments where false positives have high operational cost. Only the most confident detections generate actionable alerts; everything else is recorded as info-level.

Standard (0.85+)

Balances detection coverage with false-positive management. Recommended for most evaluation and production deployments.

Broad (0.70+)

Initial deployment sweep to identify all potential PII exposure points. Review findings per category and tighten thresholds iteratively.

Redaction Modes

RADAR supports three redaction modes applied in-stream before the prompt reaches the model or before the response reaches the user.

Mask

Replace value with masked format. Full SSNs become "XXX-XX-1234". Credit cards become "4111-XXXX-XXXX-1234". Original value is not stored.
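A minimal Python sketch of the two masked formats quoted above, assuming hyphen-separated input. The regexes and the `mask` function are illustrative and not RADAR's own code.

```python
import re

# Hyphenated SSN: keep only the last four digits.
SSN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")
# Hyphenated 16-digit card: keep the first and last four digits.
CARD = re.compile(r"\b(\d{4})-\d{4}-\d{4}-(\d{4})\b")


def mask(text: str) -> str:
    """Mask hyphenated SSNs and 16-digit card numbers found in `text`."""
    text = SSN.sub(r"XXX-XX-\1", text)
    return CARD.sub(r"\1-XXXX-XXXX-\2", text)
```

Because the substitution keeps only the captured groups, the middle digits never appear in the output, which matches the statement that the original value is not stored.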

Hash

Replace the value with a deterministic SHA-256 hash, so the same input always yields the same digest. This lets you correlate occurrences across sessions without exposing raw PII. Useful for breach detection and pattern analysis.
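Deterministic hashing can be sketched with Python's standard `hashlib`. The function names and the `[sha256:...]` output tag are assumptions for illustration; the source does not specify how RADAR formats hashed values.

```python
import hashlib


def hash_value(value: str) -> str:
    """Return the hex SHA-256 digest of the UTF-8 encoded value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def redact_with_hash(text: str, value: str) -> str:
    """Replace every occurrence of `value` in `text` with a tagged digest."""
    return text.replace(value, f"[sha256:{hash_value(value)}]")
```

Correlation only works if the same value is written the same way each time: in this sketch "123-45-6789" and "123456789" produce different digests unless the input is normalized first.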

Block

Prevent the request or response from being sent when PII exceeds configured severity. Recorded as a policy finding with escalation path.

bash

# Configure redaction
docker exec radar radar pii redact --mode mask --categories financial,health
docker exec radar radar pii redact --mode block --categories digital_identity

# Verify redaction rules
docker exec radar radar pii redact --status