Open-source · Apache 2.0 · 35 entity types · runs locally
Make your AI data processing EU‑compliant
Detect and anonymize personal data before it enters your AI pipeline. The 35 entity types cover personal data as defined by GDPR Art. 4(1) and the special categories under Art. 9(1), and support the anonymization requirements of EU AI Act Art. 10(5). Built on bardsai/eu-pii-anonimization-multilang, an open-source XLM-RoBERTa token classifier that runs entirely in your browser via ONNX/WebAssembly, so no data leaves your device.
Results preview
Detected Entities
| Entity | Type | Regulation | Confidence |
|---|---|---|---|
| Anna Wisniewska | PERSON_NAME | Art. 4(1) | 100% |
| IT | PERSON_ROLE_OR_TITLE | Art. 4(1) | 85% |
| 12,500 PLN | FINANCIAL_AMOUNT | Art. 4(1) | 91% |
| EP1234567 | ORGANIZATION_IDENTIFIER | Art. 4(1) | 64% |
| 527-020-1234 | ORGANIZATION_IDENTIFIER | Art. 4(1) | 100% |
| 22 Piekna Street, 00-549 | POSTAL_ADDRESS | Art. 4(1) | 100% |
| Warsaw | LOCATION | Art. 4(1) | 89% |
| Catholicism | RELIGION_OR_BELIEF | Art. 9(1) | 100% |
| NSZZ Solidarność | TRADE_UNION_MEMBERSHIP | Art. 9(1) | 53% |
Entity taxonomy
35 entity types across 8 categories, each mapped to a specific GDPR provision. Art. 9(1) special categories (health, biometric, genetic, political, religious, ethnic, sexual orientation, trade union) are flagged separately to support higher-risk processing rules.
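As a rough illustration of the separate flagging, the sketch below splits detections into standard and special-category groups. The detection dictionaries and the `split_by_risk` helper are assumptions made for this example, not the model's actual output format or API; the type names are taken from the Health & Biometric and Special Categories groups of the taxonomy.

```python
# Hypothetical helper: the set below mirrors the entity types this page lists
# under "Health & Biometric" and "Special Categories" (GDPR Art. 9(1)).
SPECIAL_CATEGORY_TYPES = {
    "HEALTH_DATA", "GENETIC_DATA", "BIOMETRIC_DATA",
    "RELIGION_OR_BELIEF", "POLITICAL_OPINION", "SEXUAL_ORIENTATION",
    "TRADE_UNION_MEMBERSHIP", "ETHNIC_ORIGIN", "CRIMINAL_OFFENCE_DATA",
}


def split_by_risk(detections):
    """Return (standard, special) lists, keeping the input order."""
    standard, special = [], []
    for det in detections:
        if det["type"] in SPECIAL_CATEGORY_TYPES:
            special.append(det)
        else:
            standard.append(det)
    return standard, special


# Assumed detection shape: one dict per entity with "text" and "type" keys.
detections = [
    {"text": "Anna Wisniewska", "type": "PERSON_NAME"},
    {"text": "Catholicism", "type": "RELIGION_OR_BELIEF"},
    {"text": "Warsaw", "type": "LOCATION"},
]
standard, special = split_by_risk(detections)
```

Keeping the special-category list separate makes it straightforward to apply a stricter rule to it, for example always redacting those spans regardless of confidence.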
| Category | Regulation | Entity types |
|---|---|---|
| Personal Identity | GDPR Art. 4(1) | PERSON_NAME, DATE_OF_BIRTH, PERSON_ATTRIBUTE, PERSON_ALIAS, PERSON_IDENTIFIER |
| Organizations | GDPR Art. 4(1) | ORGANIZATION_NAME, ORGANIZATION_IDENTIFIER |
| Contact & Location | GDPR Art. 4(1) | EMAIL_ADDRESS, PHONE_NUMBER, CONTACT_HANDLE, POSTAL_ADDRESS, LOCATION, GEO_LOCATION |
| Technical Identifiers | GDPR Recital 30 | IP_ADDRESS, DEVICE_IDENTIFIER, COOKIE_IDENTIFIER, ACCOUNT_IDENTIFIER, AUTH_SECRET |
| Financial | GDPR Art. 4(1) | BANK_ACCOUNT_IDENTIFIER, PAYMENT_CARD, PAYMENT_CARD_SECURITY, DOCUMENT_REFERENCE, FINANCIAL_AMOUNT, INCOME_COMPENSATION, VEHICLE_IDENTIFIER |
| Health & Biometric | GDPR Art. 9(1) | HEALTH_DATA, GENETIC_DATA, BIOMETRIC_DATA |
| Special Categories | GDPR Art. 9(1) | RELIGION_OR_BELIEF, POLITICAL_OPINION, SEXUAL_ORIENTATION, TRADE_UNION_MEMBERSHIP, ETHNIC_ORIGIN, CRIMINAL_OFFENCE_DATA |
| Employment | GDPR Art. 88 | PERSON_ROLE_OR_TITLE |

Regulatory context
This model addresses requirements from two EU regulations. Below are the specific provisions that PII detection and anonymisation tools help satisfy.
GDPR (Regulation 2016/679)
- Art. 4(1) — Personal data
- Defines the scope: any information relating to an identified or identifiable natural person. The 35 entity types in this model map directly to the identifiers listed here.
- Art. 5(1)(c) — Data minimisation
- Personal data must be adequate, relevant, and limited to what is necessary. Automated PII detection is a technical measure to enforce minimisation at scale.
- Art. 9(1) — Special categories
- Processing of health, biometric, genetic, racial/ethnic, political, religious, trade union, and sexual orientation data is prohibited by default. This model detects all eight special categories.
- Art. 25 — Data protection by design
- Controllers must adopt appropriate technical and organisational measures, such as pseudonymisation, to implement data-protection principles. PII detection is a prerequisite for pseudonymisation.
- Art. 32(1)(a) — Security of processing
- Lists pseudonymisation and encryption as appropriate security measures. Automated PII detection enables systematic pseudonymisation of personal data.
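To make the link between detection and pseudonymisation concrete, here is a minimal sketch that replaces detected spans with keyed-hash pseudonyms. Keyed hashing (HMAC-SHA256) is one common pseudonymisation technique chosen for illustration; it is not necessarily what this tool does. The span format `(start, end, entity_type)`, the function names, and the demo key are assumptions.

```python
import hashlib
import hmac


def pseudonymise(value: str, entity_type: str, key: bytes) -> str:
    """Derive a stable pseudonym for one value using HMAC-SHA256."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[{entity_type}:{digest[:12]}]"


def pseudonymise_text(text: str, spans, key: bytes) -> str:
    """Replace each (start, end, entity_type) span with its pseudonym.

    Spans are assumed to be non-overlapping. They are processed right to
    left so that earlier offsets stay valid while the string changes.
    """
    out = text
    for start, end, entity_type in sorted(spans, reverse=True):
        out = out[:start] + pseudonymise(text[start:end], entity_type, key) + out[end:]
    return out


text = "Contact Anna Wisniewska in Warsaw."
name_start = text.index("Anna Wisniewska")
city_start = text.index("Warsaw")
spans = [
    (name_start, name_start + len("Anna Wisniewska"), "PERSON_NAME"),
    (city_start, city_start + len("Warsaw"), "LOCATION"),
]
key = b"demo-key-store-the-real-one-separately"  # placeholder secret
result = pseudonymise_text(text, spans, key)
```

The same value always maps to the same pseudonym under a given key, so records can still be joined, while anyone without the key cannot recompute the mapping. The key therefore has to be stored apart from the pseudonymised data.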
EU AI Act (Regulation 2024/1689)
- Art. 10(5) — Data governance for high-risk AI
- Special-category personal data may be processed for bias detection and correction only when anonymised or synthetic data cannot fulfil the purpose. When such processing is necessary, it requires state-of-the-art security and privacy-preserving measures, including pseudonymisation.
- Art. 59 — AI regulatory sandboxes
- Personal data in sandboxes may only be processed when anonymised or synthetic data is insufficient. Establishes an anonymisation-first principle for AI development.
- Recital 69 — Privacy throughout AI lifecycle
- Data minimisation and protection by design must be ensured throughout the entire AI lifecycle. Lists anonymisation and encryption as compliance measures.
- Annex III, §1 — High-risk: biometrics
- Remote biometric identification and categorisation systems based on sensitive attributes are classified as high-risk, requiring additional compliance obligations.
LLM inference without PII exposure
Send data to any third-party LLM safely. PII is replaced with indexed tokens before leaving your infrastructure, and restored in the final output.
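The replace-and-restore round trip described above can be sketched as follows. This is an illustrative outline only: the span format `(start, end, entity_type)` and the helper names `find_span`, `anonymize`, and `restore` are assumptions, and the detection step itself is stubbed out with known substrings.

```python
import re

TOKEN_PATTERN = re.compile(r"\[[A-Z_]+_\d+\]")


def find_span(text, value, entity_type):
    """Stand-in for the detector: locate a known value in the text."""
    start = text.index(value)
    return (start, start + len(value), entity_type)


def anonymize(text, spans):
    """Replace spans with indexed tokens; return (masked_text, mapping)."""
    counters = {}   # entity_type -> last index used
    token_for = {}  # (entity_type, value) -> token, so repeats share a token
    mapping = {}    # token -> original value, kept on your side only
    pieces, cursor = [], 0
    for start, end, entity_type in sorted(spans):
        value = text[start:end]
        if (entity_type, value) not in token_for:
            counters[entity_type] = counters.get(entity_type, 0) + 1
            token = f"[{entity_type}_{counters[entity_type]}]"
            token_for[(entity_type, value)] = token
            mapping[token] = value
        pieces.append(text[cursor:start])
        pieces.append(token_for[(entity_type, value)])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


def restore(text, mapping):
    """Put original values back; unknown tokens are left untouched."""
    return TOKEN_PATTERN.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


prompt = "Invoice issued to John Kowalski, tax ID 527-020-1234."
spans = [
    find_span(prompt, "John Kowalski", "PERSON_NAME"),
    find_span(prompt, "527-020-1234", "ORGANIZATION_IDENTIFIER"),
]
masked, mapping = anonymize(prompt, spans)

# Pretend this string came back from the third-party LLM.
llm_output = '{"vendor": "[PERSON_NAME_1]", "tax_id": "[ORGANIZATION_IDENTIFIER_1]"}'
restored = restore(llm_output, mapping)
```

Only `masked` is sent out; `mapping` never leaves your infrastructure, which is what allows the final output to be restored locally.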
Original prompt:
Extract the invoice details: issued to John Kowalski, tax ID 527-020-1234, address 10/5 Marszalkowska Street, 00-001 Warsaw, account PL61 1090 1014 0000 0712 1981 2874, amount 12,500 PLN

Anonymized prompt sent to the LLM:
Extract the invoice details: issued to [PERSON_NAME_1], tax ID [ORGANIZATION_IDENTIFIER_1], address [POSTAL_ADDRESS_1], account [BANK_ACCOUNT_IDENTIFIER_1], amount [FINANCIAL_AMOUNT_1]

LLM response:
{
"vendor": "[PERSON_NAME_1]",
"tax_id": "[ORGANIZATION_IDENTIFIER_1]",
"address": "[POSTAL_ADDRESS_1]",
"iban": "[BANK_ACCOUNT_IDENTIFIER_1]",
"amount": "[FINANCIAL_AMOUNT_1]"
}

Restored output:
{
"vendor": "John Kowalski",
"tax_id": "527-020-1234",
"address": "10/5 Marszalkowska Street, ...",
"iban": "PL61 1090 ... 2874",
"amount": "12,500 PLN"
}

Token mapping:
- [PERSON_NAME_1] → John Kowalski
- [ORGANIZATION_IDENTIFIER_1] → 527-020-1234
- [POSTAL_ADDRESS_1] → 10/5 Marszalkowska Street, ...
- [BANK_ACCOUNT_IDENTIFIER_1] → PL61 1090 ... 2874
- [FINANCIAL_AMOUNT_1] → 12,500 PLN

GDPR-safe model training data
Prepare datasets that comply with EU AI Act Art. 10(5). Real PII is replaced with synthetic data: the text structure stays intact for learning, but the result contains zero real personal information.
Original:
Patient Emil Nowak (national ID: 91082734567), living at 22 Lipowa Street, 30-702 Krakow, presented with chest pain. Attending physician: Dr. Anna Wisniewska.

Synthetic:
Patient Thomas Zielinski (national ID: 85120498321), living at 7 Debowa Street, 50-307 Wroclaw, presented with headache. Attending physician: Dr. Katarzyna Lewandowska.

Text structure preserved. No real person identifiable. Compliant with GDPR Art. 89 and EU AI Act Art. 10(5).

Replacements:
- Emil Nowak → Thomas Zielinski
- 91082734567 → 85120498321
- 22 Lipowa Street, ... → 7 Debowa Street, ...
- chest pain → headache
- Dr. Anna Wisniewska → Dr. K. Lewandowska
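A simplified sketch of this kind of synthetic replacement is shown below. It covers only names and numeric identifiers; the surrogate name pool, the generator functions, and the span format `(start, end, entity_type)` are all assumptions for illustration, and the generated digits are random rather than checksum-valid national IDs.

```python
import random

# Hypothetical surrogate pool; a real setup would use a much larger one.
NAMES = ["Thomas Zielinski", "Katarzyna Lewandowska", "Piotr Mazur", "Maria Kaczmarek"]


def fake_name(rng, original):
    """Pick a surrogate name that differs from the original."""
    return rng.choice([n for n in NAMES if n != original])


def fake_digits(rng, original):
    """Randomize digits but keep length and separators, so the format survives."""
    if not any(ch.isdigit() for ch in original):
        return original  # nothing to randomize in this sketch
    while True:
        candidate = "".join(
            rng.choice("0123456789") if ch.isdigit() else ch for ch in original
        )
        if candidate != original:
            return candidate


GENERATORS = {"PERSON_NAME": fake_name, "PERSON_IDENTIFIER": fake_digits}


def synthesize(text, spans, seed=0):
    """Swap each span for a synthetic value; repeated values get the same one."""
    rng = random.Random(seed)
    replacements = {}  # (entity_type, original) -> synthetic value
    pieces, cursor = [], 0
    for start, end, entity_type in sorted(spans):
        original = text[start:end]
        key = (entity_type, original)
        if key not in replacements:
            candidate = GENERATORS[entity_type](rng, original)
            # Assumes the pool is larger than the number of distinct originals.
            while candidate in replacements.values():
                candidate = GENERATORS[entity_type](rng, original)
            replacements[key] = candidate
        pieces += [text[cursor:start], replacements[key]]
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), replacements


text = ("Patient Emil Nowak (national ID: 91082734567) presented with chest pain. "
        "Attending physician: Dr. Anna Wisniewska.")
spans = []
for value, entity_type in [("Emil Nowak", "PERSON_NAME"),
                           ("91082734567", "PERSON_IDENTIFIER"),
                           ("Anna Wisniewska", "PERSON_NAME")]:
    start = text.index(value)
    spans.append((start, start + len(value), entity_type))

synthetic, replacements = synthesize(text, spans, seed=0)
```

Seeding the random generator keeps the output reproducible across runs, and reusing one surrogate per original value keeps references consistent within a document.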