Datawizz lets you configure endpoints to anonymize logs by removing personally identifiable information (PII) from them. This can help protect user privacy and support compliance with regulations such as GDPR.

If there’s a high risk of users sending sensitive PII to your AI system, consider enabling PII removal so that this information is not stored in the logs. You can also narrow the scope of PII removal by selecting specific entity types to remove.

Configuring PII Removal

To enable PII removal, turn on the Anonymize Logs option in the endpoint settings. You may also configure two additional options:

  • Score Threshold: The minimum confidence score required to remove an entity from the log. This is a number between 0 and 1, where 0 is the lowest confidence and 1 is the highest. Entities with a confidence score below this threshold are not removed.
  • Entities: A comma-separated list of entity types to remove from the logs. If you leave this field blank, all supported entity types are removed. See the supported entity types below.
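
The way the two options interact can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Datawizz's implementation: the Detection class, the parse_entities and redact helpers, and the <ENTITY_TYPE> placeholder format are all hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detected entity: its type, character span, and confidence."""
    entity_type: str
    start: int
    end: int
    score: float


def parse_entities(field: str) -> set[str]:
    """Parse a comma-separated Entities value; an empty set means 'all types'."""
    return {name.strip().upper() for name in field.split(",") if name.strip()}


def redact(text: str, detections: list[Detection],
           score_threshold: float, entities_field: str = "") -> str:
    """Replace detections that meet the threshold and match the entity list."""
    wanted = parse_entities(entities_field)
    selected = [
        d for d in detections
        if d.score >= score_threshold              # below the threshold: keep
        and (not wanted or d.entity_type in wanted)  # blank list: all types
    ]
    # Replace from the end of the text so earlier offsets stay valid.
    for d in sorted(selected, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"<{d.entity_type}>" + text[d.end:]
    return text


if __name__ == "__main__":
    text = "Email jane@example.com or call 555-0100"
    e, p = text.index("jane@example.com"), text.index("555-0100")
    found = [
        Detection("EMAIL_ADDRESS", e, e + len("jane@example.com"), 0.95),
        Detection("PHONE_NUMBER", p, p + len("555-0100"), 0.40),
    ]
    print(redact(text, found, score_threshold=0.5))
    print(redact(text, found, score_threshold=0.3, entities_field="PHONE_NUMBER"))
```

With a threshold of 0.5, only the high-confidence email address is replaced. Lowering the threshold to 0.3 and restricting the list to PHONE_NUMBER replaces only the phone number. A lower threshold removes more text at the cost of more false positives.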

List of supported entities

Global

Entity Type | Description | Detection Method
CREDIT_CARD | A credit card number is between 12 and 19 digits. https://en.wikipedia.org/wiki/Payment_card_number | Pattern match and checksum
CRYPTO | A crypto wallet number. Currently only Bitcoin addresses are supported. | Pattern match, context and checksum
DATE_TIME | Absolute or relative dates or periods, or times smaller than a day. | Pattern match and context
EMAIL_ADDRESS | An email address identifies an email box to which email messages are delivered. | Pattern match, context and RFC-822 validation
IBAN_CODE | The International Bank Account Number (IBAN) is an internationally agreed system of identifying bank accounts across national borders to facilitate the communication and processing of cross-border transactions with a reduced risk of transcription errors. | Pattern match, context and checksum
IP_ADDRESS | An Internet Protocol (IP) address (either IPv4 or IPv6). | Pattern match, context and checksum
NRP | A person’s nationality, religious or political group. | Custom logic and context
LOCATION | Name of a politically or geographically defined location (cities, provinces, countries, international regions, bodies of water, mountains). | Custom logic and context
PERSON | A full person name, which can include first names, middle names or initials, and last names. | Custom logic and context
PHONE_NUMBER | A telephone number. | Custom logic, pattern match and context
MEDICAL_LICENSE | Common medical license numbers. | Pattern match, context and checksum
URL | A URL (Uniform Resource Locator), a unique identifier used to locate a resource on the Internet. | Pattern match, context and top-level URL validation
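
Several entities above list a checksum as part of their detection method. For payment card numbers, the standard checksum is the Luhn algorithm. The sketch below shows how a checksum can rule out digit strings that merely look like card numbers; the luhn_valid function is a hypothetical helper written for this illustration, and the recognizer actually used by the underlying framework may differ in its details.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Spaces and dashes are ignored. The 12 to 19 digit length check
    mirrors the CREDIT_CARD description in the table above.
    """
    digits = [int(c) for c in number if c.isdigit()]
    if not 12 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("4111 1111 1111 1111"))  # widely used test number
    print(luhn_valid("4111 1111 1111 1112"))  # last digit changed
```

The first call returns True and the second returns False, since changing a single digit breaks the checksum. This is why checksum-backed entities tend to produce fewer false positives than entities detected by pattern and context alone.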

USA

Entity Type | Description | Detection Method
US_BANK_NUMBER | A US bank account number is between 8 and 17 digits. | Pattern match and context
US_DRIVER_LICENSE | A US driver license according to https://ntsi.com/drivers-license-format/ | Pattern match and context
US_ITIN | US Individual Taxpayer Identification Number (ITIN). Nine digits that start with a “9” and contain a “7” or “8” as the fourth digit. | Pattern match and context
US_PASSPORT | A US passport number with 9 digits. | Pattern match and context
US_SSN | A US Social Security Number (SSN) with 9 digits. | Pattern match and context

UK

Entity Type | Description | Detection Method
UK_NHS | A UK NHS number is 10 digits. | Pattern match, context and checksum
UK_NINO | UK National Insurance Number is a unique identifier used in the administration of National Insurance and tax. | Pattern match and context

Spain

Entity Type | Description | Detection Method
ES_NIF | A Spanish NIF number (personal tax ID). | Pattern match, context and checksum
ES_NIE | A Spanish NIE number (foreigners ID card). | Pattern match, context and checksum

Italy

Entity Type | Description | Detection Method
IT_FISCAL_CODE | An Italian personal identification code. https://en.wikipedia.org/wiki/Italian_fiscal_code | Pattern match, context and checksum
IT_DRIVER_LICENSE | An Italian driver license number. | Pattern match and context
IT_VAT_CODE | An Italian VAT code number. | Pattern match, context and checksum
IT_PASSPORT | An Italian passport number. | Pattern match and context
IT_IDENTITY_CARD | An Italian identity card number. https://en.wikipedia.org/wiki/Italian_electronic_identity_card | Pattern match and context

Poland

Entity Type | Description | Detection Method
PL_PESEL | Polish PESEL number. | Pattern match, context and checksum

Singapore

Entity Type | Description | Detection Method
SG_NRIC_FIN | A National Registration Identification Card. | Pattern match and context
SG_UEN | A Unique Entity Number (UEN) is a standard identification number for entities registered in Singapore. | Pattern match, context and checksum

Australia

Entity Type | Description | Detection Method
AU_ABN | The Australian Business Number (ABN) is a unique 11-digit identifier issued to all entities registered in the Australian Business Register (ABR). | Pattern match, context and checksum
AU_ACN | An Australian Company Number is a unique nine-digit number issued by the Australian Securities and Investments Commission to every company registered under the Commonwealth Corporations Act 2001 as an identifier. | Pattern match, context and checksum
AU_TFN | The tax file number (TFN) is a unique identifier issued by the Australian Taxation Office to each taxpaying entity. | Pattern match, context and checksum
AU_MEDICARE | A Medicare number is a unique identifier issued by the Australian Government that enables the cardholder to receive rebates of medical expenses under Australia’s Medicare system. | Pattern match, context and checksum

India

Entity Type | Description | Detection Method
IN_PAN | The Indian Permanent Account Number (PAN) is a unique 10-character alphanumeric identifier issued to all business and individual entities registered as taxpayers. | Pattern match and context
IN_AADHAAR | An Indian government-issued unique 12-digit individual identity number. | Pattern match, context and checksum
IN_VEHICLE_REGISTRATION | An Indian government-issued transport (government, personal, diplomatic, defence) vehicle registration number. | Pattern match, context and checksum
IN_VOTER | A 10-character alphanumeric voter ID issued by the Indian Election Commission to all Indian citizens (age 18 or above). | Pattern match and context
IN_PASSPORT | An Indian passport number. | Pattern match and context

Finland

Entity Type | Description | Detection Method
FI_PERSONAL_IDENTITY_CODE | The Finnish Personal Identity Code (Henkilötunnus) is a unique 11-character individual identity number. | Pattern match, context and custom logic

Credits

This feature is powered by Microsoft’s Presidio framework.