Personally Identifiable Information (PII)

DEFINITION of 'Personally Identifiable Information (PII)'

Personally identifiable information (PII) is information that, when used alone or with other relevant data, can identify an individual. PII may contain direct identifiers (e.g., passport information) that can identify a person uniquely, or quasi-identifiers (e.g., race) that can be combined with other quasi-identifiers (e.g., date of birth) to recognize an individual.

BREAKING DOWN 'Personally Identifiable Information (PII)'

New technology platforms have changed the way businesses operate, governments legislate, and individuals relate. With digital tools like cell phones, the internet, e-commerce, and social media, there has been an explosion in the supply of data of all kinds. Big Data, as it is called, is being collected, analyzed, and processed by businesses and shared with other companies. The wealth of information provided by Big Data has enabled companies to gain insight into how to better interact with customers. However, the emergence of Big Data has also increased the number of data breaches and cyberattacks by entities who realize how valuable the information is. This has raised concerns over how companies handle their consumers' sensitive information. Regulatory bodies are seeking new laws to protect consumer data, while users are looking for more anonymous ways to stay online.

Personally identifiable information (PII) can be sensitive or non-sensitive. Sensitive personal information includes details like full name, Social Security Number (SSN), driver’s license, mailing address, credit card information, passport information, and financial information. This is by no means an exhaustive list of what comprises PII. Companies that share data about their clients normally use anonymization techniques to encrypt and obfuscate the PII so that it is received in a non-personally identifiable form. For example, an insurance company that shares its clients’ information with a marketing company will mask the sensitive PII included in the data and leave only the information related to the marketing company’s goal.
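The masking step in the insurance example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the sample record, and the `mask_record` helper are hypothetical, and a real anonymization pipeline would typically involve more than dropping fields.

```python
# Hypothetical field names for the sensitive PII listed above;
# a real schema would differ.
SENSITIVE_FIELDS = {
    "full_name",
    "ssn",
    "drivers_license",
    "mailing_address",
    "credit_card",
    "passport_number",
}


def mask_record(record, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of the record with the sensitive fields removed."""
    return {
        field: value
        for field, value in record.items()
        if field not in sensitive_fields
    }


# Made-up client record held by the insurance company.
client = {
    "full_name": "Jane Example",
    "ssn": "000-00-0000",
    "mailing_address": "1 Example Street",
    "age_band": "30-39",
    "policy_type": "auto",
}

# Only the fields relevant to the marketing company's goal remain.
shared = mask_record(client)
print(shared)
```

Note that the fields left in the shared record may still act as quasi-identifiers, so dropping the sensitive fields alone does not guarantee anonymity.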

Non-sensitive or indirect PII is easily accessible from sources like phonebooks, the internet, and corporate directories. Zip code, race, gender, and date of birth are all quasi-identifiers and examples of non-sensitive information that can be released to the public. This type of information cannot be used alone to determine an individual’s identity. Non-sensitive information, although not delicate, is linkable. This means that non-sensitive data, when combined with other linkable personal information, can reveal the identity of an individual. De-anonymization and re-identification techniques tend to succeed when multiple sets of quasi-identifiers are pieced together and can be used to distinguish one person from another.
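The linking of quasi-identifiers described above can be illustrated with a minimal Python sketch. All of the data is made up, and the field names (including `diagnosis`) and the `link_records` helper are assumptions introduced for the example. It joins a data set that has had names removed against a public directory on zip code, date of birth, and gender, and treats a record as re-identified only when exactly one person in the directory matches.

```python
# Quasi-identifiers shared by both data sets (hypothetical field names).
QUASI_IDENTIFIERS = ("zip_code", "date_of_birth", "gender")


def quasi_key(record):
    """Build a tuple of the record's quasi-identifier values."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(anonymized, public):
    """Pair each anonymized row with a name when the match is unique."""
    people_by_key = {}
    for person in public:
        people_by_key.setdefault(quasi_key(person), []).append(person)

    matches = []
    for row in anonymized:
        candidates = people_by_key.get(quasi_key(row), [])
        if len(candidates) == 1:  # exactly one person fits this combination
            matches.append((candidates[0]["name"], row))
    return matches


# Made-up "anonymized" records: names removed, quasi-identifiers kept.
anonymized = [
    {"zip_code": "02139", "date_of_birth": "1980-04-12", "gender": "F",
     "diagnosis": "asthma"},
    {"zip_code": "02139", "date_of_birth": "1975-09-30", "gender": "M",
     "diagnosis": "diabetes"},
]

# Made-up public directory containing names and the same quasi-identifiers.
public = [
    {"name": "Alice Example", "zip_code": "02139",
     "date_of_birth": "1980-04-12", "gender": "F"},
    {"name": "Bob Example", "zip_code": "02139",
     "date_of_birth": "1975-09-30", "gender": "M"},
    {"name": "Carl Example", "zip_code": "02139",
     "date_of_birth": "1975-09-30", "gender": "M"},
]

reidentified = link_records(anonymized, public)
for name, row in reidentified:
    print(name, row["diagnosis"])
```

In this sample, the first record is linked to a single person, while the second stays ambiguous because two people in the directory share the same combination of quasi-identifiers.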

Cybercriminals breach data systems to access PII, which is then sold to willing buyers in underground digital marketplaces. For example, in 2015 the IRS suffered a data breach that led to the theft of over a hundred thousand taxpayers’ PII. Using quasi-identifying information stolen from multiple sources, the perpetrators were able to access an IRS website application by answering personal verification questions whose answers should have been known only to the taxpayers.

Many countries have adopted data protection laws to create guidelines for companies that gather, store, and share the personal information of clients. Some of the basic principles outlined by these laws state that certain sensitive information should not be collected except in extreme situations; that data should be deleted once it is no longer needed for its stated purpose; and that personal information should not be shared with parties that cannot guarantee its protection.