What Is Personally Identifiable Information (PII)?
Personally identifiable information (PII) is information that, when used alone or with other relevant data, can identify an individual. PII may contain direct identifiers (e.g., passport information) that can identify a person uniquely, or quasi-identifiers (e.g., race) that can be combined with other quasi-identifiers (e.g., date of birth) to successfully recognize an individual.
Understanding Personally Identifiable Information (PII)
Advancing technology platforms have changed the way businesses operate, governments legislate, and individuals relate. With digital tools like cell phones, the Internet, e-commerce, and social media, there has been an explosion in the supply of all kinds of data.
Big data, as it is called, is being collected, analyzed, and processed by businesses and shared with other companies. The wealth of information provided by big data has enabled companies to gain insight into how to better interact with customers.
However, the emergence of big data has also increased the number of data breaches and cyber attacks by entities who realize the value of this information. As a result, concerns have been raised over how companies handle the sensitive information of their consumers. Regulatory bodies are seeking new laws to protect the data of consumers, while users are looking for more anonymous ways to stay digital.
Key Takeaways
- Personally identifiable information (PII) is information that, when used alone or with other relevant data, can identify an individual.
- Sensitive personally identifiable information can include your full name, Social Security Number, driver’s license, financial information, and medical records.
- Non-sensitive personally identifiable information is easily accessible from public sources and can include your zip code, race, gender, and date of birth.
Sensitive vs. Non-Sensitive PII
Personally identifiable information (PII) can be sensitive or non-sensitive. Sensitive personal information includes details such as:
- Full name
- Social Security Number (SSN)
- Driver’s license
- Mailing address
- Credit card information
- Passport information
- Financial information
- Medical records
The above list is by no means exhaustive. Companies that share data about their clients normally use anonymization techniques to encrypt and obfuscate the PII, so it is received in a non-personally identifiable form. An insurance company that shares its clients’ information with a marketing company will mask the sensitive PII included in the data and leave only information related to the marketing company’s goal.
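The masking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the `SENSITIVE_FIELDS` set, the `mask_record` helper, and the `customer_token` output field are all hypothetical, and the keyed hash shown here is pseudonymization rather than full anonymization, since whoever holds the key can still link a token back to a person.

```python
import hashlib
import hmac

# Hypothetical field names; which fields count as sensitive is an assumption.
SENSITIVE_FIELDS = {"full_name", "ssn", "mailing_address", "credit_card"}


def mask_record(record, secret_key, keep_fields):
    """Return a copy of `record` holding only the non-sensitive fields the
    recipient needs, plus a keyed-hash token in place of the SSN."""
    # Keep a field only if it was requested AND it is not sensitive.
    shared = {
        field: value
        for field, value in record.items()
        if field in keep_fields and field not in SENSITIVE_FIELDS
    }
    # HMAC-SHA256 yields a stable token, so the recipient can tell that two
    # rows belong to the same client without ever seeing the SSN itself.
    shared["customer_token"] = hmac.new(
        secret_key, record["ssn"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return shared


# Made-up example record, not real data.
client = {
    "full_name": "Jane Example",
    "ssn": "000-00-0000",
    "mailing_address": "1 Sample Street",
    "zip_code": "00000",
    "policy_type": "auto",
}
print(mask_record(client, b"demo-key", {"zip_code", "policy_type"}))
```

In this sketch the marketing company would receive only the zip code, the policy type, and the token. The secret key stays with the insurer.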
Non-sensitive or indirect PII is easily accessible from public sources like phonebooks, the Internet, and corporate directories. Examples of non-sensitive or indirect PII include:
- Zip code
- Date of birth
- Place of birth
The above list contains quasi-identifiers and examples of non-sensitive information that can be released to the public. This type of information cannot be used alone to determine an individual’s identity.
However, non-sensitive information, although not delicate, is linkable. This means that non-sensitive data, when used with other personal linkable information, can reveal the identity of an individual. De-anonymization and re-identification techniques tend to be successful when multiple sets of quasi-identifiers are pieced together and can be used to distinguish one person from another.
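To make linkability concrete, here is a minimal Python sketch of a re-identification attempt. It assumes two made-up datasets: an "anonymized" one with names removed and a public one that still carries names. The `reidentify` function, the column names, and the choice of zip code, date of birth, and gender as the linking keys are illustrative assumptions, not a description of any particular real-world attack.

```python
from collections import defaultdict

# Assumed quasi-identifier columns shared by both datasets.
QUASI_IDENTIFIERS = ("zip_code", "date_of_birth", "gender")


def reidentify(anonymized_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Map the index of each anonymized row to a name when exactly one
    public row shares its combination of quasi-identifiers."""
    names_by_combo = defaultdict(list)
    for row in public_rows:
        names_by_combo[tuple(row[k] for k in keys)].append(row["name"])

    matches = {}
    for i, row in enumerate(anonymized_rows):
        candidates = names_by_combo.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique combination pins down one person
            matches[i] = candidates[0]
    return matches


# Made-up data for illustration only.
anonymized = [
    {"zip_code": "02138", "date_of_birth": "1980-05-01", "gender": "F", "diagnosis": "asthma"},
    {"zip_code": "02139", "date_of_birth": "1975-07-12", "gender": "M", "diagnosis": "flu"},
]
public = [
    {"name": "Alice Example", "zip_code": "02138", "date_of_birth": "1980-05-01", "gender": "F"},
    {"name": "Bob Sample", "zip_code": "02139", "date_of_birth": "1975-07-12", "gender": "M"},
    {"name": "Carl Sample", "zip_code": "02139", "date_of_birth": "1975-07-12", "gender": "M"},
]
print(reidentify(anonymized, public))
```

Only the first anonymized row is matched, because its combination of values is unique in the public data. The second row is shared by two people, so it stays ambiguous. Each extra quasi-identifier added to the key shrinks these groups and makes unique matches more likely.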
Many countries have adopted data protection laws that set guidelines for companies that gather, store, and share the personal information of clients. One basic principle in these laws is that certain sensitive information should not be collected except in exceptional circumstances.
Also, regulatory guidelines stipulate that data should be deleted if no longer needed for its stated purpose, and personal information should not be shared with sources that cannot guarantee its protection.
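The deletion principle above can be expressed as a simple retention check. The sketch below is a hypothetical illustration: the purposes, the retention periods in `RETENTION`, and the record layout are assumptions chosen for the example, not values taken from any specific law.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention period for each stated purpose.
RETENTION = {
    "billing": timedelta(days=365 * 7),
    "marketing": timedelta(days=365),
}


def purge_expired(records, now):
    """Return only the records still within the retention period for their
    stated purpose. Records with an unrecognized purpose are dropped."""
    kept = []
    for rec in records:
        limit = RETENTION.get(rec["purpose"])
        if limit is not None and now - rec["collected_at"] <= limit:
            kept.append(rec)
    return kept


# Made-up records for illustration only.
records = [
    {"id": 1, "purpose": "marketing", "collected_at": datetime(2022, 6, 1, tzinfo=timezone.utc)},
    {"id": 2, "purpose": "billing", "collected_at": datetime(2020, 1, 1, tzinfo=timezone.utc)},
    {"id": 3, "purpose": "survey", "collected_at": datetime(2023, 12, 1, tzinfo=timezone.utc)},
]
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print([rec["id"] for rec in purge_expired(records, now)])
```

Dropping records whose purpose is not listed is a deliberate choice in this sketch: data with no stated purpose has no basis for being kept.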
Cybercriminals breach data systems to access PII, which is then sold to willing buyers in underground digital marketplaces. For example, in 2015, the IRS suffered a data breach leading to the theft of more than a hundred thousand taxpayers’ PII. Using personal information stolen from multiple sources, the perpetrators were able to access an IRS website application by answering personal verification questions whose answers should have been known only to the taxpayers.
Regulating and safeguarding personally identifiable information will likely be a dominant issue for individuals, corporations, and governments in the years to come.
PII Around the World
The definition of what constitutes PII differs depending on where you live in the world. In the United States, the government defined "personally identifiable" in 2007 as anything that can "be used to distinguish or trace an individual's identity," such as name, SSN, or biometric information, either alone or combined with other identifiers such as date of birth or place of birth.
In the European Union (EU), the definition expands to include quasi-identifiers, as outlined in the General Data Protection Regulation (GDPR), which went into effect in May 2018. The GDPR is a legal framework that sets rules for collecting and processing the personal information of those residing in the EU.
Example of PII
In early 2018, Facebook Inc. (FB) was embroiled in a major data breach. The profiles of 50 million Facebook users were collected without their consent by an outside company called Cambridge Analytica, as reported by The Guardian.
Cambridge Analytica got its data from Facebook through a researcher who worked at the University of Cambridge. The researcher built a Facebook app that was a personality quiz. An app is a software application used on mobile devices and websites.
The app was designed to take the information from those who volunteered to give access to their data for the quiz. Unfortunately, the app collected not only the quiz takers' data but, because of a loophole in Facebook's system, was able to also collect data from the friends and family members of the quiz takers.
As a result, over 50 million Facebook users had their data exposed to Cambridge Analytica without their consent. Although Facebook prohibited the sale of user data, Cambridge Analytica turned around and sold the data for use in political consulting.
Mark Zuckerberg, Facebook's founder and CEO, released a statement in the company's Q1-2019 earnings release:
We are focused on building out our privacy-focused vision for the future of social networking and working collaboratively to address important issues around the Internet.
The data breach affected not only Facebook users but investors as well. Facebook's profits decreased by 50% in Q1-2019 versus the same period a year earlier. The company accrued $3 billion in legal expenses and said its earnings per share would have been $1.04 higher without them, stating:
We estimate that the range of loss in this matter is $3.0 billion to $5.0 billion. The matter remains unresolved, and there can be no assurance as to the timing or the terms of any final outcome.
Companies will undoubtedly invest in ways to harvest data such as personally identifiable information to offer products to consumers and maximize profits. However, regulating and safeguarding PII will likely be a dominant issue in the years to come.