GDPR: Pseudonymization as an Alternative to Encryption

Michael Buckbee

Last updated October 20, 2021

Have I mentioned lately that the General Data Protection Regulation (GDPR) is a complicated law? Sure, there are some underlying principles, such as Privacy by Design (PbD) and other ideas, and once you understand them the whole thing makes more sense. But there are plenty of surprises when you delve into the legalese. For example, pseudonymization.

What’s that?

Before we explain pseudonymization, it’s important to understand that the EU GDPR covers only personal data. It’s what we in the US would call personally identifiable information (PII). Think names, addresses, phone numbers, account numbers, and more recently email and IP addresses.

What Personal Data Is and Isn’t

What if you remove these personal data identifiers from, say, a spreadsheet or report or some other file contents?

Then it's no longer personal data, and you’re freed from the GDPR's requirements (and fines), provided what's left can't be used to re-identify anyone.

Nor would you have to implement all the Privacy by Design principles (minimization, retention plans) and security safeguards for this non-personal data.

Your company’s intellectual property — software, business plans for world domination, and other IP — also doesn’t fall under the GDPR.

Yes, this means that if hackers steal the marketing plans for the next big product launch, you don’t have to report the incident to the local data protection authority (DPA).

Of course, a company would still want to take data protection measures — may we recommend an inside out approach? — for its content, but the GDPR doesn’t require it!

Encryption is a Non-solution

Back to personal data. Most companies can’t simply strip it from their content, although it’s not a bad idea to minimize the number of files where this personal data appears.

One way to deal with content that contains personal data and lessen some of the burdens of the GDPR is to encrypt it.

Under the GDPR, unlike the older Data Protection Directive, encrypting data does give you some benefits. It’s explicitly mentioned as a legitimate way to address the security of processing personal data—one of the law’s key requirements.

Companies that encrypt their personal data also gain the advantage of not having to notify data subjects in the case of a breach. (They still, though, would have to notify the local DPA.)

Is encryption a cheap trick allowing you to avoid some of the EU GDPR rules?

It’s not a bad trick and it does have some advantages, but it ain’t cheap.

In our Inside Out Security (IOS) philosophy, we consider encryption a possible but very impractical solution to securing file data. Simply put: wholesale encryption of files containing personal data would make it very difficult, or almost impossible, for employees to get their work done.

As we’ve been saying all along, the file system is where employees keep and share the content (spreadsheets, documents, presentations) that they’re working on now. It’s their virtual desk, and adding a layer of encryption is like moving things around and making that desk even sloppier (no one likes that!), as well as being administratively difficult to manage.

Pseudonymization: Replacing Identifiers With Codes

And this finally brings us to pseudonymization.

It’s a GDPR-approved technique for encoding personal data in order to reduce some of the burdens of this law.

The idea is to replace personal identifiers with a random code. It’s the same idea behind writers using pseudonyms to hide their identities. The GDPR says you can do this on a larger scale as a way to lessen some of its requirements.

Generally, there would have to be an intake system that would process the raw data identifiers and convert them to these special codes. And there would have to be a master table that maps the codes back into the real identifiers for those processes that need the original information.
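To make the intake-plus-master-table idea concrete, here's a minimal Python sketch. It's an illustration under my own assumptions, not a GDPR-blessed design: the `Pseudonymizer` class, the `pseudonymize_rows` helper, the `P-` code prefix, and the sample field names are all hypothetical.

```python
import secrets


class Pseudonymizer:
    """Hypothetical intake step: swaps real identifiers for random codes."""

    def __init__(self):
        self._code_by_identifier = {}  # real identifier -> code
        self._identifier_by_code = {}  # the "master table": code -> real identifier

    def pseudonymize(self, identifier: str) -> str:
        # Reuse an existing code so the same person always gets the same pseudonym.
        if identifier in self._code_by_identifier:
            return self._code_by_identifier[identifier]
        code = "P-" + secrets.token_hex(8)  # random, not derived from the identifier
        while code in self._identifier_by_code:  # collisions are extremely unlikely
            code = "P-" + secrets.token_hex(8)
        self._code_by_identifier[identifier] = code
        self._identifier_by_code[code] = identifier
        return code

    def reidentify(self, code: str) -> str:
        # Only processes that truly need the original data should get this far.
        return self._identifier_by_code[code]


def pseudonymize_rows(rows, identifier_fields, pseudonymizer):
    """Return copies of the rows with the named identifier fields replaced by codes."""
    return [
        {
            field: pseudonymizer.pseudonymize(value) if field in identifier_fields else value
            for field, value in row.items()
        }
        for row in rows
    ]


# Usage: the identifiers are hidden, the rest of the record stays readable.
intake = Pseudonymizer()
records = [{"name": "Alice Example", "email": "alice@example.com", "purchase": "laptop"}]
safe_records = pseudonymize_rows(records, {"name", "email"}, intake)
```

In a real system the master table would presumably live somewhere separate from the pseudonymized files, with much tighter access controls. The GDPR's own definition of pseudonymization (Article 4(5)) requires that the additional information needed for re-identification be kept separately.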

Using this approach, employees could then work with pseudonymized files in which the identities of the data subjects would be hidden. The rest of the file, of course, would be readable.

Partial encryption is maybe one way to think about this technique.

Like encryption, pseudonymization is considered a security protection measure (see Article 32), and it’s also explicitly mentioned as a “data protection by design and by default” or PbD technique (see Article 25). It’s also considered a personal data minimization technique, which is very important to the GDPR.

And for data breaches, it appears to this non-attorney that data subjects would not necessarily have to be notified if their pseudonymized data were stolen. Technically, it depends on whether there are enough quasi-identifiers in the stolen file (remember zip code, birth date, and gender) to allow the hackers to re-identify the data subject.
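One rough way to gauge that re-identification risk is to count how many records share each combination of quasi-identifiers. The sketch below is a simple k-anonymity-style check of my own choosing, not a test the GDPR prescribes; the `risky_rows` function, the threshold `k`, and the sample field names are assumptions for illustration.

```python
from collections import Counter


def risky_rows(rows, quasi_identifiers, k=2):
    """Return rows whose quasi-identifier combination appears fewer than k times."""

    def combo(row):
        return tuple(row[field] for field in quasi_identifiers)

    counts = Counter(combo(row) for row in rows)
    # A combination shared by fewer than k people makes those people easier to single out.
    return [row for row in rows if counts[combo(row)] < k]


# Usage: pseudonymized records that still carry zip code, birth date, and gender.
stolen_file = [
    {"id": "P-1", "zip": "10001", "birth_date": "1980-01-01", "gender": "F"},
    {"id": "P-2", "zip": "10001", "birth_date": "1980-01-01", "gender": "F"},
    {"id": "P-3", "zip": "94105", "birth_date": "1975-06-30", "gender": "M"},
]
exposed = risky_rows(stolen_file, ["zip", "birth_date", "gender"])
```

Here only the third record has a unique combination, so it's the one most at risk of being matched against outside data. A low count doesn't prove anyone can be re-identified, and a high one doesn't prove they can't, so treat this as a first-pass screen only.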

In any case, there are other advantages to pseudonyms. The folks at the IAPP, a non-profit for privacy pros that has written extensively on the GDPR, have a great blog post explaining more about it.

And I’ll be writing more about this topic in a future post.

Michael Buckbee Michael has worked as a sysadmin and software developer for Silicon Valley startups, the US Navy, and everything in between.