All posts by Andy Green

Data Security 2017: We’re All Hacked


Remember more innocent times back in early 2017? Before Petya, WannaCry, leaked NSA vulnerabilities, Equifax, and Uber, the state of data security was anything but rosy, but there were still more than a few of us — consumers and companies — who could say that a security incident had never directly affected them.

That changed after Equifax’s massive breach, which affected 145 million American adults — I was a victim — and a series of weaponized ransomware attacks that held corporate data hostage on a global scale.

Is there any major US company that hasn’t been affected by a breach?

Actually, ahem, no.

According to security researcher Mikko Hyppönen, all 500 of the Fortune 500 have been hacked. He didn’t offer evidence, but a cybersecurity research company has some tantalizing clues. DarkOwl scans the dark web for stolen PII and other data, and traces it back to the source. They have strong evidence that all of the Fortune 500 have had data exposed at some point.

We Had Been Warned

Looking over past IOS blog posts, especially from this last year, I see the current massive breach pandemic as completely expected.

Back in 2016, we spoke with Ken Munro, the UK’s leading IoT pen tester. After I got over the shock of learning that WiFi coffee makers and Internet-connected weighing scales actually exist, Munro explained that Security by Design is not really a prime directive for IoT gadget makers.

Or as he put it, “You’re making a big step there, which is assuming that the manufacturer gave any thought to an attack from a hacker at all.”

If you read a post from his company’s blog from October 2015 about hacking into an Internet-connected camera, you’ll see all the major ingredients of a now familiar pattern:

  1. Research a vulnerability or (incredibly careless) backdoor in an IoT gadget, router, or software;
  2. Take advantage of exposed external ports to scan for the suspect hardware or software;
  3. Enter the target system from the Internet and inject malware; and
  4. Hack the system, and then spread the malware in worm-like fashion.
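The scanning step (2 above) is trivially simple: nothing more than a TCP connect attempt per port. Here is a minimal sketch in Python for auditing your own hosts (the host and port list are illustrative; only probe machines you own):

```python
import socket

def exposed_ports(host, ports, timeout=1.0):
    """Return which of the given TCP ports accept a connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 when the TCP handshake succeeds
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Telnet (23) and SMB (445): the ports probed by Mirai and WannaCry respectively
print(exposed_ports("127.0.0.1", [23, 445]))
```

The point is that attackers need nothing fancier than this to find their targets at Internet scale.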

This attack pattern (with some variation) was used successfully in 2016 by Mirai, and in 2017 by Pinkslipbot and WannaCry.

WannaCry, though, introduced two new features not seen in classic IoT hacks: an exploit for an unreported vulnerability – aka EternalBlue – taken from the NSA’s top-secret TAO group and, of course, ransomware as the deadly payload.

Who could have anticipated that NSA code would make its way to the bad guys, who then used it for their evil attacks?

Someone was warning us about that as well!

In January 2014, Cindy and I heard crypto legend Bruce Schneier talk about data security post-Snowden. Schneier warned us that the NSA wouldn’t be able to keep its secrets and that eventually their code would leak or would be re-engineered by hackers. And that is exactly what happened with WannaCry.

Here are Schneier’s wise words:

“We know that technology democratizes. Today’s secret NSA program, becomes tomorrow’s PhD thesis, becomes the next day’s hacker tool.”

Schneier also noted that many of the NSA’s tricks are based on simply getting around cryptography and perimeter defenses. In short, the NSA hackers were very good at finding ways to exploit our bad habits in choosing weak passwords, not keeping patches up to date, or not changing default settings.

It ain’t advanced cryptography (or even rocket science).

In my recent chat with Wade Baker, the former Verizon DBIR lead, I was reminded of this KISS (keep it simple, stupid) principle, but he had the hard statistical evidence to back it up. Wade told me most attacks are not sophisticated, but take advantage of unforced user errors.

Unfortunately, even in 2017, companies are still learning how to play the game. If you want a prime example of a simple attack, you have only to look at 2017’s massive Equifax breach, which was the result of a well-known bug in the company’s Apache Struts installation that remained unpatched!

Weapons of Malware Destruction

Massive ransomware attacks were the big security story of 2017 — Petya, WannaCry, and NotPetya. By the way, we offered some practical advice on dealing with NotPetya, the Petya variant that was spread through a watering hole — downloaded from the website of a Ukrainian software company.

There are similarities among all of the aforementioned ransomware: all exploited EternalBlue and spread using either internal or open external ports. The end result was the same – encrypted files for which companies had to pay ransom in the form of some digital currency.

Ransomware viruses ain’t new either. Old timers may remember the AIDS Trojan, which was DOS-based ransomware spread by sneaker-net.

The big difference, of course, is that this current crop of ransomware can lock up entire file systems — not just individual C drives — and automatically spreads over the Internet or within an organization.

These are truly WMD – weapons of malware destruction. All the ingredients were in place, and it just took enterprising hackers to weaponize the ransomware.


One area of malware that I believe will continue to be a major headache for IT security is file-less PowerShell and FUD attacks. We wrote a few posts on both these topics in 2017.

Sure, there’s nothing new here either — file-less or malware-free hacking has been used by hackers for years. Some of the tools and techniques have been productized for, cough, pen testing purposes, and so it’s now far easier for anyone to get their hands on these gray tools.

The good news is that Microsoft has made it easier to log PowerShell script execution to spot abnormalities.
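Concretely, script block logging (PowerShell 5.0 and later) can be turned on with a single registry value. A sketch, assuming the setting isn’t already managed by Group Policy:

```ini
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging]
"EnableScriptBlockLogging"=dword:00000001
```

Once enabled, de-obfuscated script blocks are written to the Microsoft-Windows-PowerShell/Operational event log as event ID 4104 — which is what you would monitor for abnormalities.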

The whole topic of whitelisting apps has also picked up speed in recent years. We even tried our own experiments in disabling PowerShell using AppLocker’s whitelisting capabilities. Note: it ain’t easy.

Going forward, it looks like Windows 10 Device Guard offers some real promise in preventing rogue malware from running using whitelisting techniques.

The more important point, though, is that security researchers recognize that the hacker will get in, and the goal should be to make it harder for them to run their apps.

Whitelisting is just one aspect of mitigating threats post-exploitation.

Varonis Data Security Platform can help protect data on the inside and notify you when there’s been a breach. Learn more today!

[Video] Varonis GDPR Risk Assessment


Are you ready for GDPR? According to our survey of 500 IT and risk management decision makers, three out of four are facing serious challenges in achieving compliance when GDPR becomes effective on May 25, 2018. Varonis can help.

A good first step in preparing for GDPR is identifying where EU personal data resides in the file system, and then checking that access permissions are set appropriately. But wait: EU personal data identifiers span 28 member countries, encompassing different formats for license plate numbers, national ID cards, passport IDs, bank accounts, and more.

That’s where our GDPR Patterns can help! We’ve researched and hand-crafted over 150 GDPR classification expressions to help you discover the EU personal data in your systems, and analyze your exposure.
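To give a flavor of what a classification expression looks like, here is a simplified, illustrative regex for IBAN-style EU bank account numbers (not one of the actual GDPR Patterns; real IBAN validation also checks country-specific lengths and a mod-97 checksum):

```python
import re

# Simplified IBAN shape: 2-letter country code, 2 check digits,
# then 11-30 alphanumeric characters
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")

def find_ibans(text):
    """Return candidate IBANs found in a block of text."""
    return IBAN_RE.findall(text)

print(find_ibans("Wire to DE44500105175407324931 by Friday"))
```

Multiply this by the dozens of identifier formats across the EU and you get a sense of why hand-crafting and testing these expressions is real work.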

To learn more, watch this incredibly informative video and sign up today for our GDPR Risk Assessment.



Interview With Wade Baker: Verizon DBIR, Breach Costs, & Selling Boardrooms on Data Security

Wade Baker is best known for creating and leading the Verizon Data Breach Investigations Report (DBIR). Readers of this blog are familiar with the DBIR as our go-to resource for breach stats and other practical insights into data protection. So we were very excited to listen to Wade speak recently at the O’Reilly Data Security Conference.

In his new role as partner and co-founder of the Cyentia Institute, Wade presented some fascinating research on the disconnect between CISOs and the board of directors. In short: if you can’t relate data security spending back to the business, you won’t get a green-light on your project.

We took the next step and contacted Wade for an IOS interview. It was a great opportunity to tap into his deep background in data breach analysis, and our discussion ranged over the DBIR, breach costs, phishing, and what boards look for in security products. What follows is a transcript based on my phone interview with Wade last month.

Inside Out Security: The Verizon Data Breach Investigations Report (DBIR) has been incredibly useful to me in understanding the real-world threat environment. I know one of the first things that caught my attention was that — I think this is pretty much a trend for the last five or six years — external threats or hackers certainly far outweigh insiders.

Wade Baker: Yeah.

IOS: But you’ll see headlines that say just the opposite, the numbers flipped around —‘like 70% of attacks are caused by insiders’. I was wondering if you had any comments on that and perhaps other data points that should be emphasized more?

WB: The whole reason that we started doing the DBIR in the first place, before it was ever a report, is just simply…I was doing a lot of risk-assessment related consulting. And it always really bothered me that I would be trying to make a case, ‘Hey, pay attention to this,’ and I didn’t have much data to back it up.

But there wasn’t really much out there to help me say, ‘This thing on the list is a higher risk because it’s, you know, much more likely to happen than this other thing right here.’

Interesting Breach Statistics

WB: Anyone who’s done those lists knows there’s a bunch of things on this list. When we started doing that, it was kind of a simple notion of, ‘All right, let me find a place where that data might exist, forensic investigations, and I’ll decompose those cases and just start counting things.’

Attributes of incidents, and insiders versus outsiders is one I had always heard — like you said. Up until that point, 80% of all risk or 80% of all security incidents were insiders. And it’s one of those things that I almost considered doctrine at that time in the industry!

When we showed pretty much the exact opposite! This is the one stat that I think has made people the most upset out of my 10 years doing that report!

People would push back and kind of argue with things, but that is the one, like, claws came out on that one, like, ‘I can’t believe you’re saying this.’

There are some nuances there. For instance, when you study data breaches, then it does lean toward outsiders. Every single data set I ever looked at was weighted toward outsiders.

When you study all security incidents — no matter what severity, no matter what the outcome — then things do start leaning back toward insiders. Just when you consider all the mistakes and policy violations and, you know, just all that kind of junk.

Social attacks and phishing have been on the rise in recent years. (Source: Verizon DBIR)

IOS: Right, yes.

WB: I think defining terms is important, and one reason why there’s disagreement. Back to your question about other data points in the report that I love.

The ones that show the proportion of breaches that tie back to relatively simple attacks, which could have been thwarted by relatively cheap defenses or processes or technologies.

I think we tend to have this notion — maybe it’s just an excuse — that every attack is highly sophisticated and every fix is expensive. That’s just not the case!

The longer we believe those kind of things, I think we just sit back and don’t actually do the sometimes relatively simple stuff that needs to be done to address the real threat.

I love that one, and I also love the time to the detection. We threw that in there almost as a whim, just saying, ‘It seems like a good thing to measure about a breach.’

We wanted to see how long it takes, you know, from the time they start trying to link to it, and from the time they get inside to the time they find data, and from the time they find the data to exfiltrating it. Then of course how long it takes to detect it.

I think that was some of the more fascinating findings over the years, just concerning that.

IOS: I’m nodding my head about the time to discovery. Everything we’ve learned over the last couple of years seems to validate that. I think you said in one of your reports that the proper measurement unit is months. I mean, minimally weeks, but months. It seems to be verified by the bigger hacks we’ve heard about.

WB: I love it because many other people started publishing that same thing, and it was always months! So it was neat to watch that measurement vetted out over multiple different independent sources.

Breach Costs

IOS: I’m almost a little hesitant to get into this, but recently you started measuring breach cost based on proprietary insurance data. I’ve been following the controversy.

Could you just talk about it in general and maybe some of your own thoughts on the disparities we’ve been seeing in various research organizations?

WB: Yeah, that was something that for so long, because of where we got our information, it was hard to get all of the impact side out of a breach. Because you do a forensic investigation, you can collect really good info about how it happened, who did it, and that kind of thing, but it’s not so great six months or a year down the road.

You’re not still inside that company collecting data, so you don’t get to see the fallout unless it becomes very public (and sometimes it does).

We were able to study some costs — like the premier, top of line breach cost stats you always hear about from Ponemon.

IOS: Yes.

WB: And I’ve always had some issues with that, not to get into throwing shade or anything. The per record cost of a breach is not a linear type equation, but it’s treated like that.

What you get many times is something like an Equifax: 145 million records. You multiply that by $198 per record, get some outlandish cost, and you see that cost quoted in the headlines. It’s just not how it works!

There’s a decreasing cost per record as you get to larger breaches, which makes sense.

There are other factors there that are involved. For instance, I saw a study from RAND, by Sasha Romanosky, recently, where after throwing in predictors like company revenue and whether or not they’ve had a breach before — repeat offenders, so to speak — and some other factors, she really improves the cost prediction in the model.

I think those are the kinds of things we need to be looking at and trying to incorporate, because the number of records probably, at best, describes about a third … I don’t even know if it gets to half of the cost of the breach.

Breach costs do not have a linear relationship with data records! (Source: 2015 Verizon DBIR)
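The sublinear relationship is easy to see with a toy log-log cost model, the functional form typically fitted to breach-cost data (the intercept and slope below are invented for illustration — they are not the DBIR’s fitted values):

```python
import math

def breach_cost(records, intercept=2.0, slope=0.76):
    """Toy log-log model: log10(cost) = intercept + slope * log10(records)."""
    return 10 ** (intercept + slope * math.log10(records))

# Total cost grows with breach size, but cost per record falls
for n in (1_000, 1_000_000, 100_000_000):
    print(f"{n:>11,} records -> ${breach_cost(n) / n:.2f} per record")
```

With any slope below 1, each additional record adds less cost than the last, which is why multiplying a record count by a flat per-record figure produces outlandish headline numbers.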

IOS: I did look at some of these reports, and I’m a little skeptical about the number of records itself as a metric, because it’s hard to know, I think.

But if it’s something you do on a per incident basis, then the numbers look a little bit more comparable to Ponemon.

Do you think it’s a problem, looking at it on per record basis?

WB: First of all, I would like to step away from an average cost per record as a metric, just across the board. But tying cost to the number of records probably…I mean, it works better for, say, consumer data or payment card data or things like that, where the costs are highly associated with the number of people affected. You then get into the cost of credit monitoring and the notifications. All of those type things are certainly correlated to how many people or consumers are affected.

When you talk about IP or other types of data, there’s just almost no correlation. How do you count a single stolen document as a record? Do you count megabytes? Do you count documents?

Those things have highly varied value depending on all kinds of circumstances. It really falls down there.

What Boards Care About

IOS: I just want to get back to your O’Reilly talk. And one of the things that also resonated with me was the disconnect between the board and the CISOs who have to explain investments. And you talk about that disconnect.

I was looking at your blog and Cyber Balance Sheet reports, and you gave some examples of this — something that the CISO thinks is important, the board is just saying, ‘What?’

So I was wondering if you can mention one or two examples that would give some indication of this gap?

WB: The CISOs have been going to the board, probably for several rounds now, maybe years, presenting information and asking for more budget to get what they need to build a program that does the right things.

Pretty soon, many boards start asking, ‘When are we done? We spent money on security last month. Why are we doing it this quarter too?’

Security as a continual and sometimes increasing investment is different than a lot of other things that they look at. They think of, ‘Okay, we’re going to spend money on this project, get it done, and we’re going to have this value at the end of that.’

We can understand those things, but security is just not like that. I’ve seen this break down a lot with CISOs, who are coming from, ‘We need to do this project.’

You lay on top of all this that the board is not necessarily going to see the fruits of their investment in security! Because if it works, they don’t see anything bad at all.

Another problem that CISOs have is ‘how do I go to them when we haven’t had any bad things happen, and ask for more money?’ It’s just a conversation where you should be prepared to say why that is — connect these things to the business.

By doing these things, we’re enabling these pieces of the business to function properly. It’s a big problem, especially for more traditional boards that are clearly focused on driving revenue and other areas of the business.

IOS: Right. I’m just thinking out loud now … Is the board comparing it to physical security, where I’m assuming you make this initial investment in equipment, cameras, and recording and whatever, and then your costs, going forward, are mostly people or labor costs?

They probably are looking at it and saying, ‘Why am I spending more? Why am I buying more cameras or more modern equipment?’

WB: I think so! I’ve never done physical security, other than as a sideline to information security. Even if there are continuing costs, they live in that physical world. They can understand why, ‘Okay, we had a break-in last month, so we need to, I don’t know, add a guard gate or something like that.’ They get why and how that would help.

Whereas in the logical or cyber security world, they sometimes really don’t understand what you’re proposing, why it would work. If you don’t have their trust, they really start trying to poke holes. Then if you’re not ready to answer the question, things just kind of go downhill from there.

They’re not going to believe that the thing you’re proposing is actually going to fix the problem. That’s a challenge.

IOS: I remember you mentioning during your O’Reilly talk that helpful metaphors can be useful, but it has to be the right metaphor.

WB: Right.

IOS: I mean, getting back to the DBIR. In the last couple of years, there was an uptick in phishing. I think probably this should enter some of these conversations because it’s such an easy way for someone to get inside. For us at Varonis, we’ve been focused on ransomware lately, and there are also DDoS attacks as well.

Will these new attacks shift the board’s attention to something they can really understand — since these attacks actually disrupt operations?

WB: I think it can, because things like ransomware and DDoS are apparent just kind of in and of themselves. If they transpire, then it becomes obvious and there are bad outcomes.

Whereas more cloak-and-dagger stealing of intellectual property or siphoning a bunch of consumer data is not going to become apparent, or if it is, it’s months down the road, like we talked about earlier.

I think these things are attention-getters within a company, attention-getters from the headlines. I mean, from what I’ve heard over the past year, as this ransomware has been steadily increasing, it has definitely received the board’s attention!

I think it is a good hook to get in there and show them what you’re doing. And ransomware is a good one because it has a corporate aspect and a personal aspect.

You can talk to the board about, ‘Hey, you know, this applies to us as a company, but this is a threat to you in your laptop in your home as well. What about all those pictures that you have? Do you have those things backed up? What if they got on your data at home?’

And then walk through some of the steps and make it real. I think it’s an excellent opportunity for that. It’s not hype, it’s actually occurring and top of the list in many areas!

Contrary to conventional wisdom, corporate boards of directors understand the value of data protection. (Source: Cyber Balance Sheet)

IOS: This brings something else to mind. Yes, you could consider some of these breaches as a cost of doing business, but if you’re allowing an outsider to get access to all your files, I would think, high-level executives would be a little worried that they could find their emails. ‘Well, if they can get in and steal credit cards, then they can also get into my laptop.’

I would think that alone would get them curious!

WB: To be honest, I have found that most of the board members that I talk to, they are aware of security issues and breaches much more than they were five to ten years ago. That’s a good thing!

They might sit on boards of other companies, and with all the breach reporting we’ve had, the chance that a board member has been with a company that’s experienced a breach, or knows a buddy who has, is pretty good by now. So it’s a real problem in their mind!

But I think the issue, again, is how do you justify to them that the security program is making that less likely? And many of them are terrified of data breaches, to be honest.

Going back to that Cyber Balance Sheet report, I was surprised when we asked board members what is the biggest value that security provides — you know, kind of the inverse of your biggest fear? They all said preventing data breaches. And I would have thought they’d say, ‘Protect the brand,’ or ‘Drive down risk,’ or something like that. But they answered, ‘Prevent data breaches.’

It just shows you what’s at the top of their minds! They’re fearful of that and they don’t want it to happen. They just don’t have a high degree of trust that the security program will actually prevent one.

IOS: I have to say, when I first started at Varonis, some of these data breach stories were not making the front page of The New York Times or The Washington Post, and that certainly has changed. You can begin to understand the fear. Getting back to something you said earlier about how simple approaches, or as we call it, block-and-tackle, can prevent breaches.

Another way to mitigate the risk of these breaches is something that you’ve probably heard of, Privacy by Design, or Security by Design. One of the principles is just simply reduce the data that can cause the risk.

Don’t collect as much, don’t store as much, and delete it when it’s no longer used. Is that a good argument to the board?

WB: I do, and I think there are several approaches. I’ve given this recommendation fairly regularly, to be honest: minimize the data that you’re collecting. Because I think a lot of companies don’t need as much data as they’re collecting! It’s just easy and cheap to collect it these days, so why not?

Helping organizations understand that it is a risk decision, not just a cost decision, is important. And then of what you collect, how long do you retain it?

Because the longer you retain it and the more you collect, you’re sitting on a mountain of data, and you can become a target of criminals through that fact alone.

For the data that you do have and do need to retain … I’m a big fan of trying to consolidate it and not let it spread around the environment.

One of the metrics I like to propose is, ‘Okay, here’s the data that’s important to me. We need to protect it.’ Ask people where that lives or how many systems that should be stored on in the environment, and then go look for it.

You can multiply that number by 3 or 5, sometimes 10, and that’s the real answer! It’s a good metric to strive for: the number of target systems that that information should reside within. Many breaches come from areas where that data should not have been.

Security Risk Metrics

IOS: That leads to the next question about risk metrics. One we use at Varonis is the amount of PII data that has Windows permissions marked for Everyone. Customers are always surprised during assessments when they see how large it is.

This relates to stale data. It could be, you know, PII data that hasn’t been touched in a while. It’s sitting there, as you mentioned. No one’s looking at it, except the hackers who will get in and find it!

Are there other good risk metrics specifically related to data?

WB: Yup, I like those. You mentioned phishing a while ago. I like stats such as the number of employees that will click through, say, if you do a phishing test in the organization. I think that’s always kind of an eye-opening one because boards and others can realize that, ‘Oh, okay. That means we got a lot of people clicking, and there’s really no way we can get around that, so that forces us to do something else.’

I’m a fan of measuring things like number of systems compromised in any given time, and then the time that it takes to clean those up and drive those two metrics down, with a very focused effort over time, to minimize them. You mentioned people that have…or data that has Everyone access.

Varonis stats on loosely permissioned folders.

IOS: Yes.

WB: I always like to know, whether it’s a system or an environment or a scope, how many people have admin access! Because we’re highly over-privileged in most environments.

I’ve seen eyes pop, where people say, ‘What? We can’t possibly have that many people that have that level of need to know on…for that kind of thing.’ So, yeah, that’s a few off the top of my head.
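As a rough illustration of the ‘Everyone access’ metric discussed above, here is a sketch that walks a directory tree and flags anything granting permissions to ‘other’ users. This is a POSIX stand-in for illustration only; auditing the actual Windows Everyone ACE requires the Windows security APIs, and the share root shown is hypothetical:

```python
import os
import stat

def world_accessible(root):
    """Yield paths under root whose mode grants any permission to 'other'."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            if mode & (stat.S_IROTH | stat.S_IWOTH | stat.S_IXOTH):
                yield path

# Hypothetical share root; substitute a directory you actually audit
for path in world_accessible("/srv/share"):
    print(path)
```

Even a crude count like this, tracked over time, gives the board a concrete over-privilege number to drive down.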

IOS: Back to phishing. I interviewed Zinaida Benenson a couple months ago — she presented at Black Hat. She did some interesting research on phishing and click rates. Now, it’s true that she looked at college students, but the rates were astonishing. It was something like 40% clicking on obvious junk links in Facebook messages and about 20% in email spam.

She really feels that someone will click and it’s just almost impossible to prevent that in an organization. Maybe as you get a little older, you won’t click as much, but they will click.

WB: I’ve measured click rates at about 23%, 25%. So 20% to 25% in organizations. And not only in organizations, but organizations that paid to have phishing trials done. So I got that data from, you know, a company that provides us phishing tests.

You would think these would be the organizations that say, ‘Hey, we have a problem, I’m aware. I’m going to the doctor.’ Even among those, one in four are clicking. By the time an attacker sends 10 emails within the organization, there’s like a 99% rate that someone is going to click.
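Wade’s envelope math is a one-liner. Assuming each recipient clicks independently at a fixed rate (a simplifying assumption), the chance that at least one of n recipients clicks is 1 - (1 - p)^n:

```python
def p_at_least_one_click(p_click, n_emails):
    """Probability that at least one of n independent recipients clicks."""
    return 1 - (1 - p_click) ** n_emails

print(f"10 emails: {p_at_least_one_click(0.25, 10):.1%}")
print(f"17 emails: {p_at_least_one_click(0.25, 17):.1%}")
```

At a 25% click rate, ten emails already give roughly a 94% chance of at least one click, and seventeen push it past 99%.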

Students will click on obvious spammy links. (Source: Zinaida Benenson’s 2016 Black Hat presentation)

IOS: She had some interesting things to say about curiosity and feeling bold. Some people, when they’re in a good mood, they’ll click more.

I have one more question on my list …  about whether data breaches are a cost of business or are being treated as a cost of business.

WB: That’s a good one.

IOS: I had given an example of shrinkage in retail as a cost of business. Retailers just always assume that, say, there’s a 5% shrinkage. Or is security treated — I hope it will be treated — differently?

WB: As far as I can tell, we do not treat it like that. But I’ll be honest, I think treating it a little bit like that might not be a bad thing! In other words, there have been some studies that look at the losses due to breaches and incidents versus losses like shrinkage and other things that are just very, very common, and therefore we’re not as fearful of them.

Shrinkage takes many, many more…I can’t remember what the…but it was a couple orders of magnitude more, you know, for a typical retailer than data breaches.

We’re much more fearful of breaches, even at the board level. And I think that’s because they’re not as well understood and they’re a little bit newer and we haven’t been dealing with it.

When you’re going to have certain losses like that and they’re fairly well measured, you can draw a distribution around them and say that I’m 95% confident that my losses are going to be within this limit.

Then that gives you something definite to work with, and you can move on. I do wish we could get there with security, where we figure out that, ‘All right, I am prepared to lose this much.’

Yes, we may have a horrifying event that takes us out of that, and I don’t want to have that. We can handle this, and we handle that through these ways. I think that’s an important maturity thing that we need to get to. We just don’t have the data to get there quite yet.

IOS: I hear what you’re saying. But there’s just something about security and privacy that may be a little bit different …

WB: There is. There certainly is! The fact is that security has externalities. It’s not just affecting my company, like shrinkage, where I can absorb those dollars. But my failures may affect other people: my partners, consumers, and, if you’re in critical infrastructure, society. I mean, that makes a huge difference!

IOS: Wade, this has been an incredible discussion on topics that don’t get as much attention as they should.

Thanks for your insights.

WB: Thanks Andy. Enjoyed it!

Do Your GDPR Homework and Lower Your Chance of Fines


Advice that was helpful during your school days is also relevant when it comes to complying with the General Data Protection Regulation (GDPR): do your homework, because it counts for part of your grade! In the case of the GDPR, your homework assignments involve developing and implementing privacy by design measures, and making sure these policies are published and known by management.

Taking good notes and doing homework assignments came to my mind when reading the new guideline published last month on GDPR fines. Here’s what the EU regulators have to say:

Rather than being an obligation of goal, these provisions introduce obligations of means, that is, the controller must make the necessary assessments and reach the appropriate conclusions. The question that the supervisory authority must then answer is to what extent the controller “did what it could be expected to do” given the nature, the purposes or the size of the processing, seen in light of the obligations imposed on them by the Regulation.

The supervising authority referenced above is what we used to call the data protection authority or DPA, which is in charge of enforcing the GDPR in an EU country. So the supervising authority is supposed to ask the controller, EU-speak for the company collecting the data, whether they did their homework — “expected to do” — when determining fines involved in a GDPR complaint.

Teachers Know Best

There are other factors in this guideline that affect the level of fines, including the number of data subjects, the seriousness of the damage (“risks to rights and freedoms”), the categories of data that have been accessed, and willingness to cooperate and help the supervisory authority. You could argue that some of this is out of your control once the hackers have broken through the first level of defenses.

But what you can control is the effort your company has put into its security program to limit those security risks.

I’m also reminded of what Hogan Lovells’ privacy attorney Sue Foster told us during an interview about the importance of “showing your work”. In another school-related analogy, Foster said you can get “partial credit” if you can show regulators after an incident that you had security processes in place.

She also predicted we’d get more guidance, and that’s what the aforementioned document does: it explains the factors taken into account when issuing fines under the GDPR’s two-tiered system of either 2% or 4% of global revenue. Thanks Sue!
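To make the two tiers concrete, here’s a minimal Python sketch. It assumes the statutory caps in Article 83: up to €10 million or 2% of global annual turnover (whichever is higher) for the lower tier, and up to €20 million or 4% for the upper tier. The function name is my own invention.

```python
# Hedged sketch of the GDPR's two-tier fine caps (Article 83).
# Lower tier:  up to EUR 10M or 2% of global annual turnover, whichever is higher.
# Upper tier:  up to EUR 20M or 4% of global annual turnover, whichever is higher.

def max_gdpr_fine(global_turnover_eur: float, upper_tier: bool) -> float:
    """Return the maximum possible fine in euros for the given tier."""
    if upper_tier:
        return max(20_000_000, 0.04 * global_turnover_eur)
    return max(10_000_000, 0.02 * global_turnover_eur)

# A company with EUR 2 billion in global turnover:
print(max_gdpr_fine(2_000_000_000, upper_tier=False))  # 40000000.0 (2% of turnover)
print(max_gdpr_fine(2_000_000_000, upper_tier=True))   # 80000000.0 (4% of turnover)
```

Keep in mind these are only ceilings; regulators weigh the homework factors above when deciding where under the cap an actual fine lands.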

Existing Security Standards Count

The guideline also contains some very practical advice on compliance. Realizing that many companies already rely on existing data standards, such as ISO 27001, the EU regulators are willing to give some partial credit if you follow these standards.

… due account should be taken of any “best practice” procedures or methods where these exist and apply. Industry standards, as well as codes of conduct in the respective field or profession are important to take into account. Codes of practice might give indication of the level of knowledge about different means to address typical security issues associated with the processing.

For those who want to read the fine print in the GDPR, refer to article 40 (“Codes of Conduct”). In short, it says that standards associations can submit their security controls, say PCI DSS, to the European Data Protection Board (EDPB) for approval. If a controller then follows an officially approved “code of conduct”, this can dissuade the supervising authority from taking actions, including issuing fines, as long as the standards group — for example, the PCI Security Standards Council — has its own monitoring mechanism to check on compliance.

Based on this particular GDPR guideline, it will soon be the case that those who have done the homework of being PCI compliant will be in a better position to deal with EU regulators.

Certifiably GDPR

The GDPR, though, goes a step further. It leaves open a path to official certification of a controller’s data operations!

In effect, the supervising authorities have the power (through article 42) to certify a controller’s operations as GDPR compliant. The supervising authority can also accredit other standards organizations to issue these certifications.

In any case, the certifications will expire after three years, at which point the company will need to re-certify.

I should add that these certifications are entirely voluntary, but there are obvious benefits for many companies. The intent is to leverage the private sector’s existing data standards, and give companies a more practical approach to complying with the GDPR’s technical and administrative requirements.

The EDPB is also expected to develop certification marks and seals for consumers, as well as a registry of certified companies.

We’ll have to wait for more details to be published by the regulators on GDPR certification.

In the short term, companies that already have programs in place to comply with PCI DSS, ISO 27001, and other data security standards should potentially be in a better position with respect to GDPR fines.

And in the very near future, a “European Data Protection Seal” might just become a sought-after logo on company web sites.

Want to reduce your GDPR fines? Varonis helps support many different data security standards. Find out more!

[Podcast] Privacy Attorney Tiffany Li and AI Memory, Part II

This article is part of the series "[Podcast] Privacy Attorney Tiffany Li and AI Memory". Check out the rest:


Tiffany C. Li is an attorney and Resident Fellow at Yale Law School’s Information Society Project. She frequently writes and speaks on the privacy implications of artificial intelligence, virtual reality, and other technologies. Our discussion is based on her recent paper on the difficulties of getting AI to forget.

In this second part, we continue our discussion of GDPR and privacy, and examine ways to bridge the gap between tech and law. We then explore some cutting edge areas of intellectual property. Can AI algorithms own their creative efforts? Listen and learn.

[Podcast] Privacy Attorney Tiffany Li and AI Memory, Part I

This article is part of the series "[Podcast] Privacy Attorney Tiffany Li and AI Memory". Check out the rest:


Tiffany Li is an attorney and Resident Fellow at Yale Law School’s Information Society Project. She frequently writes about the privacy implications of artificial intelligence, virtual reality, and other disruptive technologies. We first learned about Tiffany after reading a paper by her and two colleagues on GDPR and the “right to be forgotten”. It’s an excellent introduction to the legal complexities of erasing memory from a machine intelligence.

In this first part of our discussion, we talk about GDPR’s “right to be forgotten” rule and its origins in a lawsuit brought against Google. Tiffany then explains how deleting personal data is more than just removing it from a folder or directory.

We learn that GDPR regulators haven’t yet addressed how to get AI algorithms to dynamically change their rules when the underlying data is erased. It’s a major hole in this new law’s requirements!

Click on the above link to learn more about what Tiffany has to say about the gap between law and technology.


IT Guide to the EU GDPR Breach Notification Rule


The General Data Protection Regulation (GDPR) is set to go into effect in a few months — May 25, 2018, to be exact. While the document is a great read for experienced data security attorneys, it would be nifty if we in the IT world got some practical advice on some of its murkier sections — say, the breach notification rule as spelled out in articles 33 and 34.

The GDPR’s 72-hour breach notification requirement is not in the current EU Directive, the law of the land since the mid-1990s. For many companies, meeting this tight reporting window will involve their IT departments stepping up their game.

With help from a few legal experts — thanks Sue Foster and Brett Cohen — I’ve also been pondering the language in the GDPR’s notification rule. The key question that’s not entirely answered by the GDPR legalese is the threshold for reporting in real-world scenarios.

For example, is a ransomware attack reportable to regulators? What about email addresses or online handles that are exposed by hackers?

Read on for the answers.

Personal Data Breach versus Reportable Breach

We finally have some solid guidance from the regulators. Last month, the EU regulators released some answers for the perplexed in a 30-page document of guidelines on breach notification — with bonus tables and flowcharts!

To refresh fading memories, the GDPR says that a personal data breach is a breach of security leading “to the accidental or unlawful destruction, loss, alteration, unauthorised disclosure of, or access to, personal data transmitted, stored or otherwise processed.”

This is fairly standard language found in any data privacy law — first define a data breach or other cybersecurity event. This is what you’re supposed to be protecting against — preventing these incidents!

There are also additional criteria for deciding when regulators and consumers have to be notified.

In short: not every data security breach requires an external notification!

This is not unusual in data security laws that have breach report requirements. HIPAA at the federal level for medical data and New York State’s innovative cyber rules for finance make these distinctions as well. It’s a way to prevent regulators from being swamped with breach reports.

In the case of the GDPR, breaches can only involve personal data, which is EU-speak for personally identifiable information or PII. If your company is under the GDPR and it experiences an exposure of top-secret diagrams involving a new invention, then it would not be considered a personal data breach and therefore not reportable. You can say the same for stolen proprietary software or other confidential documents.

Notifying the Regulators

Under the GDPR, when does a company or data controller have to report a personal data breach to the local supervising authority — what we used to call the local data protection authority or DPA under the old Directive?

This is spelled out in article 33, but it’s a little confusing if you don’t know the full context. In essence, a data controller reports a personal data breach — exposure, destruction, or loss of access — if the breach poses a risk to EU citizens’ “rights and freedoms”.

These rights and freedoms refer to more explicit property and privacy rights spelled out in the EU Charter of Fundamental Rights — kind of the EU Constitution.

I’ve read through the guidance, and just about everything you would intuitively consider a breach — exposure of sensitive personal data, theft of a device containing personal data, unauthorized access to personal data — would be reportable to regulators.

And it would have to be reported within 72 hours! It’s a little more nuanced than that, and you have some wiggle room, but I’ll get to that at the end of this post.

The only exception here is if the personal data is encrypted with state-of-the-art algorithms and the key itself is not compromised; in that case, the controller would not have to report it.

And a security breach that involves personal data, as defined by the EU GDPR, but that doesn’t reach the threshold of “risks to rights and freedoms”?

There’s still some paperwork you have to do!

Under the GDPR, every personal data breach must be recorded internally: “The controller shall document any personal data breaches, comprising the facts relating to the personal data breach”— see Article 33(5).

So the lost or stolen laptop that had encrypted personal data or perhaps an unauthorized access made by an employee — she saw some customer account numbers by accident because of a file permission glitch — doesn’t pose risks to rights and freedoms but it would still have to be documented.

There’s a good Venn diagram hidden in this post, but for now gaze upon the flowchart below.

Not as beautiful as a Venn diagram, but this flowchart on GDPR breach reporting will get you the answers. (Source: Article 29 Working Party)
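The flowchart’s logic can also be sketched in a few lines of Python — a simplified model for building intuition, not legal advice; the function name and risk labels are my own invention.

```python
def gdpr_notifications(is_personal_data, risk):
    """Simplified model of the Article 33/34 decision flow.

    risk: the controller's assessment of the risk to individuals'
    rights and freedoms -- 'none', 'risk', or 'high'.
    """
    duties = []
    if not is_personal_data:
        return duties  # e.g., stolen proprietary designs: no GDPR breach duties
    duties.append("document internally")  # Article 33(5): always required
    if risk in ("risk", "high"):
        duties.append("notify supervisory authority within 72 hours")  # Art. 33(1)
    if risk == "high":
        duties.append("notify affected individuals without undue delay")  # Art. 34(1)
    return duties

# Stolen laptop with well-encrypted personal data, key not compromised:
print(gdpr_notifications(True, "none"))  # ['document internally']
```

The key takeaway the code makes explicit: the internal documentation duty applies to every personal data breach, while the external notifications switch on only as the assessed risk rises.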

Let’s look at one more GDPR reporting threshold scenario involving availability or alteration of personal data.

Say EU personal data becomes unavailable due to a DDoS attack on part of a network, or perhaps it’s deleted by malware but there is a backup. In both cases you have a loss, albeit a temporary one — and it’s still a personal data breach by the GDPR’s definition.

Is this reportable to the supervising authority?

It depends.

If users can’t gain access to, say, their financial records for more than a brief period, maybe a day or two, then this would impact their rights and freedoms. This incident would have to be reported to the supervising authority.

Based on the notes in the guidance, there’s some room for interpreting what this brief period would be. You’ll still need, though, to document the incident and the decision making involved.

Breach Notification and Ransomware

Based on my chats with GDPR experts, I learned there was uncertainty even among the legal eagles whether a ransomware attack is reportable.

With the new guidance, we now have a clearer answer: the regulators actually take up ransomware scenarios in their analysis.

As we all know, ransomware encrypts corporate data, and you have to pay the extortionists, typically in Bitcoin, to decrypt the data and restore it to its plaintext form.

In the GDPR view, as I suggested above, ransomware attacks on personal data are considered a data loss. When does it cross the threshold and become a reportable data breach?

According to the examples they give, it would be reportable under two situations: 1) There is a backup of the personal data but the outage caused by the ransomware attack impacts users; or 2) There is no backup of the personal data.

In theory, a very short-lived ransomware attack in which the target recovers quickly is not reportable. In the real world, where analysis and recovery take significant time, most ransomware attacks would effectively be reportable.

Individual Reporting

The next level of reporting is a personal data breach in which there is a “high risk to the rights and freedoms” of individuals. These breaches have to be reported to the individuals themselves.

In terms of Venn diagrams and subsets, we can make the statement that every personal data breach that is individually reported also has to be reported to the supervising authority. (And yes, all Greeks are men).

When does a personal data breach reach the level of high risk?

Our intuition is helpful here, and the guidelines list as examples personal data breaches that involve medical or financial data (credit card or bank account numbers).

But there are other examples outside the health and banking context. If the personal data breach involves the names and addresses of a retailer’s customers who have requested delivery while on vacation, then that would be a high risk, and the individuals would have to be contacted.

A breach of contact information alone — name, address, email address, etc. — may not necessarily require notification. But it would require the supervising authority and individuals to be informed if a large number of individuals are affected! According to the guidelines, size does matter. So a Yahoo-level exposure of email addresses would lead to notifications.

The guidelines make the point that if this contact information includes other sensitive data — psychological, ethnic, etc. — then it would be reportable regardless of the number of individuals affected.

Note: a small breach of emails without other confidential information is not reportable. (Source: Article 29 Working Party)

Or if the contact information, email addresses say, is hacked from a children’s website, and the group is therefore particularly vulnerable, then this would constitute a high risk and require notification of the individuals involved.

Breach Notification in Phases

While the 72-hour GDPR breach notification rule was somewhat controversial, it’s actually more flexible once you read the fine print.

The first key point is that the clock starts ticking after the controller becomes aware of the personal data breach.

For example, suppose an organization detects a network intrusion from an attacker. The 72-hour window does not start at this point.

Then there’s an investigation to see if personal data was breached. The clock still doesn’t start. Only when the IT security team determines with reasonable certainty that there has been a personal data breach does the clock start!

When notifying the supervising authority, the data controller can do this in phases.

It is perfectly acceptable to notify the supervising authority initially, upon discovery (or the likelihood) of a personal data breach, and to tell them that more investigation is required to obtain the details — see Article 33(4). This process can take more than 72 hours and is allowed under the GDPR.

And if it turns out to be a false alarm, they can ask the supervising authority to cancel the notification.

For personal data breaches in which it is discovered there is a high risk to individuals, the notification to affected “data subjects” must be made without “undue delay” — see Article 34(1). The objective is to inform consumers about how they’ve been affected and what steps they need to take to protect themselves.

Notification Details

This leads to the final topic in this epic post: what do you tell the supervising authority and individuals?

For the supervising authority, here’s the actual language in Article 33:

  • Describe the nature of the personal data breach including where possible, the categories and approximate number of data subjects concerned and the categories and approximate number of personal data records concerned;
  • Communicate the name and contact details of the data protection officer or other contact point where more information can be obtained;
  • Describe the likely consequences of the personal data breach;
  • Describe the measures taken or proposed to be taken by the controller to address the personal data breach, including, where appropriate, measures to mitigate its possible adverse effects.

Note the requirement to provide details on the data categories and approximate number of records involved.

The supervising authority can, by the way, request additional information. The above list is the minimum that the controller has to provide.

When notifying individuals (see Article 34), the controller also has to offer the following:

  • a description of the nature of the breach;
  • the name and contact details of the data protection officer or other contact point;
  • a description of the likely consequences of the breach; and
  • a description of the measures taken or proposed to be taken by the controller to address the breach, including, where appropriate, measures to mitigate its possible adverse effects.

The GDPR prefers that the controller contact affected individuals directly – rather than through a media broadcast.  This can include email, SMS text, and snail mail.

For indirect mass communication, prominent banners on web sites, blog posts, or press releases will do fine.

The GDPR breach notification guidelines that were released last month run about 30 pages. As an IT person, you will not be able to fully appreciate all the subtleties on your own.

You will need an attorney — your corporate counsel, CPO, CLO, etc. — to understand what’s going on with this GDPR breach guideline and other related rules.

That leads nicely to this last thought: incident response to a breach requires combined efforts of IT, legal, communications, operations, and PR, usually at the C-level.

IT can’t do it alone.

The first step is to have an incident response plan.

A great resource for data security and privacy compliance is the International Association of Privacy Professionals (IAPP) website.

The IAPP also has an incident response toolkit put together by our attorney friends at Hogan Lovells. Check it out here.

Why A Honeypot Is Not A Comprehensive Security Solution

A core security principle, and perhaps one of the most important lessons you’ll learn as a security pro, is AHAT: “always have an audit trail”. Why? If you’re ever faced with a breach, you’ll at least know what, where, and when. And some laws and regulations require audit trails as well.

To assist, there’s a smorgasbord of tools to help you monitor devices, systems, apps, and logs. Since these tools monitor networks on a 24×7 basis, they generate thousands of log entries daily, often flooding admins with too much data. Beyond the reams of data, there are alerts raising red flags and flooding inboxes with SIEM and intrusion detection notifications.

I wonder if it just might be possible to miss the forest for the trees?

Yes, these tools did what they were told — find this and that and another thing, trigger an alert — but with a deluge of alerts, it’s hard to pinpoint and identify what’s important to investigate.

If everything is important to investigate, then nothing is important.

This is why honeypots became a beloved security tool: in some ways, they patch the shortcomings of your existing monitoring tools.

What is a Honeypot?

A honeypot is essentially bait (passwords, vulnerabilities, fake sensitive data) that’s intentionally made very tempting and accessible. The goal is to deceive and attract a hacker who attempts to gain unauthorized access to your network. The honeypot is in turn monitored by IT security. Anyone caught dipping their paws into the honeypot is generally assumed to be an intruder.

Advantages of a Honeypot

Before we get into why a honeypot shouldn’t be your organization’s only security solution, let’s highlight a few reasons why honeypots are a very effective security measure in IT — especially for learning more about who might be lurking in your environment.

With a honeypot, you can learn how the attacker entered the system and from where (e.g., the IP addresses the stolen data is going to and coming from), what’s being deleted or added (e.g., an attacker elevates his privileges to become an admin), the keystrokes of a person typing, and what malware is being used (e.g., a Trojan or rootkit added to the system).

Alerts worth investigating – As mentioned before, IT is often bombarded with thousands of alerts a day, with little or no distinction between high- and low-level risks and threats. Honeypots, by contrast, only log a few hundred events, making it easier for IT to manage and analyze them, act more quickly, and evict the intruder before further damage is done.

When it comes to honeypot alerts, beware of a different kind of false positive.

For instance, an attacker can create a diversion by spoofing your production systems so that they appear to attack the honeypot. Your honeypot would detect these spoofed attacks, steering your IT admins to investigate the wrong incident — that your production system was attacking your honeypot. Meanwhile, during this fake alert, the attacker could focus on a real attack. Yes, hackers are clever!

Alternative to prevent ransomware – If you don’t have an automated file monitoring system, you can instead create a honeypot with fake files and folders and then monitor it regularly as, say, an alternative way to catch ransomware. Hey, why not try our home-grown PowerShell-based file monitoring solution?

Sure, you’ll have to enable the file system’s native auditing. Keep in mind that doing so adds significant overhead to your systems. Instead, try this: prioritize and create an accessible file share that contains files that look normal or valuable, but in reality are fake.

Since no legitimate user activity, in theory, should be associated with a honeypot file share, any activity observed is more likely to be an intruder and should be treated as a high-level alert. After you’ve enabled native auditing to record access activity, you can then create a script to alert IT when events are written to the security event log (e.g., using dumpel.exe).
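As a rough illustration of the idea, here’s a toy Python polling script for a honeypot share. The share path is a placeholder, and note the limitation: comparing timestamps and sizes catches writes and newly created files, while catching pure read access still requires the native audit log described above.

```python
import os
import time

def snapshot(root):
    """Record (mtime, size) for every file under the honeypot share."""
    state = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime, st.st_size)
    return state

def changed_paths(before, after):
    """Paths that are new or whose timestamp/size changed between snapshots."""
    return [p for p, sig in after.items() if before.get(p) != sig]

def watch(root, poll_seconds=60, max_polls=None):
    """Poll the share and alert on any change; no legitimate activity is expected."""
    baseline = snapshot(root)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(poll_seconds)
        current = snapshot(root)
        for path in changed_paths(baseline, current):
            print(f"ALERT: honeypot file touched: {path}")  # escalate immediately
        baseline = current
        polls += 1

# watch(r"\\fileserver\finance-archive")  # hypothetical honeypot share
```

Because the share should see zero legitimate traffic, every hit from this script is worth investigating, which is exactly the low-noise property that makes honeypot alerts manageable.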

Potentially detect insider threats – Yes, it’s often assumed that any interaction with a honeypot is evidence that you’ve caught a hacker. After all, there’s no legitimate reason for anyone to be there.

Depending on the setup, though, employees who trigger the alerts should not automatically be presumed guilty. In a litigious world, users may argue that the employer violated their privacy because they didn’t give permission to cull their personal data from the honeypot.

Trust, but verify.

On the other end of the spectrum, malicious and/or disgruntled insiders can be difficult to spot, since they operate behind the firewall using the company’s account credentials and IP addresses.


An insider might never use or interact with a honeypot, in which case it would be of little value as a research tool. Also, honeypots won’t work if the insider is aware of the honeypot or somehow discovers it. The insider will know how to avoid it, and as a result won’t log or trigger any activity.

Decrypted data – Organizations are beginning to encrypt their data. After all, it’s suggested as a best practice and, for some, a compliance requirement. But technologies that protect our data, like encryption, can’t tell us what’s happening on our networks. That’s where honeypots are helpful: they capture activity because honeypots act as endpoints, where the activity is decrypted.

But, Honeypots Are Not A Panacea

Try security by design instead – Similar to penetration testing, honeypots are the opposite of security by design. Honeypots are typically installed after the system is ready, in order to learn more about your organization’s environment. It’s very much an educational exercise, where you bring machines in to tell you where you might be vulnerable.

A more proactive way of thinking about reducing risk and improving security is to conduct the testing before you release a product or new IT environment. Require the same of your IT environment as what you require of light bulbs, food, and buildings. That’s what security by design emphasizes: build security into every part of the IT management process, starting from the very beginning of the design phase.

UBA, a better way to detect outsiders, insiders and ransomware – Once outsiders enter through legitimate public ports (email, web, login) and gain access as users, they can carry out attacks that aren’t easy to monitor — attackers have gotten very clever at this.

In fact, to an IT admin who is just monitoring their system activity, the attackers appear as just another user.

That’s where User Behavioral Analytics (UBA) can be really useful, even more effective than a honeypot!

UBA really excels at handling the unknown. In the background, the UBA engine can baseline each user’s normal activity, then spot variances and report them in real time. In whatever form the threats reveal themselves — outsider? insider? ransomware? — they’ll be spotted. For instance, an IT admin can configure a rule to spot, say, thousands of “file modify” actions in a short time window.
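A bare-bones version of such a threshold rule might look like the Python sketch below — a toy illustration of the idea, not how any real UBA engine is implemented; the class name and numbers are invented.

```python
import time
from collections import deque

class FileModifyRule:
    """Toy UBA-style rule: alert when a single user generates more than
    `threshold` file-modify events within a sliding `window_seconds` window."""

    def __init__(self, threshold=1000, window_seconds=60):
        self.threshold = threshold
        self.window = window_seconds
        self.events = {}  # user -> deque of recent event timestamps

    def record(self, user, ts=None):
        """Record one event; return True if the user just tripped the rule."""
        ts = time.time() if ts is None else ts
        q = self.events.setdefault(user, deque())
        q.append(ts)
        while q and ts - q[0] > self.window:  # age out events past the window
            q.popleft()
        return len(q) > self.threshold

rule = FileModifyRule(threshold=5, window_seconds=10)
alerts = [rule.record("mallory", ts=t) for t in range(7)]  # 7 events in 7 seconds
print(alerts[-1])  # True: the sixth and seventh events trip the rule
```

A real UBA engine would learn each user’s baseline automatically rather than relying on a hand-set threshold, but the sliding-window counting is the same basic mechanic.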

Liable for damages if your honeypot gets hijacked – Yes, you’ll expect a honeypot to be probed and attacked, but you should also consider the potential for it to be exploited.

However, some honeypots introduce very little risk, such as low-interaction honeypots. They’re easy to install and aren’t really functioning operating systems that an attacker can operate on. They’re mostly idle, waiting for some kind of activity, and they capture very little information, only alerting you when someone visits your honeypot so that you can go observe the activity.

A high-interaction honeypot, by contrast, is much riskier. A real operating system, it has services, programs, and emails, and operates just like a real computer. It’s also more complicated to install and deploy, and it requires strategic placement: placed poorly, it could either increase the risk to your network as a whole, or no one would ever see it.

However, your high-interaction honeypot also captures more information — the IP address, in some cases the name of the individual, the type of attack, and how the attack was executed — so you ultimately learn how to better protect your network.

Keep in mind that instead of avoiding detection, an attacker can also feed fake information to a honeypot, leading the security community to make incorrect judgments and conclusions about the attacker.

Back to the hijack.

You will have a serious problem on your hands once a honeypot gets hijacked and used to attack, infiltrate, or harm other systems or organizations. Known as downstream liability, your organization could be held liable for damages.

You’ve been warned.

Be mindful of how you implement your honeypots, and choose your security solutions wisely. Honeypots are not a good substitute if what you really need is a system such as user behavioral analytics. Instead, honeypots add value by working alongside existing security solutions.


My Big Fat Data Breach Cost Post, Part III

This article is part of the series "My Big Fat Data Breach Cost Series". Check out the rest:

How much does a data breach cost a company? If you’ve been following this series, you’ll know that there’s a huge gap between Ponemon’s average cost-per-record numbers and the Verizon DBIR’s (as well as other researchers’). Verizon was intentionally provocative in its $.58 per record claim. However, Verizon’s more practical (and less newsworthy) results were based on a different model that derived average record costs more in line with Ponemon’s analysis.

The larger issue, as I’ve been preaching, is that a single average for a skewed data set — or more precisely, one that follows a power law — is not the best way to understand what’s going on. For a single number, the median, the value below which 50% of the data set lies, does a better job of summarizing it all.

Unfortunately, when we introduce averages based on record counts, the problem is made even worse. Long sigh.

Fake News: Ponemon vs. Verizon Controversy

In other words, there are monster breaches in the Verizon data (based on NetDiligence’s insurance claim data) at the far end of the tail that result in hundreds of millions of records — and therefore an enormous denominator in calculating the average.

I should have mentioned last time that Ponemon’s dataset is based on breaches of fewer than 100,000 records. Since cyber incidents involve some hefty fixed costs for consulting and forensics, you’ll inevitably have a higher average when dividing the incident cost by a smaller denominator.

In brief: Ponemon’s $201 vs. Verizon’s $.58 average cost per record is a made-up controversy comparing the extremes of this weird dataset.

As I showed, when we ignore record counts and use average incident costs we get better agreement between Verizon and Ponemon – about $6 million per breach.

There’s a “but”.

Since we’re dealing with power laws, a single average is not a good representation. Why? So much of the sample is found at the beginning of the tail, and the median — the incident cost below which 50% of the incidents lie — is not even close to the average!

My power law fueled analysis in the last post led to my amazing 3-tiered IOS Data Incident Cost Table©. I broke the fat-tailed dataset (based on NetDiligence’s numbers) into three smaller segments — Economy, Economy Plus, and Business Class —  to derive averages that are far more representative.

My Economy Class, which is based on 50% of the sample set, has an average incident cost of $1.4 million versus the overall average of $7.6 million. That’s an enormous difference! You can think of this average cost for 50% of the incidents as something like a hybrid of median and mean — it’s related to the creepy Lorenz curve from last time.
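A quick simulation shows why a single mean misleads for this kind of data. The distribution and parameters below are invented purely for illustration — they are not Ponemon’s or NetDiligence’s actual numbers.

```python
import random
import statistics

random.seed(42)
# Simulate incident costs with a heavy-tailed Pareto distribution:
# most incidents are small, a few are catastrophic.
costs = [200_000 * random.paretovariate(1.2) for _ in range(5_000)]

mean = statistics.mean(costs)
median = statistics.median(costs)
bottom_half_mean = statistics.mean(sorted(costs)[: len(costs) // 2])

print(f"mean cost:        ${mean:,.0f}")
print(f"median cost:      ${median:,.0f}")
print(f"bottom-50% mean:  ${bottom_half_mean:,.0f}")
# The mean sits far above the median: a handful of monster incidents
# in the tail dominate any single "average breach cost" figure.
```

The bottom-50% mean is the simulated analogue of the Economy Class figure above: an average computed only over the half of incidents most companies will actually experience.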

Ponemon and Pain

Let’s get back to the real world, and take another look at Ponemon’s survey. Their analysis is based on interviews with real people working for hundreds of companies worldwide.

Ponemon then calculates a total cost that takes into account direct expenses — credit monitoring for affected customers, forensic analysis — and fuzzier indirect costs, which can include extra employee hours and potential lost business.

These indirect costs are significant: in their 2015 survey, they represented almost 40% of the total cost of a breach!

As for the 100,000-record limit, Ponemon is well aware of this issue and warns that their average breach cost number should not be applied to large breaches. For example, Target’s 2014 data breach exposed the credit card numbers of over 40 million customers, which works out to a grand total of over $8 billion based on the Ponemon average. Target’s actual breach-related costs were far less.

Once you go deeper into the Ponemon reports, you’ll find some incredibly useful insights.

In the 2016 survey, they note that having an incident response team in place lowers data costs per record by $16; Data Loss Prevention (DLP) takes another $8 off; and data classification schemes lop off another $4.

Another interesting fact is that a large contributing factor to indirect costs is something called “churn”, which Ponemon defines as current customers who terminate their relationship as the result of loss of trust in the company after a breach.

Ponemon also estimates “diminished customer acquisition”, another indirect cost related to churn, which is the cost of lost future business because of damage to the brand.

These costs are based on Ponemon analysts reviewing internal corporate statistics and putting a “lifetime” value on a customer.

Feel the pain: Ponemon’s data on lost business.

Anyway, by comparing churn rates after a breach incident to historical averages, they can detect abnormal rates and then attribute the extra cost to the incident.
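A minimal sketch of that churn comparison, with invented monthly churn rates and a simple two-standard-deviation threshold standing in for whatever test Ponemon’s analysts actually use:

```python
# Hypothetical monthly churn rates; the idea (compare post-breach churn to the
# historical average) follows the Ponemon approach described above
historical = [0.021, 0.019, 0.022, 0.020, 0.018, 0.021]
post_breach = 0.034

mean = sum(historical) / len(historical)
stdev = (sum((r - mean) ** 2 for r in historical) / len(historical)) ** 0.5

# Flag churn more than 2 standard deviations above the historical mean
abnormal = post_breach > mean + 2 * stdev

# Under this toy model, the excess over the mean is attributed to the breach
excess_churn = post_breach - mean

print(f"abnormal: {abnormal}, excess churn: {excess_churn:.3f}")
```

Multiply that excess churn by the “lifetime” value Ponemon assigns to a customer and you get a dollar figure for breach-attributable lost business.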

Ponemon consolidated the business lost to churn, additional acquisition costs, and damage to “goodwill” into a bar chart (above), broken down by country. For the US, the average opportunity cost of a breach is close to $4 million.

With that in mind, it’s helpful to view the average cost per record breached as a measure of overall corporate pain.

What does that mean?

In addition to actual expenses, you can think of Ponemon’s average as also representing extra IT, legal, call center, and consultant person-days of work and emotional effort; additional attention focused on future product marketing and branding; and administrative and HR resources needed for dealing with personnel and morale issues after a breach.

All of these factors are worth considering when your organization plans its own breach response program!

Some Additional Thoughts

In our chats with security pros, attorneys, and even a small business owner who directly experienced a hacking, we learned first-hand that a breach incident is very disruptive.

It’s not just the “cost of doing business,” as some have argued. In recent years, we’ve seen several CEOs fired. More recently, with the Equifax breach, along with the C-suite leaving or “retiring,” the company’s very existence is being threatened through lawsuits.

There is something different about a data breach. Information on customers and executives, as well as corporate IP, can be leveraged in various creative and evil ways — identity theft attacks, blackmail, and competitive threats.

The direct upfront costs, though significant, may not reach the $100 to $200 per record range that shows up in the press, but a cyber attack resulting in a data exposure is still an expensive incident: as we saw above, over $1 million on average for most companies.

And for the longer term, Ponemon’s average cost numbers are the only measurement I know of that attempts to account for these unknowns.

It’s not necessarily a bad idea to be scared by Ponemon’s stats, and change your data security practices accordingly.


GDPR By Any Other Name: The UK’s New Data Protection Bill


Last month, the UK published the final version of a law to replace its current data security and privacy rules. For those who haven’t been following the Brexit drama now playing in London, the Data Protection Bill, or DPB, will allow UK businesses to continue to do business with the EU after the “divorce”.

The UK will have data rules that are effectively the same as the EU General Data Protection Regulation (GDPR), but it will be cleverly disguised as the DPB.  Jilted lovers, separations, false identities … sounds like a real-life Shakespearean comedy (or Mrs. Doubtfire).

For businesses that have to accommodate the changes, it’s anything but.

In the Short Term

As it currently stands, the UK is under the EU’s Data Protection Directive (DPD) through its 1998 Data Protection Act or DPA, which in EU-speak “transposes” or copies the DPD into a national law. Come May 2018, the UK will fall under the GDPR, whose goal is to harmonize all the separate national data security laws, like the UK’s DPA, into a single set of rules, and to put in place a more consistent enforcement structure.

Between May 2018 and whenever the UK government officially enacts the DPB, the GDPR will also be the data security and privacy law for the UK. The DPB is expected to become law before Brexit, which is scheduled to occur in March 2019.

Since the GDPR will soon be the data security and privacy law in the UK, replacing the DPA, organizations have been gearing up to meet the new rules – especially the right to erasure, 72-hour breach notification to authorities, and improved record keeping of processing activities. The DPB should, in theory, provide a relatively easy transition for UK businesses.

A Few Differences

As many commenters have pointed out (and to which I can personally attest), the DPB is not a simple piece of legislation — though you’d think it would be otherwise. The Bill starts with the premise that the GDPR rules apply to the UK, so it doesn’t even copy the actual text.

So what takes up the rest of this 200-page bill?

A good part is devoted to the exemptions, restrictions, and clarifications that are allowed by the GDPR, and the UK’s DPB takes full advantage of them in the fine print.

The core of the bill is found in Part 2, wherein these various tweaks — for personal data related to health, scientific research, criminal investigations, employee safety, and public interest — are laid out. The actual details — lawyers take note — are buried at the end of the DPB in a long section of “schedules”.

For example, GDPR articles related to the right to erasure, data rectification, and objection to processing don’t apply to investigations into, say, financial mismanagement or public servants misusing their office. In effect, the targets of an investigation lose control of their data.

The DPB is also complex because it contains a complete parallel set of GDPR-like security and privacy rules for law enforcement and national security services. The DPB actually transposes another EU directive, known as the EU Data Protection Law Enforcement Directive. There is also a long list of exceptions packed into even more schedules and tables at the end of the document.

While the goal of Brexit may have been to get out from under EU regulations, the Data Protection Bill essentially keeps the rules in place, and gives us a lot of abbreviations to keep track of.

Business Beware: ICO’s New Audit Powers

However, it doesn’t mean there aren’t any surprises in the new UK law.

The DPB grants regulators at the UK’s Information Commissioner’s Office (ICO) new investigative powers through “assessment notices”. These notices allow ICO staff to enter an organization, examine documents and equipment, and observe the processing of personal data. Effectively, UK regulators will have the ability to audit an organization’s data security compliance.

Under the existing DPA, the ICO can order these non-voluntary assessments only against government agencies, such as the NHS. The DPB expands mandatory data security auditing to the private sector.

If the ICO decides the organization is not meeting DPB compliance, these audits can lead to enforcement notices that point out the security shortcomings along with a schedule of when they should be corrected.

The actual teeth in the ICO’s enforcement is its power to issue fines of up to 4% of an organization’s worldwide revenue. It’s the same level of monetary penalties as in the GDPR.

In short: the DPB is the GDPR, and smells as sweet.

For UK companies (and UK-based multinationals) that already have security controls and procedures in place — based on recognized standards like ISO 27001 — the DPB’s rules should not be a difficult threshold to meet.

However, for companies that have neglected basic data governance practices, particularly for the enormous amounts of data that are found in corporate file systems, the DPB will come as a bit of a shock.

CSOs, CIOs, and CPOs in these organizations will have to ask this question: do we want to conduct our own assessments and improve data security or let the ICO do it for us?

I think the answer is pretty obvious!

The Right to Be Forgotten and AI


One (of the many) confusing aspects of the EU General Data Protection Regulation (GDPR) is its “right to be forgotten”. It’s related to the right to erasure but covers far more ground. The right to have your personal data deleted means that data held by the data controller must be removed on request by the consumer. The right to be forgotten refers more specifically to personal data the controller has made public on the Intertoobz.

Simple, right?

It ain’t ever that easy.

I came across a paper on this subject that takes a deeper look at the legal and technical issues around erasure and “forgetting”. We learn from the authors that deleting means something different when it comes to big data and artificial intelligence versus data held in a file system.

This paper contains great background on the recent history of the right to be forgotten, which is well worth your time.

Brief Summary of a Summary

Way back in 2010, a Mr. Costeja González brought a complaint against Google and a Spanish newspaper to Spain’s national Data Protection Authority (DPA). He noticed that when he entered his name into Google, the search results displayed a link to a newspaper article about a property sale made by Mr. González to resolve his personal debts.

The Spanish DPA dismissed the complaint against the newspaper, since the paper had a legal obligation to publish the property sale. However, the DPA allowed the one against Google to stand.

Google’s argument was that since it didn’t have a true presence in Spain – no physical servers in Spain held the data – and the data was processed outside the EU, it wasn’t under the EU Data Protection Directive (DPD).

Ultimately, the EU’s highest judicial body, the Court of Justice, said in its 2014 right to be forgotten ruling that: search engine companies are controllers; the DPD applies to companies that market their services in the EU (regardless of physical presence); and consumers have a right to request that search engine companies remove links that reference their personal information.

With the GDPR becoming EU law in May 2018 and replacing the DPD, the right to be forgotten is now enshrined in Article 17, and the extraterritorial scope of the decision can be found in Article 3.

However, what’s interesting about this case is that the original information about Mr. Gonzalez was never deleted — it still can be found if you search the online version of the newspaper.

So the “forgetting” part means, in practical terms, that a key or link to the personal information has been erased, but not the data itself.

Hold this thought.

Artificial Intelligence Is Like a Mini-Google

The second half of this paper starts with a very good computer science 101 look at what happens when data is deleted in software. For non-technical people, this part will be eye opening.

Technical types know that when an app is done with a data object and the memory is erased or “freed”, the data does not in fact magically disappear. Instead, the memory chunk is put on a “linked list” that will eventually be processed and then made part of available software memory to be reused.

When you delete data, it’s actually put on a “take out the garbage” list.

This procedure is known as garbage collection, and it allows performance-sensitive software to delay the CPU-intensive data disposal to a later point when the app is not as busy.

Machine learning uses large data sets to train the software and derive decision-making rules. The software is continually allocating and deleting data, often personal data, which at any given moment might be sitting on a garbage collection queue waiting to be disposed of.
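A toy Python version of that “take out the garbage” list: freed records sit intact on a queue until a collection pass finally scrubs them. (Real garbage collectors work at the memory-manager level, of course; this is just an illustration of the delay.)

```python
from collections import deque

# Toy sketch of deferred disposal: "freed" records go on a queue and are
# only scrubbed later, when the app has idle time
garbage = deque()

def free(record):
    garbage.append(record)   # data still intact, just unreachable via the app

def collect():               # run during idle time
    while garbage:
        record = garbage.popleft()
        record.clear()       # only now is the personal data actually destroyed

user = {"name": "M. Gonzalez", "debt_notice": "..."}
free(user)
assert user["name"] == "M. Gonzalez"  # "deleted", but the bytes are still there
collect()
assert user == {}                     # gone only after collection runs
```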

What does it mean then to implement right to be forgotten in an AI or big data app?

The authors of the paper make the point that eliminating a single data point is not likely to affect the AI software’s rules. Fair enough. But certainly if tens or hundreds of thousands of people exercise their right to erasure under the GDPR, then you’d expect some of these rules to shift.

They also note that data can be disguised through certain anonymity techniques or pseudonymization as a way to avoid storing identifiable data, thereby getting around the right to be forgotten. Some of these anonymity techniques involve adding “noise”, which may affect the accuracy of the rules.
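For illustration, here are toy versions of both disguising techniques: pseudonymization via a keyed hash, and noise added to a numeric value. The secret key and the noise scale are arbitrary assumptions, not anything from the paper.

```python
import hashlib
import random

SECRET = b"rotate-me"  # hypothetical key, held separately from the data

def pseudonymize(name):
    # Same input + same key always yields the same pseudonym, so records
    # can still be linked -- without storing the identifier itself
    return hashlib.sha256(SECRET + name.encode()).hexdigest()[:12]

def add_noise(value, scale=5.0):
    # Gaussian noise: more privacy, less accuracy in any rules derived later
    return value + random.gauss(0, scale)

record = {"id": pseudonymize("M. Gonzalez"), "debt": add_noise(12_000)}
print(record)
```

Note the trade-off the authors flag: if the key is destroyed, the pseudonyms become effectively anonymous; if it leaks, the data is re-identifiable.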

This leads to an approach to implementing right to be forgotten for AI that we alluded to above: perhaps one way to forget is to make it impossible to access the original data!

A garbage collection process does this by putting the memory in a separate queue that makes it unavailable to the rest of the software — the software’s “handle” to the memory no longer grants access. Google does the same thing by removing the website URL from its internal index.

In both cases, the data is still there but effectively unavailable.

The Memory Key

The underlying idea behind AI forgetting is that you remove or delete the key that allows access to the data.
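Here’s a toy sketch of that key removal, loosely modeled on the Google case: drop the index entry and the record becomes unreachable, even though the underlying data was never erased. The store and index here are just illustrative dicts.

```python
# Data store and search index as illustrative dicts; "forgetting" removes
# only the key, much like Google dropping a URL from its search index
data_store = {0xA1: "article about M. Gonzalez's property sale"}
search_index = {"gonzalez": 0xA1}

def forget(term):
    search_index.pop(term, None)  # drop the key, leave the data untouched

forget("gonzalez")
assert "gonzalez" not in search_index  # no longer findable
assert 0xA1 in data_store              # but the data itself was never erased
```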

This paper ends by suggesting that we’ll need to explore more practical (and economic) ways to handle right to be forgotten for big data apps.

Losing the key is one idea. There are additional methods: for example, breaking up the personal data into smaller sets (or siloing them) so that it is impossible or extremely difficult to re-identify any separate set.

Sure, removing personal data from a file system is not necessarily easy, but it’s certainly solvable with the right products!

Agreed: AI forgetting involves additional complexity and solutions to the problem will differ from file deletion. It’s possible we’ll see some new erasure-like technologies in the AI area as well.

In the meantime, we’ll likely receive more guidance from EU regulators on what it means to forget for big data applications. We’ll keep you posted!