(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance. You can find the whole series here.)
Data classification is important because it helps us figure out where the most important data sits, but it shouldn't be a goal on its own. Just understanding what data is sensitive isn't enough to protect it. You need to understand how it's being used, including who has access, who's using it, and who it belongs to. You need context around the data in order to really begin to protect it. Rob Sobers recently put together a white paper on the importance of enterprise context awareness, which is worth a read and offers some great background on this topic.
Step 2: Identify Data That’s Most at Risk
The first step in our plan was to figure out what's most valuable by defining criteria that describe likely valuable data (e.g. content, access activity, accessibility) and then using automation to identify where data matching those criteria exists in the environment. This is basically what we're doing with DLP data at rest, if you recall. But just scanning for sensitive data isn't enough to fix any problems, a point I'd like to illustrate by relaying a conversation I had with a customer last year. They were a mid-size educational institution of about 15,000 users and had just implemented data classification through a DLP tool. The scan took a fair amount of time, and at the end they'd identified some 193,000 violations, or instances of a file containing possibly sensitive information. What the CISO told me was, "Yesterday I had one problem: where's the sensitive data? Today I have 193,000 problems."
It was a really concise way to summarize the problem: just finding data doesn’t really get you much. You already knew there was a lot of it out there, but knowing where it is doesn’t actually fix the problem. The goal is to restrict access to just those who need it and then monitor access so none of it is lost. To do that, you need context, and that means learning more about the data.
Since Varonis can synthesize multiple types of metadata, the next step in our methodology is to identify exactly which data is most at risk. Which of the folders that contain those 193,000 files need to be fixed immediately?
To answer that question, Varonis combines data classification results, whether from our own scanning engine or from a DLP or other classification product, with the other metadata we have available: permissions, access activity, and the user and group information from directory services. Which should be higher on your triage list for access control cleanup: a folder containing 40 credit card numbers that's open to 20 people but never touched, or a folder open to the Everyone group with 300 credit card numbers that's constantly accessed? The latter represents a much greater risk to the organization, since looser permissions and a higher level of activity make that data far more likely to be deleted, stolen, or otherwise misused. By the way, Varonis has a built-in report for this, and it's usually one of the first reports our customers review when they evaluate the product.
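The triage logic above can be sketched as a simple scoring function. To be clear, this is a hypothetical illustration of the idea, not Varonis's actual algorithm: the folder records, field names, and weighting are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Folder:
    path: str
    sensitive_hits: int      # classification matches, e.g. credit card numbers
    users_with_access: int   # effective number of users who can read the folder
    events_last_30d: int     # access events observed by auditing

def risk_score(f: Folder) -> float:
    # Risk is highest when sensitive content, broad access, and real
    # activity coincide; the +1 keeps untouched folders from scoring zero
    # outright, since exposure alone is still some risk.
    return f.sensitive_hits * f.users_with_access * (1 + f.events_last_30d)

folders = [
    Folder("\\\\fs1\\finance\\archive", sensitive_hits=40,
           users_with_access=20, events_last_30d=0),
    Folder("\\\\fs1\\shared\\exports", sensitive_hits=300,
           users_with_access=15000, events_last_30d=2200),
]

# Triage: remediate the highest-scoring folders first.
for f in sorted(folders, key=risk_score, reverse=True):
    print(f"{f.path}: {risk_score(f):,.0f}")
```

Under this toy weighting, the heavily used folder open to everyone dwarfs the quiet, narrowly shared one, which matches the intuition in the example above.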
It's not always just about sensitive data, either. Many of our customers simply want to clean up permissions, sensitive or not. We've been hearing a lot about "open share" projects and the like lately, and they amount to the same thing: find shared data that's at risk, then remediate it. DatAdvantage also has reports that help you identify where folders are open to global access groups like Everyone, Domain Users, and Authenticated Users, as well as who is accessing data via these groups. The question is similar to the one above: which is more important, data open to Everyone that nobody uses, or data open to Everyone that lots of people are using (and who wouldn't have access otherwise)? Varonis can point all of that out with built-in reports.
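The open-share idea boils down to two filters: flag folders whose ACL grants access to a global group, then split those by whether anyone is actually using that access. A minimal sketch, with the group names taken from the well-known Windows groups mentioned above and the folder data invented for illustration:

```python
# Well-known global access groups that indicate an "open" folder.
GLOBAL_GROUPS = {"Everyone", "Domain Users", "Authenticated Users"}

# Hypothetical inventory: (folder path, ACL trustees, active users in 90 days)
acls = [
    ("\\\\fs1\\public\\templates", {"Everyone"}, 340),
    ("\\\\fs1\\hr\\old-reviews", {"Authenticated Users", "HR-Staff"}, 0),
    ("\\\\fs1\\eng\\specs", {"Eng-Team"}, 55),
]

# A folder is "open" if any trustee is a global group.
open_folders = [(path, users) for path, trustees, users in acls
                if trustees & GLOBAL_GROUPS]

# Actively used open folders are the riskier remediation targets; stale
# ones can usually be locked down with little disruption to anyone.
active = [path for path, users in open_folders if users > 0]
stale = [path for path, users in open_folders if users == 0]
```

Splitting the results this way mirrors the prioritization question in the text: the actively used open folders go to the top of the cleanup list.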
Next, we’ll look at how to go about fixing these problems.
Image credit: jurvetson