(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance. You can find the whole series here.)
In a single terabyte of data there are typically around 50,000 folders or containers, about 5% of which have unique permissions. If IT were to set a goal of assigning an owner for every unique ACL, they’d need to locate owners for 2,500 folders. That’s quite daunting. And most organizations aren’t dealing with a single terabyte of data; in fact, many enterprise installations we encounter are dealing with multiple petabytes of unstructured data. Clearly we need a more surgical approach to assign owners.
Varonis tackled this problem with a longtime customer who needed to identify and assign owners for more than 200 terabytes of CIFS data on their fleet of NetApp filers. There were about 40,000 users in the company, approximately 3,000 of whom (as it turned out) needed to be designated as owners for some data.
When we started taking a close look at specific folders, we discovered that many of them (especially at the top of the hierarchy) simply didn’t need an owner; the only users who could read or write data, according to the ACL, were either service accounts or administrative/IT accounts.
What we needed was a methodology for locating the folders where business users had access and a way to identify the likely owner for just those folders. So that’s what we built.
The logic went like this:
- Identify the topmost unique ACL in a tree where business users have access.
- If that ACL’s permissions allow write access to users outside of IT, it’s considered a “demarcation point.”
- For what’s left, identify higher-level demarcation points where non-IT users can only read data.
- For each demarcation point, identify the most active users.
- Correlate active users with other metadata, such as department name, payroll code, managed by, etc.
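To make the first three steps concrete, here is a minimal Python sketch of one way to read them. It is not the DatAdvantage implementation. The `Folder` class, the `it_principals` set, and the simplified "read"/"write" rights are all assumptions made for illustration. The walk stops at the topmost unique ACL that grants any access to a non-IT principal and labels that folder as a write or read-only demarcation point.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    # Hypothetical model of a folder; not a DatAdvantage data structure.
    path: str
    acl: dict                  # principal name -> set of rights, e.g. {"read", "write"}
    unique_acl: bool = True    # False means the folder only inherits its parent's ACL
    children: list = field(default_factory=list)


def business_rights(folder, it_principals):
    """Union of the rights granted to principals that are not IT or service accounts."""
    rights = set()
    for principal, granted in folder.acl.items():
        if principal not in it_principals:
            rights |= granted
    return rights


def find_demarcation_points(root, it_principals):
    """Return (path, kind) for the topmost unique ACLs with business-user access."""
    points = []
    stack = [root]
    while stack:
        folder = stack.pop()
        # Inherited ACLs are skipped: only unique ACLs can be demarcation points.
        rights = business_rights(folder, it_principals) if folder.unique_acl else set()
        if "write" in rights:
            points.append((folder.path, "write"))
        elif "read" in rights:
            points.append((folder.path, "read-only"))
        else:
            # No business access here, so keep looking further down the tree.
            stack.extend(reversed(folder.children))
    return points


# Illustrative data only.
admins = {"Domain Admins": {"read", "write"}}
root = Folder("/share", dict(admins), children=[
    Folder("/share/finance", {**admins, "Finance": {"read", "write"}}, children=[
        Folder("/share/finance/reports", {**admins, "Finance": {"read", "write"}},
               unique_acl=False),
    ]),
    Folder("/share/policies", {**admins, "All Staff": {"read"}}),
    Folder("/share/backups", {"svc_backup": {"read", "write"}}),
])
it_principals = {"Domain Admins", "svc_backup"}

points = find_demarcation_points(root, it_principals)
print(points)
```

In this example `/share` and `/share/backups` are never flagged, because only IT and service accounts appear on their ACLs. That mirrors the observation that many top-level folders do not need an owner at all.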
The end result of this process is that each demarcation point has a likely ownership candidate. For this particular customer, the next step was to go through a survey process to confirm ownership of each demarcation point with the likely owners (as determined by Varonis’ reports). Any data without a confirmed owner was locked down to remove non-IT access and underwent a separate disposition process.
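The step from activity to a likely owner can be sketched in the same spirit. The post does not say exactly how the candidate is chosen, so the heuristic below is an assumption: take the most active users under a demarcation point, then pick the department and the manager they most often share. The `events` and `directory` structures and the `managed_by` field name are hypothetical stand-ins for audit data and directory metadata.

```python
from collections import Counter


def likely_owner(point, events, directory, top_n=3):
    """Suggest an ownership candidate for one demarcation point (illustrative heuristic)."""
    prefix = point.rstrip("/") + "/"
    # Count access events on the folder itself or anything beneath it.
    activity = Counter(
        user for user, path in events
        if path == point or path.startswith(prefix)
    )
    top_users = [user for user, _ in activity.most_common(top_n)]
    known = [directory[u] for u in top_users if u in directory]
    departments = Counter(entry["department"] for entry in known)
    managers = Counter(entry["managed_by"] for entry in known)
    return {
        "top_users": top_users,
        "department": departments.most_common(1)[0][0] if departments else None,
        "candidate": managers.most_common(1)[0][0] if managers else None,
    }


# Illustrative data only.
events = [
    ("alice", "/share/finance/q1.xlsx"),
    ("alice", "/share/finance/q2.xlsx"),
    ("bob", "/share/finance/reports/a.pdf"),
    ("carol", "/share/policies/handbook.pdf"),
    ("alice", "/share/finance/q3.xlsx"),
]
directory = {
    "alice": {"department": "Finance", "managed_by": "dana"},
    "bob": {"department": "Finance", "managed_by": "dana"},
    "carol": {"department": "HR", "managed_by": "erin"},
}

suggestion = likely_owner("/share/finance", events, directory)
print(suggestion)
```

A suggestion like this is only a starting point. As described above, the candidate still has to confirm ownership before anything is assigned.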
Other customers have since added content classification and other risk factors in order to better prioritize the data ownership assignment process. With a good classification scheme in place, IT is able to start assigning owners to the most critical data first.
The key takeaway from this process is that we can use DatAdvantage to quickly identify the folders that need owners, along with their likely owners, so IT doesn’t need to make decisions about 2,500 folders per terabyte of data.
While this report was originally a customization for one customer, we’ve now baked it right into DatAdvantage as report 12M – Recommended Base Folders.
Now that we know who our owners are, the next step is to start getting them involved. My next few posts will cover exactly how we do this using both DatAdvantage and DataPrivilege.
Image credit: gorbould