(This one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance. You can find the whole series here.)
All organizational data needs an owner. It’s that simple, right? I think most of us would be hard pressed to argue against that as a principle—the data itself is an organizational asset, so of course it’s not the Help Desk or AD Admin folks who own it, it’s the users or business units that should own it. Of course, that’s great in theory, but with 1, 5, 10, or even 20 years’ worth of shared, unstructured data, figuring out who owns data is far from simple, let alone involving those owners in any meaningful way.
Before we get into using Varonis to locate owners, I want to talk about why finding a single data owner can be such a problem. IT probably knows who owns the Finance folder. It’s the CFO or a delegated steward. Same with HR, Marketing or Legal—these tend to be clearly-delineated departmental shares and it’s not hard to figure out whom to go to if we need an informed decision. (Regularly involving those owners in data governance is a different problem, and one I will cover in future posts.) The identification for these folders is relatively straightforward.
But what happens if you need to find the owner of a folder that has a less obvious name? What if the folder’s name is a project ID, or an acronym of some kind? In my experience, a majority of unstructured data resides in folders that aren’t obviously owned by anyone.
What IT tends to do then is a few different things:
- Check the ACL and see which groups have access. If it’s a single group with an obvious owner, that’s a likely candidate. If the ACL contains many different groups or a global access group like Domain Users, though, this tactic tends to fail.
- Check the Windows owner under Special Permissions. This metadata can be helpful, but can also be a red herring since it’s often just set to the local Administrator of the server. Even if there’s actually a human user there (who likely created the folder), that value may be outdated or inaccurate.
- Check the owner of files within the folder. Same problems as above.
- Enable operating system auditing to identify the most active user. Anyone out there excited about turning on file level auditing in Windows? I have yet to talk to anyone who answers yes to this question because of the performance hit on the server as well as the storage required and expertise to parse the logs effectively.
- Turn off access and see who complains. Not an optimal strategy when it comes to critical data.
- Email the world and hope for a response. In general, people don’t want to take ownership of something without good reason, since it may mean more work. How confident are you that the proper owners (who may be at a management or director level) are going to know exactly which data sets their teams are using regularly? If they’re not sure, are they going to jump to take responsibility?
So finding owners is hard, let alone finding owners at scale. If you’ve got thousands of unique ACLs and you want owners for all of them (or at least the ones that make sense) you’re going to have to go through some version of this process for each one. It’s no wonder we haven’t done a good job of this over time. Thankfully, there’s a better way.
Step 4: Identify Data Owners
The key difference between attempting to solve this problem manually and attacking it intelligently with Varonis is the DatAdvantage audit trail. A normalized, continuous, non-intrusive audit record of all data access is a key piece of DatAdvantage, and it allows us to actually identify data owners at scale without having to hunt and peck. Once you start gathering usage data and rolling it up into high level stats you can start to see the likely owners of any data set, not just the obvious ones.
DatAdvantage gives you two straightforward ways to get this information: First, we can quickly take a look at a high-level view of a single folder within the Statistics pane of the DatAdvantage GUI. This will show us the most active users of a particular folder. We like to say that at most, you’re one phone call away, since if the most active user isn’t the data owner, they almost certainly know who is.
You can operationalize this process even further by creating a statistics report, which can be run on an entire tree or even a server. A single report can show the top users of every unique ACL, and it’s possible to set up advanced filters to make this even more useful—showing only users outside of IT or in a specific OU, for example. You can even add additional properties from AD to the report, showing each user’s department or line manager, if available. None of this is possible without constantly gathering access activity and providing an interface to combine it with other available metadata.
Identifying owners is useful, but actually involving them is where IT can really start to make headway when it comes to ongoing governance. We’ll tackle that next.