Archive for: November, 2012

Meet the Winners of the 2012 Data Governance Awards

Varonis is proud to announce that Philip Morris International, CIBC and Alberta Energy are the three big winners in our first annual Data Governance Awards.  Société Générale, BNP Paribas and AXA were also named Highly Commended finalists.

Companies around the world are working very hard, every day, to deliver the highest possible standards of data governance. Because these projects are often hidden away in the background somewhere, they rarely get the visibility and acclaim they deserve. We are delighted to be able to publicly recognize the success of our customers, and thank them for their support of Varonis.

View the full list of winners and read the winning submissions.


Using Varonis: Fixing the Biggest Problems

(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance.  You can find the whole series here.)

Now that we have a pretty good idea where the highest-risk data is, the question naturally turns to reducing that risk. Fixing permissions problems on Windows, SharePoint or Exchange has always been a significant operational challenge. I’ve been in plenty of situations as an admin where I knew something was broken—a SharePoint site open to Authenticated Users, for instance—but felt powerless to actually address the problem, since any permissions change carries the risk of denying access to a user (or process) who needs it. Mistakes can have significant business impact, depending on whose access you broke and on what data. Since we’re defining “at-risk” data as valuable data that’s over-exposed, any accessibility problems we create will affect valuable data, potentially creating more problems than we started with.

Step 3: Remediate High-Risk Data

The goal is to reduce risk by reducing permissions for those users or processes that don’t require access to the data in question.

The next step in the Varonis Operational Plan is fixing those high-risk access control issues that we’ve identified: data open to global access groups as well as concentrations of sensitive information open to either global groups or groups with many users. Since simply reducing access without any context can cause problems, we need to leverage metadata and automation through DatAdvantage.

Let’s tackle global access first. When everyone can access data, it’s very difficult to know who among the large set of potential users actually needs that access. If we know exactly who’s touching the data, we can be surgical about reducing access without causing any headaches.

DatAdvantage analyzes the data’s audit record over time in conjunction with access controls, showing folders, SharePoint sites, and other repositories that are accessible via global access groups, along with the users accessing that data who wouldn’t have had access without a global access group. In effect, it’s doing an environment-wide simulation to answer the question, “What if I removed every global access group from every ACL tomorrow? Who would be affected?” This report gives you some key information:

  • Which data is open to global access groups
  • Which part of that data is being accessed by users who wouldn’t otherwise be able to access it
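The simulation described above can be sketched in miniature. The snippet below is a hypothetical illustration, not the actual DatAdvantage logic or data model: it cross-references a simplified audit trail with ACLs and group memberships to find users whose only route to the data is a global access group.

```python
# Illustrative sketch only: the data structures (acls, audit_events,
# group_members) are made up for this example, not a product API.

GLOBAL_GROUPS = {"Everyone", "Authenticated Users", "Domain Users"}

def affected_users(acls, audit_events, group_members):
    """Return {path: set(users)} of active users who would lose access
    if global groups were stripped from every ACL."""
    affected = {}
    for event in audit_events:              # each event: {"user": ..., "path": ...}
        user, path = event["user"], event["path"]
        entries = acls.get(path, set())     # groups granted on this path
        memberships = group_members.get(user, set())
        granted_now = any(g in entries for g in memberships)
        # Would the user still have access with global groups removed?
        retained = any(g in entries and g not in GLOBAL_GROUPS
                       for g in memberships)
        if granted_now and not retained:
            affected.setdefault(path, set()).add(user)
    return affected
```

Running this across every monitored ACL is, in effect, the “environment-wide simulation”: any user who shows up in the result needs a non-global route to the data before the global group can be safely removed.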

And it’s not just global groups that DatAdvantage lets you do this with. Because every data touch by every user on every monitored server is logged, Varonis lets you do this kind of analysis for any user, in any group, on any file or folder. That means you can safely remediate access to all of the high-risk data without risking productivity. You can actually fix the problem without getting in anyone’s way.

The next step is to start shifting decision making from your IT staff to the people who actually should be making choices about who gets access to data: data owners.

Image credit: harwichs

Big Data, Big Brother

I’ve been writing recently about the volume and depth of personal data that’s spread out across the Internet. Out of curiosity, I registered with a well-known information broker to discover the personal details that had been collected about me. Information brokers are services that scoop up data—by mainly trawling social networks or depositing cookies at partner sites—to construct enormous databases of personal information.

This particular information broker has data on over 400 million users. The details that were gathered—minimal in my case—placed me in certain age and location demographic groups. They had also decided I was a likely smartphone user. Right on all counts. And, had I been a more active social network user, this service would have figured out my “news and current events” interests.

When this social data is linked with widely available public data—did you know that voting rolls containing name, address, full date of birth, and seven digit zip codes can be purchased for a modest fee?—a detailed personal profile begins to form. While formal rules for information brokers are still being worked out by Congress, the information that can be collected legally is quite broad.

Leaving aside the fact that users on Twitter, for instance, want to reveal personal information and even identity, most social networks will also generally share non-PII—including IP addresses—or aggregated information with third parties. It’s not too difficult for a broker to connect the dots between its own data troves and data obtained through business arrangements with social networks to uniquely identify individuals, for all practical purposes.

Information brokers are not the only ones getting into the business of mining personal data from social networking sites. Banks and other financial service companies have decided that there are useful business insights, including consumer creditworthiness, to be gleaned from “likes” and social links. However, this is not without controversy. In Germany, a credit agency’s data-drilling activities have recently come under the scrutiny of regulators.

One way to view the “Big Datification” of personal data is as a problem in administering metadata. Just as users in corporate networks have control over designating whether files are off limits or shareable, consumers should in theory have similar powers, along with the ability to correct inaccurate or stale metadata.

That seems to be the way regulators both here and in EU countries will be dealing with the data collection activities of brokers and other companies going forward. In the FTC’s new privacy guidelines report that’s been coming up in my recent posts, regulators would prefer that brokers make their information available on a centralized website, saving consumers from having to search through multiple broker databases.

The report also leans towards giving consumers more power in granting access rights over their information and, in certain circumstances, the ability to correct this metadata. By the way, thanks to the Fair Credit Reporting Act (FCRA), consumers currently have the right to correct credit information.  The larger question is whether this type of correction right will ultimately extend to other areas.

Back to my situation. The information broker holding my demographic details lets users edit the data and even remove it, if they so desire.

For now, I’ve decided to leave my personal metadata as is.

Case Study: The MENTOR Network Enhances Data Protection and Reduces IT Work...

The MENTOR Network, an organization offering an array of innovative human services to adults and children with developmental disabilities, has chosen Varonis to help manage and protect its network of 7,500 users and 360 locations.

One of the key challenges the Network faces is ensuring that sensitive patient data remains protected and monitored despite ever-changing user roles and responsibilities and rapid data growth.  “Everyone is creating and very few are deleting,” says Shawn Fernandes, senior director of infrastructure.

Since deploying DatAdvantage, the Network has been able to tighten up its permissions and reduce access to sensitive data so that it’s only accessed on an “as-needed” basis. In addition to controlling sensitive data and monitoring access activity, the Network has also assigned data owners, who now manage access to data via DataPrivilege.  As a result, the burden on IT has been dramatically reduced.  After its DataPrivilege rollout in Minnesota, Fernandes saw a 50% decrease in the IT workload!

Read the full case study here.

4 Secrets for Archiving Stale Data Efficiently

The mandate to every IT department these days seems to be: “do more with less.”  The basic economic concept of scarcity is hitting home for many IT teams, not only in terms of headcount but storage capacity as well.  Teams are being asked to fit a constantly growing stockpile of data into an often-fixed storage infrastructure.

So what can we do given the constraints? The same thing we do when we see our own PC’s hard drive filling up – identify stale, unneeded data and archive or delete it to free up space and dodge the cost of adding new storage.

Stale Data: A Common Problem

A few weeks ago, I had the opportunity to attend VMworld Barcelona and talk to several storage admins. The great majority were concerned about finding an efficient way to identify and archive stale data.  Unfortunately, most of the conversations ended with: “Yes, we have lots of stale data, but we don’t have a good way to deal with it.”

The problem is that most existing off-the-shelf solutions try to determine what is eligible for archiving based on a file’s “lastmodifieddate” attribute, but this method doesn’t yield accurate results and isn’t very efficient, either.

Why is this?

Automated processes like search indexers, backup tools, and anti-virus programs are known to update this attribute, but we’re only concerned with human user activity.  The only way to know whether humans are modifying data is to track what they’re doing—i.e., gather an audit trail of access activity.  What’s more, if you rely on checking “lastmodifieddate” then, well, you have to actually check it.  This means looking at every single file every time you do a crawl.

With unstructured data growing about 50% year over year and with about 70% of the data becoming stale within 90 days of its creation, the accurate identification of stale data not only represents a huge challenge, but also a massive opportunity to reduce costs.

4 Secrets for Archiving Stale Data Efficiently

For organizations to find an effective solution for dealing with stale data and complying with defensible disposition requirements, there are 4 secrets to efficiently identifying and cleaning up stale data:

1. The Right Metadata

In order to accurately and efficiently identify stale data, we need to have the right metadata – metadata that reflects the reality of our data, and that can answer questions accurately. It is important to know not only which data hasn’t been used in the last 3 months, but also who touched it last, who has access to it, and whether it contains sensitive data. Correlating multiple metadata streams provides the appropriate context so storage admins can make smart, metadata-driven decisions about stale data.

2. An Audit Trail of Human User Activity

We need to understand the behavior of our users: how they access data, what data they access frequently, and what data is never touched. Rather than continually checking the “lastmodifieddate” attribute of every single data container or file, an audit trail gives you a list of known changes made by human users.  This audit trail is not only crucial for quick, accurate stale data scans, but also vital for forensics, behavioral analysis, and help desk use cases (“Hey! Who deleted my file?”).
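As a rough sketch of this idea (the service account names and the (timestamp, user, path) event format are assumptions for illustration, not a real audit schema), staleness can be derived from the audit trail without touching the files at all:

```python
from datetime import datetime, timedelta

# Hypothetical service accounts whose activity should not count as "human"
SERVICE_ACCOUNTS = {"svc_backup", "svc_antivirus", "svc_indexer"}

def stale_from_audit(events, known_paths, now, threshold=timedelta(days=90)):
    """Find paths with no human access within `threshold`.
    `events` is an iterable of (timestamp, user, path) tuples; only
    non-service users count, so backup and AV sweeps are ignored."""
    last_human = {}
    for ts, user, path in events:
        if user in SERVICE_ACCOUNTS:
            continue  # automated process, not real usage
        if path not in last_human or ts > last_human[path]:
            last_human[path] = ts
    return [p for p in known_paths
            if now - last_human.get(p, datetime.min) > threshold]
```

Because the audit trail only records actual changes, the scan walks a (usually much smaller) event list instead of re-statting every file on every run.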

3. Granular Data Selection

All data is not created equal.  HR data might have different archiving criteria than Finance data or Legal data.  Each distinct data set might require a different set of rules, so it’s important to have as many axes to pivot on as possible.  For example, you might need to select data based on its last access date as well as the sensitivity of the content (e.g., PII, PCI, HIPAA) or the profile of the users who use the data most often (C-level vs. help desk).

Being able to slice and dice data at a granular level, and to determine with confidence which data will be affected (and how) by a specific operation, will make storage pros’ lives much easier.
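A toy illustration of multi-axis selection, assuming hypothetical per-department rules and a simple record format (not a product API): staleness threshold varies by department, and sensitive content is held back regardless of age.

```python
def select_for_archive(records, rules):
    """Select paths eligible for archiving.
    records: dicts with the metadata axes we want to pivot on
             (dept, days_since_access, sensitive flag).
    rules:   per-department staleness thresholds in days, with a
             "default" fallback; sensitive data is always excluded."""
    selected = []
    for r in records:
        threshold = rules.get(r["dept"], rules["default"])
        if r["days_since_access"] >= threshold and not r["sensitive"]:
            selected.append(r["path"])
    return selected
```

Adding another axis (say, the profile of the most frequent users) is just another field on the record and another condition in the rule, which is the point: the more metadata streams you correlate, the more confident the selection.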

4. Automation

Lastly, there needs to be a way to automate data selection, archiving, and deletion. Stale data identification cannot consume more IT resources; otherwise, the storage savings diminish.  As we mentioned at the start, IT is always trying to do more with less, and intelligent automation is the key. The ability to automatically identify and archive or delete stale data based on metadata makes this a sustainable, efficient task that can save time and money.

Interested in how you can save time and money by using automation to turbocharge your stale data identification?  Request a demo of the Varonis Data Transport Engine.

Photo credit: austinevan