Archive for: November, 2012

Cloud Data Protection in the EU: The Road Back From Serfdom

A few posts ago, I wrote about the European Union’s influential 1995 Data Protection Directive and the updates to its consumer privacy rules that will soon go into effect. Over the summer, the Article 29 Working Group, which is a DPD advisory body, released a set of rules clarifying the regulatory structure around cloud providers. While the US is still working out its online consumer privacy regulations, this recent EU rulemaking sets a high bar for consumer data security: EU companies can’t eliminate basic data privacy rights by storing the personal data of their customers in the cloud.

In the language of the DPD, the Working Group considers cloud providers to be “data processors”. By nailing down this designation, the Working Group brings the rest of the existing DPD framework into play: obligations related to security protections, accuracy, and limits on data retention remain in effect.

A company (data controller in DPD-ese) that’s looking for a cloud provider is allowed to contract only with a service that “guarantees compliance with [EU] data protection legislation.”

Since EU companies have the ultimate responsibility (and take on most or all liabilities) for protecting customer data, it’s up to them to include the appropriate contract terms with their providers. I’ve pulled together a few of these key contract clauses from the Working Group document:

  • SLAs –  “objective and measurable” and should list “relevant penalties (financial or otherwise including the ability to sue the [cloud] provider in case of non-compliance)”
  • Authorization –  “processor [the cloud provider] is to follow the instructions of the controller”
  • Access to data – “only authorized [cloud provider] persons should have access to the data”
  • Consumers’ access rights –  cloud provider should “support the client in facilitating exercise of data subjects’ [consumers’] rights to access, correct or delete their data”
  • Logging and auditing – “client should request logging of processing operations performed by the provider” and the client “should be empowered to audit such processing operations”
  • Technical measures – a series of technical requirements, key among them are ones relating to availability, data integrity, confidentiality (i.e., encryption),  and portability

This contracting standard is especially significant since cloud companies can be physically located anywhere, and more to the point, outside the EU. In effect, European-based companies that collect consumer data of EU citizens can’t export the data and then process it in a place with a lax consumer security environment–the cloud outsourcer must meet DPD-level standards of data protection.

In case you’re wondering whether the DPD governs US cloud providers–say Amazon or Google–the answer has, up until this Working Group cloud rules document, been a qualified “yes”. US data processors have had a unique Safe Harbor relationship with the EU: if they are working with an EU company, they’re allowed to self-certify with respect to seven principles that mirror the DPD rules for EU-based data processors.

However, the new Working Group rules say that EU companies need to obtain direct evidence from US providers “that the Safe Harbor self-certifications exists [emphasis added] and request evidence demonstrating that their principles are complied with.”

Amazon, Google, GoDaddy, and other US cloud providers: you’ve been warned!

Speaking of data privacy practices of US-based cloud providers, Rob has an interesting post on how the convenience of cloud computing has lulled consumers and companies into a one-sided relationship. And at least one well-known security analyst has described it as more like an EULA between serfs and lords. Hint: the cloud-providers are the lords.

EU countries have made a giant step towards balancing the power relationships in the digital age between cloud provider and companies. As a side effect, US cloud providers will need to change their privacy practices, at least if they want to do business in the EU.

While feudal relationships were unknown in North America, I might add that some of the DPD’s ideas would make for good business practices in purely domestic US transactions.

Using Varonis: Why Data Owners?

(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance.  You can find the whole series here.)

One of my first jobs in IT was on the help desk for a medium-sized company. A big part of my job was provisioning access. If your company has shared data (and what organization doesn’t?), the words “I need access to this folder” are probably very familiar to you.

There are countless reasons for modifying access controls: new hires, consultants, role changes, temporary projects, cross-functional teams, terminations, department restructuring, M&A – the list goes on.  Coordinating who has access to which data has—detrimentally—become a core responsibility of IT.

Let’s peek inside a typical permissions conversation between an end-user and the help desk:

User (to the Help Desk): I need access to a folder in the S: drive, can you help?
Help Desk: Of course. Can you tell me which folder?
User: The folder is called FYQ3-docs. I need access for the next few weeks.
Help Desk: Do you know who manages the folder? To make this change we need an approval.
User: My boss asked me to get access. I can forward you the email?
Help Desk: Sure, that will be good enough.

Look familiar?

In some organizations, this process may be a little more complicated, a little more automated, or both, but in general the process follows this workflow: access is requested by a user, approved by that user’s manager, and provisioned by someone in IT.

That’s the way it’s been done for years, and it works great, right?  Well, not really.  This ostensibly innocent access provisioning workflow can be the seed for the most costly data breaches an organization will ever face.

The wrong people

In this example, the user’s manager is the one providing the approval. That person may not be, and in fact usually isn’t, the person who should be making this decision. The data itself is a business asset, so access to that data is a business decision. That means that the owner of that asset—i.e., the data owner—should be the one making the decision.

Imagine if access to a financial account worked the same way as access to a shared folder—managers would be able to get access for their team without the actual budget owner having any idea about it.  Madness!

Organizations that have an excellent grasp on data ownership and information governance have not only figured out a way to ensure approval is granted by the right person, but they’ve factored the help desk out of the equation completely, freeing up precious resources.

A recent article on the Harvard Business Review blog states:

“Different kinds of assets, people, capital, technology, and data demand different kinds of management. You don’t manage people assets the same way you manage capital assets. Nor should you manage data assets in the same way you manage technology assets. This may be the most fundamental reason for moving responsibility for data out of IT.”

Let’s now re-envision the access provisioning scenario:

  • User fills out a web form describing which data she needs access to, why, and for how long.
  • Request gets automatically routed to the business person in the organization who is best equipped to approve the request – i.e., the data owner.
  • Data owner approves or denies the request by clicking a button.

Much better!  The access request is fulfilled by the correct person without involving the requestor’s manager or IT.
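The re-envisioned workflow boils down to a routing step: look up the data owner and send the request straight to them. Here's a minimal sketch in Python — the owner registry, folder path, and field names are all invented for illustration, not drawn from any Varonis product:

```python
from dataclasses import dataclass

# Hypothetical data-owner registry: folder -> owner. Building this mapping
# is the hard part, as the next section discusses.
OWNERS = {
    "S:/FYQ3-docs": "alice@example.com",
}

@dataclass
class AccessRequest:
    requester: str
    folder: str
    reason: str
    duration_days: int

def route_request(req: AccessRequest) -> str:
    """Route the request directly to the data owner, bypassing IT."""
    owner = OWNERS.get(req.folder)
    if owner is None:
        return "unrouted: no owner on record"
    return f"sent to {owner} for approval"

print(route_request(AccessRequest("bob@example.com", "S:/FYQ3-docs",
                                  "quarter-end reporting", 21)))
# sent to alice@example.com for approval
```

The design point is that IT appears nowhere in the approval path; it only executes the change after the owner clicks approve or deny.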

Easier said than done

The hard part here, and the reason things have traditionally worked this way, is that when it comes to shared data, we don’t have a good way of figuring out who the actual owner is. IT may have some idea based on group access—if there’s a single group that grants access to a folder, you may be able to figure out the director or manager of that group, for instance. But what happens if data is open to two or three different teams? What about data open to everyone? Identifying and aligning owners is extraordinarily difficult if you rely on traditional methods.

With Varonis, there’s a much better way. Because DatAdvantage is constantly gathering a complete audit record, we can use aggregate access activity to identify likely owners. If the three or four most active users of a folder all report to the same person, it’s highly likely that person is the true data owner. At worst, you’re one phone call away from knowing.
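The owner-identification heuristic described above can be sketched as a small aggregation over the audit record: tally accesses per user, take the most active users, and check whether they share a manager. The event log, names, and org chart below are toy stand-ins, not real data:

```python
from collections import Counter

# Toy audit log of (user, folder) access events, plus a directory lookup
# mapping each user to their manager.
events = [
    ("bob", "S:/FYQ3-docs"), ("carol", "S:/FYQ3-docs"),
    ("bob", "S:/FYQ3-docs"), ("dave", "S:/FYQ3-docs"),
    ("carol", "S:/FYQ3-docs"),
]
manager_of = {"bob": "alice", "carol": "alice", "dave": "alice"}

def likely_owner(folder, events, manager_of, top_n=3):
    """If the most active users of a folder all report to the same person,
    that person is a strong candidate for data owner."""
    counts = Counter(u for u, f in events if f == folder)
    top_users = [u for u, _ in counts.most_common(top_n)]
    managers = {manager_of.get(u) for u in top_users}
    return managers.pop() if len(managers) == 1 else None

print(likely_owner("S:/FYQ3-docs", events, manager_of))  # alice
```

When the top users report to different managers, the function returns `None` — that's the "one phone call away" case.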

By identifying business owners of data, IT can take the first step toward shifting the burden to the teams who have the right context (and often authority) to be making decisions about access. One challenge with this approach is figuring out which folders actually need owners, something I’ll talk about in the next post.

Image credit: richard-g

Meet the Winners of the 2012 Data Governance Awards

Varonis is proud to announce that Philip Morris International, CIBC and Alberta Energy are the three big winners in our first annual Data Governance Awards.  Société Générale, BNP Paribas and AXA were also named Highly Commended finalists.

Companies around the world are working very hard, every day, to deliver the highest possible standards of data governance. Because these projects are often hidden away in the background somewhere, they rarely get the visibility and acclaim they deserve. We are delighted to be able to publicly recognize the success of our customers, and thank them for their support of Varonis.

View the full list of winners and read the winning submissions.


City-state Security

Bruce Schneier wrote an interesting piece in Wired yesterday likening cloud providers to feudal lords.  Schneier states that we’ve come to a point in the Internet era where users are putting a tremendous amount of trust in companies like Facebook and Amazon in exchange for convenience.  It’s never been more apparent – we want access to our data anywhere, anytime, without the hassle of VPNs, encryption, and other annoying “obstacles.”

Long live the king!

As Bruce points out, for some individuals and small companies, being a vassal can work in our favor – after all, our lords know better than we do (or do they?).  But what about multi-billion dollar enterprises with much more at stake?

Schneier says:

“These organizations are used to trusting other companies with critical corporate functions: They’ve been outsourcing their payroll, tax preparation, and legal services for decades. But IT regulations often require audits. Our lords don’t allow vassals to audit them, even if those vassals are themselves large and powerful.”

Even if you yourself are a powerful lord, when your king calls upon you, you obey.

Enterprises realize that convenience is often the enemy of security and trust is sometimes betrayed.  Managing and protecting IP, customer data, patient data, etc. is too important to hand off to a hopefully-benevolent lord who has an entire realm to rule.  As Chris Dixon says: outsource the things you don’t care about.

Interestingly, Bruce views the outcome as binary: convenience XOR security.  I think there’s an unstated middle-ground here: a new breed of enterprise-class solutions that offer the same luxuries the kings provide, but are deployed and controlled autonomously by the organization.   Call it city-state security.

Using Varonis: Fixing the Biggest Problems

(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance.  You can find the whole series here.)

Now that we have a pretty good idea where the highest-risk data is, the question naturally turns to reducing that risk. Fixing permissions problems on Windows, SharePoint or Exchange has always been a significant operational challenge. I’ve been in plenty of situations as an admin where I know something is broken—a SharePoint site open to Authenticated Users for instance—but I’ve felt powerless to actually address the problem since any permissions change carries the risk of denying access to a user (or process) who needs it. Mistakes can have significant business impact depending on whose access you broke and on what data. Since we’re defining “at-risk” as being valuable data that’s over-exposed, that means that any accessibility problems we create will impact valuable data, and that can create more problems than we started with.

Step 3: Remediate High-Risk Data

The goal is to reduce risk by reducing permissions for those users or processes that don’t require access to the data in question.

The next step in the Varonis Operational Plan is fixing those high-risk access control issues that we’ve identified: data open to global access groups as well as concentrations of sensitive information open to either global groups or groups with many users. Since simply reducing access without any context can cause problems, we need to leverage metadata and automation through DatAdvantage.

Let’s tackle global access first. When everyone can access data, it’s very difficult to know who among the large set of potential users actually needs that access. If we know exactly who’s touching the data, we can be surgical about reducing access without causing any headaches.

DatAdvantage analyzes the data’s audit record over time in conjunction with access controls, showing folders, SharePoint sites, and other repositories that are accessible by global access groups, and the users accessing that data who wouldn’t have had access without a global access group. In effect, it’s doing an environment-wide simulation to answer the question, “What if I removed every global access group from every ACL tomorrow? Who would be affected?” This report gives you some key information:

  • Which data is open to global access groups
  • Which part of that data is being accessed by users who wouldn’t otherwise be able to access it
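That what-if simulation reduces to a set computation over the ACLs and the audit log: take everyone who actually touched the data, and subtract everyone who would still have access through a non-global group. A sketch with made-up group names and users (not the DatAdvantage implementation):

```python
# Per-folder ACLs (group -> members) and the set of users observed
# accessing each folder in the audit log. All names are illustrative.
GLOBAL_GROUPS = {"Everyone", "Authenticated Users", "Domain Users"}

acls = {
    "/share/finance": {"Everyone": {"bob", "carol", "dave"},
                       "FinanceTeam": {"carol"}},
}
accesses = {"/share/finance": {"bob", "carol"}}

def affected_if_global_removed(folder):
    """Users who actually touched the data but would lose access once
    the global groups come off the ACL."""
    acl = acls[folder]
    non_global_members = set().union(
        *(members for g, members in acl.items() if g not in GLOBAL_GROUPS))
    return accesses[folder] - non_global_members

print(affected_if_global_removed("/share/finance"))  # {'bob'}
```

Here carol keeps access through FinanceTeam, so only bob needs to be added to a proper group before the global group is removed — that's the "surgical" remediation.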

And it’s not just global groups that DatAdvantage lets you do this with. Because every data touch by every user on every monitored server is logged, Varonis lets you do this kind of analysis for any user, in any group, on any file or folder. That means you can safely remediate access to all of the high-risk data without risking productivity. You can actually fix the problem without getting in anyone’s way.

The next step is to start shifting decision making from your IT staff to the people who actually should be making choices about who gets access to data: data owners.

Image credit: harwichs

Big Data, Big Brother

I’ve been writing recently about the volume and depth of personal data that’s spread out across the Internet. Out of curiosity, I registered with a well-known information broker to discover the personal details that had been collected about me. Information brokers are services that scoop up data—by mainly trawling social networks or depositing cookies at partner sites—to construct enormous databases of personal information.

This particular information broker has data on over 400 million users. The details that were gathered—minimal in my case—placed me in certain age and location demographic groups. They had also decided I was a likely smartphone user. Right on all counts. And, had I been a more active social network user, this service would have figured out my “news and current events” interests.

When this social data is linked with widely available public data—did you know that voting rolls containing name, address, full date of birth, and seven digit zip codes can be purchased for a modest fee?—a detailed personal profile begins to form. While formal rules for information brokers are still being worked out by Congress, the information that can be collected legally is quite broad.

Leaving aside the fact that users on Twitter, for instance, want to reveal personal information and even identity, most social networks will also generally share non-PII—including IP address—or aggregated information with third-parties. It’s not too difficult for a broker to connect the dots between its own data troves and data obtained through business arrangements with social networks to uniquely identify, for all practical purposes, individuals.
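Connecting those dots is essentially a database join on quasi-identifiers — the same demographic fields that appear in both the "anonymous" broker data and a name-bearing public record. A toy sketch with fabricated records (the field names are assumptions, not any broker's actual schema):

```python
# "Anonymous" broker records keyed by demographic quasi-identifiers,
# joined against a public roll that carries real names.
broker = [{"zip": "10024", "dob": "1975-03-02", "interest": "tech news"}]
voter_roll = [{"name": "J. Smith", "zip": "10024", "dob": "1975-03-02"}]

def reidentify(broker, roll):
    """Join broker records to named public records on shared fields."""
    keys = ("zip", "dob")
    index = {tuple(r[k] for k in keys): r["name"] for r in roll}
    return [(index.get(tuple(b[k] for k in keys)), b["interest"])
            for b in broker]

print(reidentify(broker, voter_roll))  # [('J. Smith', 'tech news')]
```

The narrower the quasi-identifier combination (full date of birth plus a small geographic area), the more often the join lands on exactly one person.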

Information brokers are not the only ones getting into the business of mining personal data from social networking sites. Banks and other financial service companies have decided that there are useful business insights, including consumer credit worthiness, to be gleaned from “likes” and social links. However this is not without controversy. In Germany, a credit agency’s data drilling activities have recently come under the scrutiny of regulators.

One way to view the “Big Datification” of personal data is as a problem in administering metadata. Just as users in corporate networks have control over designating whether files are off limits or shareable, consumers should in theory have similar powers, along with the ability to correct inaccurate or stale metadata.

That seems to be the way regulators both here and in EU countries will be dealing with the data collection activities of brokers and other companies going forward. In the FTC’s new privacy guidelines report that’s been coming up in my recent posts, regulators would prefer that brokers make their information available in a centralized website, saving consumers from having to search through multiple broker databases.

The report also leans towards giving consumers more power in granting access rights over their information and in certain circumstances, the ability to correct this metadata. By the way, thanks to the Fair Credit Reporting Act (FCRA) consumers currently have the right to correct credit information.  The larger question is whether this type of correction right will ultimately extend to other areas.

Back to my situation. The information broker holding my demographic details lets users edit the data and even remove it, if they so desire.

For now, I’ve decided to leave my personal metadata as is.

Case Study: The MENTOR Network Enhances Data Protection and Reduces IT Work...

The MENTOR Network, an organization offering an array of innovative human services to adults and children with developmental disabilities, has chosen Varonis to help manage and protect their network of 7,500 users and 360 locations.

One of the key challenges the Network faces is ensuring that sensitive patient data remains protected and monitored despite ever-changing user roles and responsibilities and rapid data growth.  “Everyone is creating and very few are deleting,” says Shawn Fernandes, senior director of infrastructure.

Since deploying DatAdvantage, the Network has been able to tighten up its permissions and reduce access to sensitive data so that it’s only accessed on an “as-needed” basis. In addition to controlling sensitive data and monitoring access activity, the Network has also assigned data owners, who now manage access to data via DataPrivilege.  As a result, the burden on IT has been dramatically reduced.  After its DataPrivilege rollout in Minnesota, Fernandes saw a 50% decrease in the IT workload!

Read the full case study here.

4 Secrets for Archiving Stale Data Efficiently

The mandate to every IT department these days seems to be: “do more with less.”  The basic economic concept of scarcity is hitting home for many IT teams, not only in terms of headcount, but storage capacity as well.  Teams are being asked to fit a constantly growing stockpile of data into an often-fixed storage infrastructure.

So what can we do given the constraints? The same thing we do when we see our own PC’s hard drive filling up – identify stale, unneeded data and archive or delete it to free up space and dodge the cost of adding new storage.

Stale Data: A Common Problem

A few weeks ago, I had the opportunity to attend VMworld Barcelona and talk to several storage admins. The great majority were concerned about finding an efficient way to identify and archive stale data.  Unfortunately, most of the conversations ended with: “Yes, we have lots of stale data, but we don’t have a good way to deal with it.”

The problem is that most of the existing off-the-shelf solutions try to determine what is eligible for archiving based on a file’s “lastmodifieddate” attribute; but this method doesn’t yield accurate results and isn’t very efficient, either.

Why is this?

Automated processes like search indexers, backup tools, and anti-virus programs are known to update this attribute, but we’re only concerned with human user activity.  The only way to know whether humans are modifying data is to track what they’re doing—i.e. gather an audit trail of access activity.  What’s more, if you’re reliant on checking “lastmodifieddate” then, well, you have to actually check it.  This means looking at every single file every time you do a crawl.
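The difference between the two approaches can be sketched in a few lines: instead of crawling every file's "lastmodifieddate", filter the audit trail down to human activity and keep only the latest human touch per file. The account names and log format below are invented for illustration:

```python
from datetime import datetime, timedelta

# Automated accounts whose activity should not count as "use" of the data.
SERVICE_ACCOUNTS = {"svc-backup", "svc-indexer", "svc-antivirus"}

# Audit trail of (user, path, timestamp) events, collected as they happen,
# so no per-file crawl is ever needed.
log = [
    ("svc-backup",  "/share/q3.xlsx", datetime(2012, 11, 1)),
    ("alice",       "/share/q3.xlsx", datetime(2012, 6, 1)),
    ("svc-indexer", "/share/old.doc", datetime(2012, 11, 2)),
]

def stale_files(log, as_of, max_age_days=90):
    """Files whose last *human* access is older than max_age_days."""
    last_human = {}
    for user, path, ts in log:
        if user in SERVICE_ACCOUNTS:
            continue
        last_human[path] = max(last_human.get(path, ts), ts)
    cutoff = as_of - timedelta(days=max_age_days)
    touched = {p for _, p, _ in log}
    # Files with no human access at all are stale too.
    return {p for p in touched
            if last_human.get(p, datetime.min) < cutoff}

print(stale_files(log, as_of=datetime(2012, 11, 15)))
```

Note that a "lastmodifieddate" crawl would call both files fresh here — the backup and indexer touched them in November — even though no human has used either in months.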

With unstructured data growing about 50% year over year and with about 70% of the data becoming stale within 90 days of its creation, the accurate identification of stale data not only represents a huge challenge, but also a massive opportunity to reduce costs.

4 Secrets for Archiving Stale Data Efficiently

To help organizations deal with stale data and comply with defensible disposition requirements, here are 4 secrets for efficiently identifying and cleaning up stale data:

1. The Right Metadata

In order to accurately and efficiently identify stale data, we need to have the right metadata – metadata that reflects the reality of our data, and that can answer questions reliably. It is not only important to know which data hasn’t been used in the last 3 months, but also to know who touched it last, who has access to it, and whether it contains sensitive data. Correlating multiple metadata streams provides the appropriate context so storage admins can make smart, metadata-driven decisions about stale data.

2. An Audit Trail of Human User Activity

We need to understand the behavior of our users, how they access data, what data they access frequently, and what data is never touched. Rather than continually checking the “lastmodifieddate” attribute of every single data container or file, an audit trail gives you a list of known changes by human users.  This audit trail is crucial for quick and accurate scans for stale data, but also proves vital for forensics, behavioral analysis, and help desk use cases (“Hey! Who deleted my file?”).

3. Granular Data Selection

Not all data is created equal.  HR data might have different archiving criteria than Finance data or Legal data.  Each distinct data set might require a different set of rules, so it’s important to have as many axes to pivot on as possible.  For example, you might need to select data based on its last access date as well as the sensitivity of the content (e.g., PII, PCI, HIPAA) or the profile of the users who use the data most often (C-level vs. help desk).

The ability to be granular when slicing and dicing data, and to determine with confidence which data will be affected (and how) by a specific operation, will make storage pros’ lives much easier.
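Pivoting on several metadata axes at once can be sketched as a simple multi-criteria filter. Each record below bundles staleness, sensitivity, and user-profile metadata for one folder; the field names are illustrative, not a product schema:

```python
# One record per folder, combining several metadata streams.
folders = [
    {"path": "/hr/reviews",  "days_idle": 200, "sensitive": True,  "top_users": "c-level"},
    {"path": "/tmp/scratch", "days_idle": 400, "sensitive": False, "top_users": "help-desk"},
]

def select(folders, min_idle, sensitive=None, top_users=None):
    """Slice the data set on multiple axes at once; None means 'any'."""
    out = []
    for f in folders:
        if f["days_idle"] < min_idle:
            continue
        if sensitive is not None and f["sensitive"] != sensitive:
            continue
        if top_users is not None and f["top_users"] != top_users:
            continue
        out.append(f["path"])
    return out

# Non-sensitive data idle for a year: a safe candidate for aggressive archiving.
print(select(folders, min_idle=365, sensitive=False))  # ['/tmp/scratch']
```

Sensitive HR data might instead be selected with a much shorter idle window and routed to a retention review rather than straight to deletion.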

4. Automation

Lastly, there needs to be a way to automate data selection, archival, and deletion. Stale data identification cannot consume more IT resources; otherwise the storage savings diminish.  As we mentioned at the start, IT is always trying to do more with less, and intelligent automation is the key. The ability to automatically identify and archive or delete stale data, based on metadata, will make this a sustainable and efficient task that can save time and money.

Interested in how you can save time and money by using automation to turbocharge your stale data identification?  Request a demo of the Varonis Data Transport Engine.

Photo credit: austinevan

Can Companies Learn to Forget About Their Customers?

In my last few posts, I’ve been focusing on how the rise of social media has forced regulators both here and in the EU to revise their definitions of personal data. With the new emphasis on data that can be “reasonably linked” to an individual, companies may soon have to extend their security controls over a broader range of consumer information. Interestingly, just as the amount and breadth of personal data is increasing, another almost opposite regulatory requirement is looming on the horizon.

In the FTC privacy guidelines report that I’ve been referring to recently, the agency’s Commissioners refer to a “right to be forgotten”. This is not a legal right in the US (yet). But the concept that data should have a natural shelf life and not be retained longer than necessary is reflected in the language of the report’s framework, which calls for “reasonable collection limits” and “sound retention policies”.

Translation: Companies should delete data they no longer need and also allow consumers to access the data and under “appropriate circumstances” purge or suppress it.

Keeping in mind that these are guidelines and best practices, there’s a great hypothetical case study in the FTC report for the mobile space. The Commissioners point out that GPS generated location data, which is often monitored and saved by smartphone apps, should be treated as identifying information. The reason? Geo coordinates can be used to re-identify customers when connected with other “disparate bits of information”.

In this particular example, the FTC suggests that mobile software companies should limit their retention of business data—say check-ins to a restaurant—and also their sharing of it with third parties.

As I’ve been pointing out, public data on the web — especially on social sites—actually expands the amount of corporate consumer data that would fall under the “reasonably linked” definition. And with this new FTC approach for retention, it means that more data is now a candidate for deletion as well.

Back at the EU, the right to be forgotten is an important part of the planned update to their Data Protection Directive. Unlike the US, it will have the weight of law as EU member countries implement the new rules over the next few years. It will give citizens the right to delete data on request.

In the US, the FTC report may provide clues as to what may be coming out of Congress. The McCain-Kerry Commercial Privacy Bill of Rights, which is currently stalled, does have provisions for data retention limits of personally identifiable information (PII) and other information that may be reasonably used to identify an individual. It also gives consumers some control over their data: they can request that PII and other information be made unidentifiable or not usable. This is less strong than the EU’s right to be forgotten, but it would still require US companies to at least find personal data and then corral it.

The writing is on the wall for US companies: those that implement the FTC best practices for data deletion will be in a much better position when either McCain-Kerry’s Bill of Rights or another law is passed that makes consumer data retention limits and deletion not just a good idea for companies, but a legal obligation.


Email vs. Employee: Can We Win the Inbox Race? [INFOGRAPHIC]

Many of us think we are masters of our email universe. By sending messages from a desktop app or mobile device, we’re able to direct and coordinate coworkers anywhere in the organization to solve problems and ultimately get the work done.  In a newly released research report, Varonis notes that an ever increasing volume of emails is forcing knowledge workers to allocate significant time and effort to managing their inbox.

In our survey, we learned that nearly 25% of our respondents receive between 100 and 500 emails per day. And nearly 85% spend 30 minutes or more every day organizing their messages — over one and one half weeks of work every year. With 22% reporting between 1,000 and 5,000 emails in their inbox at any given time, filing and organizing message traffic is a hidden task on many workers’ to-do lists.

More than just a productivity drain, the survey reveals that email mishaps are causing real harm: over 62% of our survey population report email mishaps that in some cases led to job loss and even compliance violations.

Is there a way out of the email race? We conclude that email filtering and routing software, along with other techniques to monitor email loads, may become more commonplace in enterprise environments.

In the meantime, one immediate solution, as some email observers have noted, is to be more careful in crafting electronic correspondence. Besides double checking the email metadata—to and cc lists, and file attachments—smart users make sure that their email content is short, meaningful, and solves a problem.

Download the Report

Enjoy, share, embed our infographic and download the full report to learn which data protection activities truly matter.

Digital Work Habits

Embed this infographic on your own site

Copy and paste the code below into your blog post or web page:

<a href=""><img title="The Quest for Inbox Zero - Infographic" src="" alt="The Quest for Inbox Zero" width="600" /></a>
<p><small>Like this infographic? Get more <a href="">digital collaboration</a> tips from <a href="">Varonis</a>.</small></p>

At-Risk Exchange Data

One of the more interesting benefits of last year’s launch of DatAdvantage for Exchange was the opportunity it presented to talk with different sets of people in our customers’ organizations. Where traditionally we’d worked mostly with security, storage, Windows or Active Directory teams, DatAdvantage for Exchange spurred meetings with messaging, e-Discovery and legal folks as well.

E-mail is a business-critical system, period. From an IT perspective, it may be the most critical system—most companies would rather lose their phones for a day than their e-mail. What that has meant for the Messaging folks in charge of Exchange is that simply keeping the lights on—making sure that emails are being delivered promptly and that the repository of stored data is available—has been far and away more important than access control. However, the consequence of focusing on availability rather than confidentiality or integrity has meant that a lot of the controls and auditing that should be in place are sorely lacking.

Data Governance and Exchange

Exchange is an interesting repository from a data governance perspective. The last time I wrote about using Varonis, I talked about how we can combine data classification with permissions exposure to identify the data that’s most at-risk on a file system or SharePoint site. Unlike a file share, the hierarchy is flat—everyone’s got their own mailbox, and it’s very easy to share out access rights to it. You can, for instance, give someone access to your inbox or calendar. With IT’s help, you can give them the ability to send email on your behalf, or even “as” you. And Exchange is exactly like file shares in that mailbox access is rarely reviewed: mailboxes stay shared, and users keep send-as or send-on-behalf-of privileges for a long, long time.

What’s at Risk?

One of the first things we do when we spin up DatAdvantage for Exchange for a customer is to run a report that shows them everywhere someone in the organization has access to a mailbox that isn’t their own.

Everyone has access to their own mailbox by default. It takes some sort of permissions change, though, either on the client (Outlook) side, or by the admin on the Exchange server, to grant someone access to another mailbox. One of the things we’re seeing when we do this, by the way, is that the mailboxes that are without question most likely to have been shared are those that are probably considered the most valuable—those of the CEO and other high-level management.  While native tools might let you manually (and somewhat painfully) check permissions on a mailbox-by-mailbox basis, Varonis gives you the ability to see where anyone has access to an object that’s not part of their own mailbox.
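That report boils down to flagging every permission grant where the trustee is not the mailbox owner. A minimal sketch with invented mailbox data (not the DatAdvantage report itself, which also covers calendar, send-as, and send-on-behalf-of rights):

```python
# Mailbox -> accounts with access to it. By default each mailbox's only
# entry is its owner; any other entry is a share worth reviewing.
mailbox_acls = {
    "ceo@example.com":   {"ceo@example.com", "assistant@example.com"},
    "alice@example.com": {"alice@example.com"},
}

def non_owner_access(acls):
    """Everywhere someone can reach a mailbox that isn't their own."""
    return sorted((mbox, who) for mbox, grants in acls.items()
                  for who in grants if who != mbox)

print(non_owner_access(mailbox_acls))
# [('ceo@example.com', 'assistant@example.com')]
```

Cross-referencing each flagged pair with actual access activity then yields the second report described below: shares that are not just granted, but in use.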

We take that risk assessment a step further, too, with another report that will show you where people are actually accessing data in mailboxes that don’t belong to them. For good or ill, these are probably the permissions you want to take a look at first from a governance perspective.

Photo credit: dcJohn