Defining Deviancy With User Behavior Analytics

For over a decade, security operations centers and analysts have been trading indicators of compromise (IoCs), the signature- or threshold-based signs of intrusion or attempted intrusion, in an effort to keep pace with the ever-changing threat environment. It’s been a losing battle.

Over the same period, attackers have become ever more effective at concealing their activities. Cloaking techniques such as steganography have rendered traditional signature- and threshold-based detection practically useless.

In response, the security industry has seen new demand for User Behavior Analytics (UBA), which looks for patterns of activity and statistically significant deviations of user behavior (app usage, file search activity) from historical baselines.

I’m often asked what makes UBA different from traditional SIEM-based approaches.

Know Thy Behavioral History

In my mind, the answer is history! You know the old saying, “those who don’t remember the past are condemned to repeat it”? That applies to a pure SIEM-based approach, which looks at (pun intended) current events: files deleted or copied, failed logins, malware signatures, or excessive connection requests from an IP address.

Of course, you need to look at raw events, but without context, SIEM-based stats and snapshots are an unreliable signal of what’s really happening. We call these “false positives”: the SIEM system indicates an alert when there isn’t anything really wrong. At some point, you end up continually chasing the same false leads or, even worse, ignoring them altogether and going “dial-tone deaf”.

How many deleted or copied files are too many? How many failed logins are unusual for that particular user? And when does a user’s visit to a rarely accessed folder become suspicious?

The key decision for any event notification is the right threshold to separate normal from abnormal.

Often there are tens, if not hundreds or thousands, of applications and user access patterns, each with a unique purpose and its own set of thresholds, signatures, and alerts to configure and monitor. A brute-force approach results in rules based not on past data but on ad hoc, it-feels-right settings that generate endless reports and blinking dashboards, requiring a team of people to sift out the “fake news”.

This dilemma over how to set a threshold has led security researchers to a statistical approach, where thresholds are based on an analysis of real-world user behaviors.

The key difference between UBA and monitoring techniques that rely on static thresholds is that the decision to trigger is guided by mathematical models and statistical analysis, which are better able to spot true anomalies and ultimately reduce false positives. Some examples of behavioral alerts:

  • Alert when a user accesses data that has rarely been accessed before, at a time of day that’s unusual for that user (say, 4 AM on a Sunday), and then emails it to an ISP based in Croatia
  • Alert when a user shows a pattern of failed login events over time that falls outside their normal behavior (see the sketch after this list)
  • Alert when a user copies files from another user’s home directory, and then moves those files to a USB drive
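
To make this concrete, here’s a minimal sketch in Python of how the second rule might be expressed. The Baseline fields and the two-standard-deviation cutoff are illustrative assumptions on my part, not any particular product’s API.

```python
# Illustrative sketch only: the Baseline fields and the two-standard-deviation
# cutoff are assumptions, not a specific UBA product's API.
from dataclasses import dataclass

@dataclass
class Baseline:
    mean_failed_logins: float    # this user's historical daily mean
    stddev_failed_logins: float  # this user's historical standard deviation

def is_anomalous(failed_logins_today: int, baseline: Baseline) -> bool:
    """Flag the day when the count sits well outside this user's history."""
    cutoff = baseline.mean_failed_logins + 2 * baseline.stddev_failed_logins
    return failed_logins_today > cutoff

# A user who normally fails about one login a day (give or take one) fails 9:
print(is_anomalous(9, Baseline(1.0, 1.0)))  # True
```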

A Simple UBA Example

The reason UBA is so effective is that it doesn’t depend only on signature- or static-threshold-based analytics.

Let’s break this down with an example.

At Acme Inc., the security team has been asked to monitor the email activity of all of its 1,000 employees. Impossible, no?

We can understand the larger problem by focusing on just 5 users (0.5% of all users). First, we apply traditional analytics and review their email activity (below) over the course of a week.

User    Monday  Tuesday  Wednesday  Thursday  Friday
Andy    10      8        30         15        13
Molly   15      29       55         33        90
Ryan    35      6        7          15        16
Sam     2       5        4          9         15
Ivan    9       1        3          5         0

Looking at this report, you might decide to investigate the users who sent the most emails, right?

You quickly learn that Molly, who sent 90 emails on Friday, is with the marketing team, and her performance is based on how many customers she emails in a day. False lead!

You then decide to take the average across all the users for each day of the week. You craft a static threshold alert that fires whenever a user sends more emails than the average for a given day. For the data set above, the average number of emails sent by a user on any given day is 17.

If you created an alert for any time a user sends more than 17 emails in a day, you would’ve received 6 alerts during this time frame. Four of those alerts would bring you right back to Molly, the queen of email.
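
For concreteness, here’s a short Python sketch of that static-threshold rule applied to the table above; it reproduces the average of roughly 17 (17.2, exactly) and the 6 alerts, 4 of them Molly’s.

```python
# Static-threshold approach: one grand average across all users and days,
# then alert on any count above it.
EMAILS = {
    "Andy":  [10, 8, 30, 15, 13],
    "Molly": [15, 29, 55, 33, 90],
    "Ryan":  [35, 6, 7, 15, 16],
    "Sam":   [2, 5, 4, 9, 15],
    "Ivan":  [9, 1, 3, 5, 0],
}

all_counts = [n for counts in EMAILS.values() for n in counts]
threshold = sum(all_counts) / len(all_counts)  # 17.2 -- the ~17 in the text

alerts = [(user, day, n)
          for user, counts in EMAILS.items()
          for day, n in enumerate(counts)
          if n > threshold]

print(round(threshold, 1))  # 17.2
print(len(alerts))          # 6 alerts, 4 of which are Molly's
```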

This threshold is obviously too sensitive. You need a different strategy than a raw average for all users on a given day (the vertical columns).

UBA’s anomaly detection algorithm looks at each user, each day, and records information about their activity. This historical information, sliced by day, time, and other dimensions, is stored in the system so that baseline statistics can be created.

Think of it as the UBA tool running the reports, figuring out the averages and standard deviations for each user, comparing them to their peers, and over time escalating only those users and activities that ‘stand out from the crowd’. UBA also recalculates averages, standard deviations, and other stats dynamically over time, so that they reflect shifts in the historical trends.
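
As a sketch of that bookkeeping, here’s an online baseline in Python that updates a user’s mean and standard deviation one observation at a time (Welford’s algorithm). A real UBA system would slice this by day, time, peer group, and so on, but the mechanics are similar. Fed Andy’s week from the table, it lands on a mean of 15.2 and a standard deviation of about 7.8, which the table below rounds to 15 and 7.

```python
# Online (running) baseline via Welford's algorithm: mean and standard
# deviation update incrementally, so the baseline drifts with behavior.
import math

class UserBaseline:
    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, count: float) -> None:
        self.n += 1
        delta = count - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (count - self.mean)

    @property
    def stddev(self) -> float:
        # Population standard deviation of everything seen so far.
        return math.sqrt(self._m2 / self.n) if self.n > 0 else 0.0

baseline = UserBaseline()
for count in [10, 8, 30, 15, 13]:  # Andy's week
    baseline.update(count)
print(round(baseline.mean, 1), round(baseline.stddev, 1))  # 15.2 7.8
```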

For example, here’s a possible behavioral rule: alert when a user’s email-sending activity deviates from their baseline of normal behavior.

This could be translated more precisely as ‘notify when a user is two or more standard deviations away from their mean’.

If we adopt this rule, we analyze the horizontal rows (in this case, I used a convenient online statistical calculator) to come up with 2 alerts, though none for Molly.

User    Monday  Tuesday  Wednesday  Thursday  Friday  Average  STDDEV  AVG + 2SD
Andy    10      8        30         15        13      15       7       29
Molly   15      29       55         33        90      44       24      92
Ryan    35      6        7          15        16      16       10      35
Sam     2       5        4          9         15      7        4       15
Ivan    9       1        3          5         0       4        3       9
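
Here’s that per-user computation as a Python sketch, using the population standard deviation. One caveat: the table above rounds its figures, and the 2 alerts (Andy’s 30 against a rounded cutoff of 29, Ryan’s 35 against 35) only clear the bar because of that rounding; with exact arithmetic those borderline days sit just inside two standard deviations. That sensitivity to exactly where the line is drawn is itself part of the lesson.

```python
# Per-user rule: flag any day more than two standard deviations above that
# user's own weekly mean (see the rounding caveat described above).
import statistics

EMAILS = {  # the same weekly counts as in the tables above
    "Andy":  [10, 8, 30, 15, 13],
    "Molly": [15, 29, 55, 33, 90],
    "Ryan":  [35, 6, 7, 15, 16],
    "Sam":   [2, 5, 4, 9, 15],
    "Ivan":  [9, 1, 3, 5, 0],
}

for user, counts in EMAILS.items():
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts)  # population standard deviation
    cutoff = mean + 2 * sd
    spikes = [n for n in counts if n > cutoff]
    print(f"{user:5} mean={mean:5.1f} sd={sd:4.1f} cutoff={cutoff:5.1f} spikes={spikes}")
```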

Obviously, this is not what’s done in practice; there are better statistical tests and more revealing analyses that can be performed.

The more important point is that by looking at users or a collection of users within, say, the same Active Directory groups, UBA can more accurately find and escalate true anomalies.
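
As a closing sketch, here’s what that peer-group comparison might look like: instead of judging Molly against the whole company, judge her against her marketing teammates. The group and its numbers are invented for illustration.

```python
# Peer-group comparison: Molly's Friday burst is ordinary among (hypothetical)
# marketing peers, even though it dwarfs the company-wide average of ~17.
import statistics

marketing = {"Molly": 90, "Dana": 85, "Lee": 100}  # Friday counts, invented

user, count = "Molly", marketing["Molly"]
peer_counts = [n for name, n in marketing.items() if name != user]
peer_mean = statistics.mean(peer_counts)
peer_sd = statistics.stdev(peer_counts)

# Within two standard deviations of her peers' mean -> no alert.
print(count <= peer_mean + 2 * peer_sd)  # True
```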