Simplifying Beacon Analysis through Big Data Analysis
One of the unique differentiators of AI-Hunter is that we are constantly analyzing data in 24-hour blocks. Unlike other tools that simply look at a moment in time, AI-Hunter scrutinizes all Internet traffic over an entire day. Given the amount of overhead involved with this type of analysis, you may be wondering why we went down this path. In this blog post I’ll talk about why it’s so important to have the context of a full day’s worth of data when you are hunting for command and control activity.
Threat Hunting – A Quick Analogy
Think of threat hunting as being similar to reading a book. If I only read a single page of the book, I may still be able to derive some basic assumptions about the story. For example, there may be enough information to identify the time period, the genre, and a few characters. However, I will miss all of the context that pulls the full story together. The one page is just a small piece of the story, not the complete picture. The same is true with network traffic. If I look at a single packet, or even a single session, I’m only getting part of the story. I can derive some assumptions, but I may miss the full context of the conversation.
Beacon Detection – A Case Study
As an example, let’s assume we have a Remote Access Trojan (RAT) that is calling home to its command and control (C&C) server every 60 seconds. Let’s further assume that in order to avoid detection, our attacker is varying the timing of the signals by +/- 50%. So between any two beacon signals, the sleep time may be as little as 30 seconds, or as long as 90 seconds.
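To make the timing concrete, here is a minimal sketch of how such a RAT might pick its sleep times. It assumes the jitter is drawn uniformly between 30 and 90 seconds, which is an illustrative choice and not something any particular malware family is known to do; the function name and constants are hypothetical.

```python
import random

BASE_SLEEP_S = 60.0   # nominal beacon interval from the example above
JITTER = 0.5          # +/- 50% variance

def jittered_sleeps(n, base_s=BASE_SLEEP_S, jitter=JITTER, seed=0):
    """Return n sleep times in seconds (assumption: uniform jitter)."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    low = base_s * (1 - jitter)
    high = base_s * (1 + jitter)
    return [rng.uniform(low, high) for _ in range(n)]

sleeps = jittered_sleeps(10)
print([round(s) for s in sleeps])
```

Every value lands somewhere between 30 and 90 seconds, so no two consecutive gaps need to look alike.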
IDS Detection of Beacon Activity
Now, let’s assume we are using a tool such as an IDS to detect malicious activity. An IDS will reassemble each individual session and try to spot suspicious activity. Because an IDS has no memory from one session to the next, it has no ability to identify that each session is contributing to an overall beaconing behaviour. Unless there is something unique in every session for us to pattern match against, the beacon activity will go undetected because we are missing the context of all of the other sessions.
Small Time Window Detection of Beacon Activity
As another example, let’s say we are using a tool that has the ability to remember 10 minutes’ worth of session data. This gives the tool the ability to provide some context between sessions. Given the 30 to 90 second time variance, there may be as many as 20 or as few as 6 data points to work with in order to identify the beaconing behaviour. This is going to be extremely challenging, as normal traffic will appear in this range as well. For example, connecting to a Web site that supports only HTTP/1.0 could easily cause 6-20 sessions to be created. This makes our chances of triggering false positives extremely high. So our choices are to ignore the alerts or put in place whitelists that may permit an attacker to sneak through.
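The 6 and 20 figures come from dividing the window by the longest and shortest possible sleep times. A quick sketch of that arithmetic, using a hypothetical helper name:

```python
def data_point_bounds(window_s, min_sleep_s=30, max_sleep_s=90):
    """Return (fewest, most) beacons that fit in a window of window_s seconds."""
    fewest = window_s // max_sleep_s  # every gap is as long as possible
    most = window_s // min_sleep_s    # every gap is as short as possible
    return fewest, most

print(data_point_bounds(10 * 60))  # (6, 20)
```

The same helper can be reused for any window size by passing the window length in seconds.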
Big Data Detection of Beacon Activity
Now let’s look at that same situation but with 24 hours’ worth of data. The larger time block gives us anywhere from 960 to 2,880 data points to work with (86,400 seconds divided by 90 and by 30 seconds, respectively). This gives us far greater fidelity to spot beacons as well as weed them out from potential false positives. In fact, it opens the door to doing some really cool data manipulation.
We mentioned that our attacker is varying the timing on the beacon signal between 30 and 90 seconds. If I take those beacon signals and organize them into 2-hour buckets, I can average out the variance. So a 30 second beacon and a 90 second beacon now average out to two 60 second beacons. This means that I’m going to see about 120 beacon signals in each 2-hour bucket. If I plot those buckets over the 24-hour period, I end up with a fairly flat line at 120 beacons per bucket. That flat line is what tells me that I’m looking at a beacon versus normal user traffic. So despite the attacker taking steps to hide their tracks, we can normalize that out.
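Here is a rough sketch of the bucketing idea, run against simulated data. It is not how AI-Hunter implements its analysis. The uniform jitter, the function names, and the use of the coefficient of variation as a "flatness" score are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

BUCKET_S = 2 * 60 * 60   # 2-hour buckets
DAY_S = 24 * 60 * 60     # 24-hour analysis window

def simulate_beacon(duration_s=DAY_S, base_s=60.0, jitter=0.5, seed=1):
    """Simulated connection times in seconds (assumption: uniform jitter)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.uniform(base_s * (1 - jitter), base_s * (1 + jitter))
        if t >= duration_s:
            return times
        times.append(t)

def bucket_counts(times, bucket_s=BUCKET_S, duration_s=DAY_S):
    """Count how many connections fall into each fixed-size bucket."""
    counts = [0] * (duration_s // bucket_s)
    for t in times:
        counts[int(t // bucket_s)] += 1
    return counts

def flatness(counts):
    """Coefficient of variation (stdev / mean); lower means a flatter line."""
    m = mean(counts)
    return pstdev(counts) / m if m else float("inf")

counts = bucket_counts(simulate_beacon())
print(counts)                      # 12 buckets, each close to 120
print(round(flatness(counts), 3))  # small value: a nearly flat line
```

Normal user traffic tends to cluster around working hours and go quiet overnight, so its per-bucket counts would swing widely and produce a much larger score than the jittered beacon does.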
Performance With Big Datasets
I’m not going to lie, when we first went down this path we had issues with scale. Processing all of the data created by a 1 Gb link, even once per day, was a real challenge. Thanks to an amazing engineering team, we were able to optimize and scale up our analysis. The result is that we have customers pushing hundreds of millions of connections per day, and we can still keep up with analyzing that data in near real time. This lets us perform a deep analysis on huge datasets while still keeping up with high speed links.
Lessons Learned
Beacon traffic, on a per session basis, can look identical to normal traffic patterns. This makes it difficult to spot with most modern day security tools. By analyzing 24-hour blocks of packet data, it becomes much easier to identify beacons versus normal traffic patterns, even when an attacker tries really hard to hide their tracks. If you want to ensure you are catching beacons leaving your environment, you really need to be sifting through large batches of packet data.
Chris has been a leader in the IT and security industry for over 20 years. He’s a published author of multiple security books and the primary author of the Cloud Security Alliance’s online training material. As a Fellow Instructor, Chris developed and delivered multiple courses for the SANS Institute. As an alumnus of Y Combinator, Chris has assisted multiple startups, helping them to improve their product security through continuous development and to identify their product market fit.