Zeek Log Analysis Using Hacky Scripts
Zeek is a traffic analysis tool that can provide a detailed record of the connections flowing through a network. This makes it extremely useful for identifying command and control (C2) channels as well as data exfiltration. In this blog post, I want to show you some of the scripts I use to automate the common queries I run when hunting through Zeek logs for suspicious traffic patterns.
Where to Get These Scripts
If you want a copy of the scripts I’m about to cover, you can download them from here:
https://thunt-level1.s3.amazonaws.com/beacon-scripts.tar.gz
Simply decompress the archive and copy the files to a directory in your path (usually one of the “bin” directories).
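If you would rather script that step, here is a minimal sketch. The function name install_scripts and the default target of $HOME/bin are my own placeholders, and because I am not assuming anything about how the archive is laid out internally, it simply copies every regular file it finds inside.

```shell
# Sketch: unpack a tarball of scripts and copy them into a directory on PATH.
# The archive layout is an assumption, so every regular file inside is copied.
install_scripts() {
    archive=$1                  # e.g. beacon-scripts.tar.gz
    dest=${2:-"$HOME/bin"}      # hypothetical default; use any dir on your PATH
    tmp=$(mktemp -d) || return 1
    tar -xzf "$archive" -C "$tmp" || return 1
    mkdir -p "$dest"
    # Mark the extracted files executable, then copy them with modes preserved.
    find "$tmp" -type f -exec chmod +x {} \;
    find "$tmp" -type f -exec cp -p {} "$dest"/ \;
    rm -rf "$tmp"
}
```

Usage would look like: install_scripts beacon-scripts.tar.gz ~/bin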
Tweaking the Scripts
Please note that the scripts are designed to work with uncompressed Zeek logs in Zeek’s default tab-separated (TSV) format. If your logs are compressed, you will want to edit the scripts and replace “cat” with “zcat”. If your logs are in JSON format, you will want to use zcutter instead of zeek-cut.
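If your log directory holds a mix of compressed and uncompressed files, one alternative to editing each script is a small helper that picks the right reader per file. This is only a sketch: read_logs is a name I made up, and it assumes compressed logs end in .gz.

```shell
# Sketch: write log files to stdout, decompressing only the gzip'd ones.
# Assumes compressed logs carry a .gz extension.
read_logs() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;   # same effect as zcat
            *)    cat -- "$f" ;;
        esac
    done
}
```

With that in place, a line such as "cat conn.* | zeek-cut ..." becomes "read_logs conn.* | zeek-cut ...".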
fq = Collecting DNS Information
The “fq” script is useful when you want to see what Zeek knows about a specific IP address or Fully Qualified Domain Name (FQDN). You can even cross-reference this info across multiple files. Here’s a copy of the script:
echo 'DNS info'
cat dns.* | zeek-cut answers query | sort | uniq | grep -Fw "$1"
echo 'HTTP info'
cat http.* | zeek-cut id.resp_h host | sort | uniq | grep -Fw "$1"
echo 'TLS info'
cat ssl.* | zeek-cut id.resp_h server_name validation_status | sort | uniq | grep -Fw "$1"
Note that it accepts a single IP address or FQDN as an argument and looks for it in the DNS, HTTP, and TLS logs. Here’s an example of its use:
The fq script searches the DNS log files to see if the IP address was part of any DNS queries. This helps to identify what system the user was trying to access when they ended up connecting to this IP address. We then search the HTTP and TLS log files to see if there was any related activity with this IP address. In this case, there were no HTTP connections but there were HTTPS connections. Further, we can see that the Server Name Indication (SNI) field matches the DNS query, and the digital certificate checks out as “ok”. Since the DNS and SNI info match, and the digital certificate is valid, we can be relatively certain that the system name is legitimate.
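The matching in fq depends on two grep flags: -F treats the search term as a literal string, so the dots in an IP address are not regex wildcards, and -w only accepts whole-word matches. The sketch below isolates that behavior. The names match_indicator and sample are hypothetical, and the sample lines are made-up illustration data, not real Zeek output.

```shell
# Sketch: fixed-string, whole-word matching as used by fq.
# Reads candidate lines on stdin; $1 is the IP address or FQDN to look for.
match_indicator() {
    grep -Fw -- "$1"
}

# Illustrative data only (not real log output).
sample() {
    printf '10.0.0.1\thost-a.example\n'
    printf '10.0.0.10\thost-b.example\n'
    printf '10a0b0c1\tnot-an-ip\n'
}
```

Running "sample | match_indicator 10.0.0.1" returns only the first line. A plain "sample | grep 10.0.0.1" returns all three, because each unescaped dot matches any character and nothing stops the pattern from matching inside 10.0.0.10.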
The script also works when querying FQDNs:
In this case, the FQDN resolves to multiple IP addresses. Further, we can see that both HTTP and HTTPS (TLS) connections were made to this system.
Plotting Beacons
Imagine you are going through RITA data and you spot an entry similar to this one:
RITA is 100% certain that this connection pair is a beacon. While this is good to know, what if I want to see a map of the connection frequency between these two IP addresses over a 24-hour period? While AC-Hunter has that ability, RITA does not. Luckily we can solve this problem with a script! 🙂
beacon-conn
Here’s a copy of the beacon-conn script (note that all of these commands are on a single line):
cat conn.* | zeek-cut -d ts id.orig_h id.resp_h | grep -Fw "$1" | grep -Fw "$2" | sed 's/T/:/g' | cut -d ':' -f 2 | uniq -c | tr -s " " | awk '{ print $2 " " $1}'
We are extracting data from Zeek’s conn.log files. The conn.log file has a record of every connection seen by Zeek, so regardless of the service or port being used we will have data in these logs that we can leverage. The script accepts two arguments, which are the two IP addresses in the connection pair. The script then breaks out the number of connections that take place each hour and prints that to the screen. Here’s an example:
The first column is the hour of the day and the second column is the number of connections that took place during that hour.
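To make the hour bucketing easier to follow, here is the tail end of that pipeline pulled out into a function. It is a slightly simplified sketch (awk handles the leading spaces, so tr is dropped), hourly_counts is a name I chose, and it assumes the input lines begin with the ISO 8601 timestamp that zeek-cut -d prints.

```shell
# Sketch of beacon-conn's bucketing stage. Expects lines that start with an
# ISO 8601 timestamp (as printed by zeek-cut -d), already filtered down to one
# connection pair and in time order.
hourly_counts() {
    sed 's/T/:/' |                  # 2024-01-01T00:05:00 -> 2024-01-01:00:05:00
        cut -d ':' -f 2 |           # keep only the hour field
        uniq -c |                   # count runs of consecutive identical hours
        awk '{ print $2 " " $1 }'   # print "hour count"
}
```

Keep in mind that uniq -c only merges adjacent lines, so the input needs to be in time order. If the logs span more than one day, the same hour will show up once per day rather than being added together.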
Note that with only a few exceptions, we are seeing about 835 connections per hour. It’s because of this consistent frequency that RITA reported it was 100% certain this connection pair was a beacon. In fact, because the quantity varies a bit each hour, the beacon is most likely using jitter.
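If you want a quick number for how steady the hourly counts are, you can pipe the two-column output into a small awk summary. This is a sketch of my own rather than one of the downloadable scripts, and summarize_counts is a hypothetical name.

```shell
# Sketch: summarize "hour count" lines (the format beacon-conn prints).
summarize_counts() {
    awk '
        { n++; sum += $2 }                   # running total of the count column
        n == 1 || $2 < min { min = $2 }      # track the quietest hour
        n == 1 || $2 > max { max = $2 }      # track the busiest hour
        END { if (n) printf "hours=%d min=%d max=%d mean=%.1f\n", n, min, max, sum / n }
    '
}
```

For example, feeding it three made-up lines "00 830", "01 840", and "02 835" prints "hours=3 min=830 max=840 mean=835.0". A narrow gap between min and max relative to the mean is the kind of pattern you would expect from a beacon with a small amount of jitter.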
beacon-sni
The beacon-conn script is useful for analyzing IP based beacons, but what if the target is an FQDN located behind a CDN server? Luckily, we have a script for that as well! The beacon-sni script will use the SNI data in HTTPS connections as the external endpoint. This will let us map beacons over multiple IPs. As an example, let’s check out www.alexa.com, which we saw earlier maps to multiple IP addresses.
This script shows the number of connections per hour regardless of what target IP address was used. This takes all of the work out of trying to plot beacons through load balancers or CDNs.
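The beacon-sni script itself is not reproduced in this post, so the following is only my sketch of the idea, not the actual script. It assumes the input is tab-separated ts, id.orig_h, id.resp_h, and server_name columns (for example from "cat ssl.* | zeek-cut -d ts id.orig_h id.resp_h server_name"), and the function name sni_hourly_counts is hypothetical.

```shell
# Sketch (not the actual beacon-sni script): hourly connection counts keyed on
# the SNI value instead of the destination IP.
# stdin: tab-separated ts, id.orig_h, id.resp_h, server_name
# $1 = internal IP address, $2 = server name to match exactly
sni_hourly_counts() {
    awk -F '\t' -v src="$1" -v sni="$2" \
        '$2 == src && $4 == sni { print $1 }' |   # keep timestamps of matches
        sed 's/T/:/' |
        cut -d ':' -f 2 |                         # hour of the day
        sort |
        uniq -c |
        awk '{ print $2 " " $1 }'                 # print "hour count"
}
```

Because the destination IP column is never used in the filter, connections to different IP addresses behind the same server name all land in the same hourly buckets.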
beacon-host
If the connection pair is using HTTP, we can use the beacon-host script to map the connection pairs across multiple IP addresses. Here’s an example:
Summary
It should be noted that none of the scripts I covered here needed any real programming ability. I simply took the command that I would normally run from the terminal, put it inside an easy to remember script name, and then added variables so that different IPs and FQDNs could be passed to the command. With this in mind, if you have commands that you frequently use as part of your analysis, consider using a similar process to automate them!
Chris has been a leader in the IT and security industry for over 20 years. He’s a published author of multiple security books and the primary author of the Cloud Security Alliance’s online training material. As a Fellow Instructor, Chris developed and delivered multiple courses for the SANS Institute. As an alumnus of Y-Combinator, Chris has assisted multiple startups, helping them to improve their product security through continuous development and identifying their product market fit.