Building a Global Ignore Filter
In a recent blog post (“Filtering out high volume traffic”) we talked about ways to stop capturing network traffic that’s 1) high volume, 2) only on a small number of ports and addresses, and 3) generally trusted, ignorable, and unlikely to be malicious. Stripping these high-volume flows out of your captured packet stream reduces the processing, memory, and storage needed to analyze and keep it, all with a very low chance of missing something important. This can be a big performance boost when you’re using an in-depth packet analyzer, listening to a network connection with a huge amount of traffic, or both; this filtering can make the difference between keeping up with the traffic and dropping an unacceptable number of packets.
If you haven’t already read that post, this would be a good time to read it as we won’t re-cover the core concepts here.
That blog covered these high-volume streams one at a time; we want to go to the next level and build up a global filter of the majority of your high-volume traffic. This will allow us to strip out a large percentage of the packets in all your packet capture tools.
Building a Global Ignore Filter
Our filter will be describing known good traffic – high volume traffic flows that we’re reasonably confident are both expected and unlikely to be malicious. To use this file most easily, the filter will be essentially this:
( not (
    (udp and (
        ((port firstport) and (host hostA))
        or ((port secondport) and (host hostB or hostC))
        or ((port thirdport) and (host hostD or hostE))
        or ((port fourthport) and (host hostF))
        ...
    ))
    or
    (tcp and (
        ((port fifthport) and (host hostA))
        or ((port sixthport) and (host hostB or hostC))
        or ((port seventhport) and (host hostD or hostE))
        or ((port eighthport) and (host hostF))
        ...
    ))
))
Each ((port….) and (host….)) line describes a particular traffic flow that we no longer wish to process because we’re confident it’s benign. We can use any mix of “host a.b.c.d” and “net x.y/16” style blocks.
Note that the file has “not” surrounding everything else. Why? When I start up a tool using this filter, I immediately tell it “I want you to process everything except these high-volume trusted flows.” That means my tools never even see these high-volume flows, allowing them to focus instead on the remainder, freeing up a lot of processing power and memory to focus on less trustworthy traffic.
Working With the File
It’s helpful to have a fixed location for the most current production master filter file. This allows us to reference one file in any scripts that start packet capture tools. I’m using $HOME/med/bpf/global-ignore.commented.bpf, so I’ll need to create that directory first with
mkdir -p $HOME/med/bpf/
(Note that I use “$HOME” instead of “~”; $HOME is more likely to work in shell scripts.)
Now I can edit that file by opening “$HOME/med/bpf/global-ignore.commented.bpf” in my favorite editor (see Template if you’re starting from scratch). The editor should save the file with Unix linefeeds instead of the Windows CR/LF. Also, it’s really helpful if that editor recognizes parentheses pairs so that when your cursor is on a parenthesis the editor highlights the other end. This can help you recognize an easy typo of mismatched parentheses. It’s still worth typing carefully – a single typo can stop the sniffer program from running at all.
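If your editor doesn’t highlight matching parentheses, a rough check from the shell can catch the most common typo. The sketch below is only an illustration: check_parens is a made-up helper name, and it merely counts opening and closing parentheses on the non-comment lines; it can’t tell whether they’re in the right places.

```shell
#!/bin/sh
# check_parens FILE  (hypothetical helper, not part of any tool)
# Counts "(" and ")" on the non-comment lines of a commented filter file.
check_parens() {
    file="$1"
    # tr -cd keeps only the named character; $(( )) trims any padding
    # that some versions of wc put in front of the number.
    opens=$(( $(grep -v '^#' "$file" | tr -cd '(' | wc -c) ))
    closes=$(( $(grep -v '^#' "$file" | tr -cd ')' | wc -c) ))
    if [ "$opens" -eq "$closes" ]; then
        echo "balanced: $opens pairs"
    else
        echo "unbalanced: $opens opening, $closes closing"
        return 1
    fi
}
```

You could run it as: check_parens $HOME/med/bpf/global-ignore.commented.bpf . A “balanced” result doesn’t prove the filter is valid, so the tcpdump test later in this post is still worth running.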
Comments in the File
Unfortunately, BPF filter expressions have no place for comments. That’s a shame – a long BPF filter that addresses lots of traffic types and destinations becomes almost unmanageable without them (“Just what was this address?”, I ask myself a year after it was added…).
We’ll step around that by adding them anyway, and we’ll remove them just before the filter is used. To do that, all comments will start with a “#” character, and this must be the first character on the line. Here’s an example of a few commented lines in our file:
#Make sure all comments start with a # in the far left column
( not (
    (udp and (
#VPN traffic
        ((port firstport) and (host hostA))
#Videoconferencing to provider2
        or ((port secondport) and (host hostB or host hostC))
#More videoconferencing
        or ((port thirdport) and (host hostD or host hostE))
#API requests and replies to hostF
        or ((port fourthport) and (host hostF))
        ...
    ))
    or
    (tcp and (
#Windows patching
        ((port fifthport) and (host hostA))
#Overnight backups
        or ((port sixthport) and (host hostB or host hostC))
#SQL Replication links
        or ((port seventhport) and (host hostD or host hostE))
#Traffic to main web server
        or ((port eighthport) and (host hostF))
        ...
    ))
))
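Because a “#” anywhere other than the first column would be left in the filter, it can be worth scanning the file for misplaced comments. This is a small sketch under that assumption; find_misplaced_comments is a hypothetical helper name.

```shell
#!/bin/sh
# find_misplaced_comments FILE  (hypothetical helper)
# Prints every line that contains "#" but does not start with it, such as
# an indented comment or a comment added at the end of a filter line.
find_misplaced_comments() {
    awk '!/^#/ && /#/ { printf "%s:%d: %s\n", FILENAME, NR, $0 }' "$1"
}
```

No output means every comment starts in the far left column. Any line it prints needs its comment moved onto a line of its own.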
This makes all the difference in the world for maintainability. In addition to the primary reason for the filter, your comments can include a datestamp, the initials of the person adding it, and/or a link to a ticket tracking the addition.
To learn what traffic types (ports and IP addresses) might be good candidates to include in this file, please refer back to the “Filtering out High Volume Traffic” blog post. That post will walk you through identifying these streams so you can add them to this file.
Removing the Comments for Use by a Program
Since our programs don’t like the comments, we remove them like this:
tcpdump -qtnp -i eth0 "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
tcpdump wants its filter as the last thing on the command line, surrounded by quotes. Inside the quotes, we use $(……..) to say “Run the program inside the parentheses. Whatever that program spits out, put that text here (inside the quotes).” The grep command reads the entire file and throws away (the “-v” option) all lines that start with a “#” symbol (The caret in ‘^#’ says the # must be at the beginning of the line). The output of the grep command will end up being “all lines that do not start with #”.
The end effect of this is that the command we end up running looks like this
tcpdump -qtnp -i eth0 "( not (
    (udp and (
        ((port firstport) and (host hostA))
        or ((port secondport) and (host hostB or hostC))
        or ((port thirdport) and (host hostD or hostE))
        or ((port fourthport) and (host hostF))
    ))
    or
    (tcp and (
        ((port fifthport) and (host hostA))
        or ((port sixthport) and (host hostB or hostC))
        or ((port seventhport) and (host hostD or hostE))
        or ((port eighthport) and (host hostF))
    ))
))"
Note that we must use double quotes around the $(…..) because we want to run a program to spit out the filter to use. If we’d used single quotes, we would have handed the literal string
$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)
to tcpdump as its filter, which isn’t what we wanted.
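To see the quoting difference without involving tcpdump at all, here is a small demonstration. It writes a throwaway stand-in filter file (the file and its contents are invented for the example) and prints what each style of quoting produces.

```shell
#!/bin/sh
# Throwaway stand-in for the real filter file; contents are made up.
demo=$(mktemp)
printf '#DNS to resolver\n(not (udp and port 53))\n' > "$demo"

# Double quotes: the shell runs grep and substitutes its output.
expanded="$(grep -v '^#' "$demo")"
# Single quotes: nothing is run; the text is kept exactly as typed.
literal='$(grep -v "^#" "$demo")'

printf 'double quotes give: %s\n' "$expanded"
printf 'single quotes give: %s\n' "$literal"
rm -f "$demo"
```

The first line printed holds the filter text; the second holds the unexpanded command, which is what a sniffer would be handed if single quotes were used.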
Testing That the Filter Works
First, let’s check that the filter file is formatted correctly. We’ll try to read a sample pcap file with the filter:
tcpdump -qtnp -r sample.pcap "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)" >/dev/null
If there’s a formatting error, tcpdump will report that it was unable to apply the filter and exit. If all is good, you’ll be returned to a command prompt.
Now, let’s see if the filter successfully removes traffic. We’ll assume sample.pcap contains at least some high-volume traffic that your filter will discard. First, process the pcap file and see how many lines of output are shown – the number of packets processed:
tcpdump -qtnp -r sample.pcap | wc -l
reading from file sample.pcap, link-type EN10MB (Ethernet), snapshot length 262144
123664
In our example, tcpdump was handed 123,664 packets and showed a one-line summary of each (“wc -l” shows the number of lines seen on input).
Now, reprocess that same pcap file, but apply the filter:
tcpdump -qtnp -r sample.pcap "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)" | wc -l
reading from file sample.pcap, link-type EN10MB (Ethernet), snapshot length 262144
83428
Now tcpdump was only handed 83,428 packets. By applying the filter, libpcap read all packets out of the sample.pcap file, but only handed 83,428 of them up to tcpdump to process. The “missing” 40,236 packets were discarded by the filter and were never given to tcpdump to process at all. Because BPF is so efficient, the second tcpdump will take less time to complete than the first. The time difference becomes even more noticeable when you use a less efficient sniffer program or process larger pcap files.
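If you’d like the savings as a percentage, the arithmetic is easy to script. This is a sketch only; percent_discarded is a hypothetical helper name, and the two numbers passed to it are the packet counts from the example above.

```shell
#!/bin/sh
# percent_discarded TOTAL KEPT  (hypothetical helper)
# Prints the whole-number percentage of packets the filter removed.
percent_discarded() {
    total="$1"
    kept="$2"
    if [ "$total" -le 0 ]; then
        echo "total must be greater than zero" >&2
        return 1
    fi
    # Shell arithmetic is integer-only, so the result is rounded down.
    echo $(( (total - kept) * 100 / total ))
}

# Counts from the example above: 123664 packets before, 83428 after.
percent_discarded 123664 83428    # prints 32
```

In this example the filter removed roughly a third of the packets. To feed in your own counts, run the two “wc -l” commands above and pass their results as the two arguments.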
Using the Global Ignore Filter
General Use by libpcap Applications
The packet capture library used by all Linuxes and some other operating systems is called libpcap. With very few exceptions, almost every packet sniffer used in Linux and MacOS calls down to this one library to do the actual packet capture. Because libpcap (and the operating system kernel) handle this filtering internally, those sniffers all have the ability to apply a filter – like the one we’re building.
The filter is only read when the program first starts. That means that if we make a change to the filter file (global-ignore.commented.bpf), any programs we’ve already started will continue to use the old filter until we restart them.
We’ve already seen an example of running tcpdump with our filter:
tcpdump -qtnp -i eth0 "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
Note that using this filter is independent of all the other command-line options. It’ll work if we’re reading packets from a file:
tcpdump -qtnp -r file.pcap "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
It also works when using different command-line options:
tcpdump -vvX -i eth0 "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
The same holds when changing any of the other modes of operation. The BPF at the end just says “Do what you would have done before, but only inspect some of the packets – don’t even show me the ones discarded by the filter.”
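Since the same “grep -v” expression shows up in every command, a script that starts several sniffers might wrap it in a small function. The sketch below assumes the file location used in this post; GLOBAL_BPF_FILE and global_bpf are hypothetical names chosen for the example.

```shell
#!/bin/sh
# GLOBAL_BPF_FILE and global_bpf are hypothetical names for this sketch.
# The default path is the one used in this post; override it if needed.
GLOBAL_BPF_FILE="${GLOBAL_BPF_FILE:-$HOME/med/bpf/global-ignore.commented.bpf}"

# Print the filter with its comment lines removed.
global_bpf() {
    grep -v '^#' "$GLOBAL_BPF_FILE"
}
```

With that in place, a capture command in the same script could be written as: tcpdump -qtnp -i eth0 "$(global_bpf)" . The double quotes are still required.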
pcap_stats (https://github.com/activecm/pcap-stats), the program we use to identify high-volume streams, will gladly use a filter. With a filter that contains all the high-volume streams we no longer wish to see, pcap_stats runs more quickly (as it doesn’t need to see all packets) and lets us focus on new high-volume streams (as we’ve already removed a significant number of them).
To apply a filter, we use the “-b” command line parameter and place the filter – in double-quotes – immediately after:
pcap_stats.py -r sample.pcap -m 500 -b "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
Unlike tcpdump which wants the filter at the end of the command after all other options, pcap_stats will accept the filter anywhere on the command line as long as it follows the -b option. The following two examples work exactly the same way as the command above:
pcap_stats.py -m 500 -b "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)" -r sample.pcap
pcap_stats.py -b "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)" -r sample.pcap -m 500
Most sniffing programs work like either tcpdump (that wants the filter at the end of the command line in quotes) or pcap_stats (that has a command-line parameter to say “the next thing on the command line is the filter to use.”) Ngrep uses tcpdump’s approach:
ngrep -q -I sample.pcap 'GET' "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
dnstop, passer, and tshark identify the filter with a command-line option:
dnstop -b "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)" sample.pcap
passer -r sample.pcap -b "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
tshark -i eth0 -f "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
The man page for the sniffer (like “man ngrep”) will commonly describe how to apply a filter; search down for “filter”, “bpf”, or “BPF”.
While BPF is an industry-wide standard for packet capture and processing, you should be aware that tshark (and its graphical big brother, Wireshark) can use BPFs for capture, but not for display. They use “display filters” for deciding what to show in an already-captured set of packets. There’s an intro to this on our blog at https://www.activecountermeasures.com/tshark-examples-theory-implementation/ and some more detail at https://www.wireshark.org/docs/man-pages/wireshark-filter.html.
When a Program Isn’t Run From the Shell or Needs a Static Filter String
If we don’t have a shell script in which we can run that handy “grep -v …” command, we’ll have to create the filter manually. Run the following command:
grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf | tr '\n' ' ' >$HOME/med/bpf/global-ignore.bpf
This removes the comments just like we’ve done in the past, and saves just the BPF part to a different file – global-ignore.bpf . We can now place the contents of this file anywhere a BPF is expected.
Remember to 1) run the above command anytime you change global-ignore.commented.bpf , and 2) load the new filter into any files that need it.
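To make the first reminder harder to forget, the regeneration step can be wrapped so that it only rebuilds the static file when the commented one has changed. This is a sketch, not a required part of the process; refresh_static_filter is a hypothetical helper name, and the “-nt” (newer than) test is available in bash, dash, and most other shells but isn’t strictly POSIX.

```shell
#!/bin/sh
# refresh_static_filter SRC DST  (hypothetical helper)
# Rebuilds DST from SRC when DST is missing or older than SRC.
refresh_static_filter() {
    src="$1"
    dst="$2"
    if [ ! -e "$dst" ] || [ "$src" -nt "$dst" ]; then
        # Same pipeline as above: drop comments, join lines with spaces.
        grep -v '^#' "$src" | tr '\n' ' ' > "$dst"
        echo "rebuilt $dst"
    else
        echo "$dst is up to date"
    fi
}
```

A possible call would be: refresh_static_filter $HOME/med/bpf/global-ignore.commented.bpf $HOME/med/bpf/global-ignore.bpf . Loading the rebuilt filter into the programs that need it is still a manual step.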
Let’s take a look at a program like this, Zeek.
Zeek is not usually intended to be run as a command-line application. For that reason, we must place the filter we want inside a configuration file called “zeekctl.cfg”.
That file can be in different directories depending on how it was installed. The easiest way to find it is to run:
sudo find / -name zeekctl.cfg 2>/dev/null
Once you’ve located it, edit it as root. We’re using nano for simplicity, but feel free to use your preferred editor:
sudo nano /full/path/to/zeekctl.cfg
Inside that file, look for a line that starts with “zeekargs”.
If you have one, you’ll need to figure out how to merge your new global filter with the existing options already there. If you don’t have one, create the line:
Now you need to put the entire filter inside the pair of double-quotes. To do this, place your cursor in between them, press ctrl-r to read in a file, and use this as the filename: ~/med/bpf/global-ignore.bpf . Once read in, double-check that the entire filter (from zeekargs through the last double quote) is on a single line, then save your changes. To make the change take effect, you’ll need to run:
sudo zeekctl deploy
Any time you make a change like this it’s a good idea to wait a minute and then run
sudo zeekctl status
to make sure everything’s running right.
Paring Down Existing pcap Files
If you’re looking to save storage space used by pcap files, a quick way to reclaim a lot of space is to drop these high-volume flows. Here’s an example command
tcpdump -r src.pcap -w filtered.pcap "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
The -r (read) and -w (write) options would normally just copy all packets from src.pcap to filtered.pcap. Since we included a BPF on the command line, the high-volume flows will be discarded.
The space savings can be considerable (anywhere from 10% to 90%+), especially if you’ve had a chance to build up your global filter with common high volume flows.
If you later improve the filter with new flows, you can run it again on the filtered.pcap to try to remove even more. There’s no downside to running it again, other than using some disk bandwidth and processor time.
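If you have a whole directory of pcap files to trim, the same command can be run in a loop. The sketch below makes several assumptions: pare_pcaps, GLOBAL_BPF_FILE, and the TCPDUMP override are names invented for this example, and the “.filtered.pcap” naming is just one possible convention. It writes filtered copies and leaves the originals alone.

```shell
#!/bin/sh
# pare_pcaps DIR  (hypothetical helper)
# Writes NAME.filtered.pcap next to every NAME.pcap in DIR.
# GLOBAL_BPF_FILE and TCPDUMP are hypothetical overrides for this sketch.
pare_pcaps() {
    dir="$1"
    filter="$(grep -v '^#' "${GLOBAL_BPF_FILE:-$HOME/med/bpf/global-ignore.commented.bpf}")"
    for src in "$dir"/*.pcap; do
        [ -e "$src" ] || continue              # no pcap files at all
        case "$src" in
            *.filtered.pcap) continue ;;       # skip our own output
        esac
        dst="${src%.pcap}.filtered.pcap"
        "${TCPDUMP:-tcpdump}" -r "$src" -w "$dst" "$filter" && echo "$src -> $dst"
    done
}
```

Compare the sizes and spot-check a filtered file before deleting any original.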
Windows Users and Line Endings
A line ending in a Windows text editor is 2 bytes: a carriage return (CR) and a linefeed (LF). Linux uses just a linefeed. If you edit the filter on Windows but use it on Linux – or the other way around – there’s a small chance your sniffer program may get confused if the filter has the wrong line endings. If you run into this there’s a simple fix: place the entire filter on one line by removing all line endings. Replace the first form below with the second:
"$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf)"
"$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf | tr -d '\r\n')"
The “tr” command (“translate”) is told to look for both line endings anywhere in the filter and delete them (“-d”). This places the entire filter on a single line.
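If you’re not sure whether a file picked up Windows line endings, you can check for carriage returns directly. This is a small sketch; has_cr is a hypothetical helper name.

```shell
#!/bin/sh
# has_cr FILE  (hypothetical helper)
# Succeeds (exit status 0) if FILE contains any carriage return.
has_cr() {
    # Keep only CR characters and count them; $(( )) trims wc padding.
    count=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    [ "$count" -gt 0 ]
}
```

For example: has_cr $HOME/med/bpf/global-ignore.commented.bpf && echo "Windows line endings found"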
Where to Go From Here
Preserving the TCP Flags for IPv4
You may have one or more flows where it would be nice to discard the bulk of the traffic, but still keep the opening and closing packets (so you have a record that the connection existed at all.) This is possible with IPv4 TCP connections.
To do this, we’ll slightly modify the filter line, changing the first form below into the second:
or ((port eighthport) and (host hostF))
or ((port eighthport) and (host hostF) and (tcp[13] & 0x17 == 0x10))
This modifies the filter to say “discard the packet if 1) it comes from or goes to port eighthport, 2) it is coming from or going to host hostF, and 3) only the ACK flag is turned on (out of SYN, FIN, RST, and ACK.)” Because of this change, SYN, SYN/ACK, FIN, FIN/ACK, RST and RST/ACK packets will be handed up to your sniffer.
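The mask arithmetic can be checked in the shell. The flags live in one byte of the TCP header, with FIN = 0x01, SYN = 0x02, RST = 0x04, PSH = 0x08, and ACK = 0x10; the mask 0x17 keeps only FIN, SYN, RST, and ACK, and the comparison with 0x10 asks whether ACK is the only one of those four that is set. The sketch below applies the same test to a few common flag combinations; flag_verdict is a hypothetical helper name.

```shell
#!/bin/sh
# flag_verdict FLAGS  (hypothetical helper)
# Applies the same test as the filter: (FLAGS & 0x17) == 0x10.
# "discarded" means the packet matches the ignore line and is dropped.
flag_verdict() {
    if [ $(( $1 & 0x17 )) -eq $(( 0x10 )) ]; then
        echo discarded
    else
        echo kept
    fi
}

# name:value pairs for common flag combinations
for entry in SYN:0x02 SYN/ACK:0x12 ACK:0x10 PSH/ACK:0x18 FIN/ACK:0x11 RST:0x04 RST/ACK:0x14; do
    name=${entry%%:*}
    value=${entry##*:}
    printf '%-8s %s %s\n' "$name" "$value" "$(flag_verdict "$value")"
done
```

PSH isn’t part of the mask, so PSH/ACK data packets are discarded along with plain ACKs, which is what we want for the bulk of the flow.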
This last block only works with TCP packets, obviously. Because of a limitation in BPF, it also unfortunately only works with IPv4 (not IPv6). Please don’t try to add it to a line with IPv6 addresses or networks; the filter won’t work. If you are filtering traffic of a certain kind that may go to either IPv4 or IPv6 addresses and want to use this filter, you’ll have to break the filter up into 2 lines; one for IPv4 with the TCP flag filter, and one for IPv6 without the TCP flag filter:
or ((port 443) and (host 18.104.22.168) and (tcp[13] & 0x17 == 0x10))
or ((port 443) and (ip6 net bb02:3dfe:a123::/48))
Using the Global Filter With Additional BPF Requirements
If you want to discard these high-volume flows and use additional filters to narrow your view down to just some specific traffic, that’s reasonably straightforward. This will narrow the view down to just DNS traffic to Google’s public DNS servers:
tcpdump -qtnp -r file.pcap "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf) and ((host 8.8.8.8 or host 8.8.4.4) and port 53)"
We use “and” to separate the global filter and this new filter. We also surround the new filter with parentheses.
Adding Some Traffic Back In
If you want to discard most of your high volume flows but want to see just a few of them, you can add them in just like above, but using “or”:
tcpdump -qtnp -r file.pcap "$(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf) or ((host 188.8.131.52 or host 184.108.40.206) and (udp port 8801 or udp port 8802))"
Those hosts are Zoom servers; this filter still discards most of your high volume traffic but sends that Zoom traffic up to your sniffer.
Seeing Just the High-Volume Traffic
Sometimes you might want to see just the high-volume traffic, perhaps for testing or statistics. That’s easy too; use “not”:
tcpdump -qtnp -r file.pcap "not ($(grep -v '^#' $HOME/med/bpf/global-ignore.commented.bpf))"
We put our original filter in parentheses to make sure it’s clear that we want to negate the whole filter.
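The three combinations in this section (“and”, “or”, and “not”) follow the same pattern, so a script could build them with one function. This is only a sketch: combine_bpf and GLOBAL_BPF_FILE are hypothetical names, and the extra parentheses it adds around each piece are there purely to keep the grouping unambiguous.

```shell
#!/bin/sh
# combine_bpf MODE [EXTRA]  (hypothetical helper)
#   and EXTRA       global filter AND an additional filter
#   or EXTRA        global filter OR some traffic to add back in
#   only-ignored    just the traffic the global filter discards
# GLOBAL_BPF_FILE is a hypothetical override for the file location.
combine_bpf() {
    mode="$1"
    extra="$2"
    base="$(grep -v '^#' "${GLOBAL_BPF_FILE:-$HOME/med/bpf/global-ignore.commented.bpf}")"
    case "$mode" in
        and|or)       printf '(%s) %s (%s)\n' "$base" "$mode" "$extra" ;;
        only-ignored) printf 'not (%s)\n' "$base" ;;
        *)            echo "usage: combine_bpf and|or|only-ignored [extra]" >&2
                      return 2 ;;
    esac
}
```

A possible use: tcpdump -qtnp -r file.pcap "$(combine_bpf and 'port 53')"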
Template
We’ve supplied a template file as a starting point. It includes the basic structure as well as some sample filters for common services so you can see how the lines are formatted. It’s nowhere near complete – it’s intended to be a starting point for your own rules.
You can download this file from:
Related Blogs and Webcasts
Tools Mentioned in the Article
For more gory details on line endings, see: https://en.wikipedia.org/wiki/Newline.
Bill has authored numerous articles and tools for client use. He also serves as a content author and faculty member at the SANS Institute, teaching the Linux System Administration, Perimeter Protection, Securing Linux and Unix, and Intrusion Detection tracks. Bill’s background is in network and operating system security; he was the chief architect of one commercial and two open source firewalls and is an active contributor to multiple projects in the Linux development effort. Bill’s articles and tools can be found in online journals and at http://github.com/activecm/ and http://www.stearns.org.