Analysing PCAPs with Bro/Zeek
Wireshark has always been my go-to for PCAP analysis. Recently, however, I was introduced to Bro IDS (now renamed Zeek), which can split a PCAP into separate Bro logs (http, dns, files, smtp and many more), and to bro-cut, a fun little companion utility for pulling individual fields out of those logs.
Not only that, but it makes analysis that much faster when you’re dealing with a very large network capture. That is the point at which I’d start up my virtual machine (VM) rather than open the file in Wireshark.
What is Bro/Zeek?
Bro is a network security monitoring (NSM) tool, which I like to think of as an advanced Intrusion Detection System; something that you might deploy for traffic inspection, detecting attacks, log capturing, and event correlation.
How do I run/install/use Bro?
Security Onion was my VM of choice as it already has Bro installed. Otherwise, you can find the package here: https://github.com/zeek/zeek, along with their webpage for further details: https://www.zeek.org/.
Because sharing is caring, here’s a resource I use quite often for when I forget all the different fields within a Bro log file: https://github.com/corelight/bro-cheatsheets/blob/master/Corelight-Bro-Cheatsheets-2.6.pdf.
How can I use it for CTFs or analysis?
That’s what this whole blog post is about. I’ll be going through the second part of the following CTF that @malware_traffic has created and shared on his website https://www.malware-traffic-analysis.net/2018/CTF/index.html. The twelve questions can be found at the bottom of the page. On the same page is a download link to the PCAP, which is called 2018-CTF-from-malware-traffic-analysis.net-2-of-2.pcap.zip.
I’ll be providing a detailed set of answers for each question, along with some exploration of different Linux tools for efficiently breaking down the data set. Each command and its output will be highlighted using the quote tab like so:
input/command will be on the first line
output will be on the second line
First, to split the PCAP into Bro logs, use the following command:
bro -Cr infected.pcap
Running ls within the directory that contains this .pcap should show you something like this:
These are the log files that we’ll be working with going forward. Let’s start with the CTF.
Q1: What is the MAC address of the Windows client at 172.17.1.129?
IP addresses are assigned to devices (identified by their MAC addresses) via DHCP. Thus, for this question, we want to look at dhcp.log:
cat dhcp.log | bro-cut mac client_addr
The mac field within dhcp.log is the MAC address of the device, and client_addr is the IP address assigned to that client.
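If it helps to see what bro-cut is doing with those field names, the sketch below mimics its column selection with plain awk. It is only an approximation for illustration: the helper name cut_fields is made up, the sample log uses a documentation-range IP and MAC rather than anything from the PCAP, and the real bro-cut handles more cases (such as the -d timestamp conversion).

```shell
#!/bin/sh
# Rough stand-in for bro-cut: print the named columns from a Bro TSV log.
# cut_fields is a hypothetical helper, not part of Bro.
cut_fields() {
  # $1 is a comma-separated list of field names, e.g. "mac,client_addr"
  awk -F'\t' -v want="$1" '
    # The "#fields" header names every column; remember each position.
    /^#fields/ { for (i = 2; i <= NF; i++) col[$i] = i - 1; next }
    # Skip any other header or footer line.
    /^#/ { next }
    {
      n = split(want, w, ",")
      line = ""
      for (j = 1; j <= n; j++) {
        sep = (j > 1) ? "\t" : ""
        line = line sep $(col[w[j]])
      }
      print line
    }'
}

# Made-up one-row log laid out like dhcp.log.
printf '#fields\tts\tclient_addr\tmac\thost_name\n1542056400.000000\t192.0.2.10\t00:00:5e:00:53:01\texample-pc\n' |
  cut_fields mac,client_addr
```

The columns come out in the order you ask for them, not the order they appear in the log, which is the same behaviour you get from bro-cut.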
Q2: What is the host name for the Windows client at 172.17.1.129?
In the same log file, there is a field called host_name that will provide the hostname of the Windows client.
cat dhcp.log | bro-cut client_addr host_name | sort | uniq
Depending on the size of the PCAP, these logs could get quite large. Thus, I like to pipe searches such as this through ‘sort’ and ‘uniq’ so that each distinct value is only shown once.
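The reason sort comes first is that uniq only collapses duplicate lines that sit next to each other. A quick way to convince yourself, using made-up host names rather than anything from the PCAP:

```shell
#!/bin/sh
# uniq only removes *adjacent* duplicates, so unsorted input keeps repeats.
sample() { printf 'host-b\nhost-a\nhost-b\nhost-a\n'; }

sample | uniq          # still prints all four lines
sample | sort | uniq   # prints host-a and host-b once each
```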
Q3: Based on the Kerberos traffic, what is the Windows user account name used on 172.17.1.129?
The question alludes to Kerberos authentication, so we’ll want to be examining kerberos.log. Within this file we see many references to nalyvaiko-pc$, which mirrors what we saw in Question 2. However, if we read the question properly, it is asking for the Windows user account name, not the name of the PC.
The following command will break down the kerberos log to display exactly what we’re looking for:
Let’s go through this one step at a time.
- id.orig_h, client, service column filtering will ensure that we’re looking at the right IP address that was identified in Question 2, and we can distinguish between different authentication protocols by highlighting the service field
- awk '$3~"krbtgt"'. Here I’m using a tool called awk to filter on the third column of the bro-cut output (which is ‘service’). The tilde (~) is used to find strings that contain the keyword ‘krbtgt’ (the Kerberos Ticket-Granting Ticket service); in other words, only display Kerberos authentication activity. If we wanted an exact match, we would have had to use a double equals sign like this: '$3=="krbtgt"', but as you can see from the output above, the full string contains more data than that
- grep -v is used to remove any entries that contain your specified keyword, and I’ve used the -i flag to make it case insensitive. I’ve removed references to ‘nalyvaiko-pc’ because we want the user account name, not the computer name.
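Put together, the awk and grep steps described above have roughly the shape sketched below. To keep it runnable without the PCAP, it is fed a few made-up rows in the same three-column layout that bro-cut id.orig_h client service would produce; the user name jdoe and the realm EXAMPLE.LOCAL are placeholders, not the CTF answer, and find_user_accounts is just a name I've picked for the sketch.

```shell
#!/bin/sh
# Keep only ticket-granting-ticket rows, then drop the machine account.
find_user_accounts() {
  awk -F'\t' '$3 ~ "krbtgt"' | grep -v -i 'nalyvaiko-pc'
}

# Columns: id.orig_h, client, service (tab-separated, values are hypothetical).
printf '%s\t%s\t%s\n' \
  172.17.1.129 'nalyvaiko-pc$/EXAMPLE.LOCAL' 'krbtgt/EXAMPLE.LOCAL' \
  172.17.1.129 'jdoe/EXAMPLE.LOCAL'          'krbtgt/EXAMPLE.LOCAL' \
  172.17.1.129 'jdoe/EXAMPLE.LOCAL'          'LDAP/dc.example.local' |
  find_user_accounts
```

Against the real log, the same two filters would sit after cat kerberos.log | bro-cut id.orig_h client service.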
Q4: What URL in the pcap returned a Microsoft Word document?
To obtain the URL we first need to know the filename of the document. This can be determined in a much simpler way by first searching files.log, which stores data on any file that was uploaded/downloaded during the time of the packet capture:
cat files.log | bro-cut mime_type filename | grep "msword"
Filtering on mime_type here really helps to cut down the amount of data we’re dealing with, once it’s processed by the grep command. Other PCAPs may be quite large compared to this, so I would suggest adding a further filter with awk to only look for ‘.doc’ as opposed to also retrieving results for ‘.docx’.
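One way to write that extra filter, as a sketch: anchor the match to the end of the filename column so that .doc matches but .docx does not. The function name only_doc and the .docx sample row are made up for illustration.

```shell
#!/bin/sh
# Keep rows whose second column (filename) ends in exactly ".doc".
only_doc() {
  awk -F'\t' '$2 ~ /\.doc$/'
}

# Columns: mime_type, filename (the .docx row is a hypothetical extra).
printf '%s\t%s\n' \
  application/msword 2018_11Details_zur_Transaktion.doc \
  application/vnd.openxmlformats-officedocument.wordprocessingml.document report.docx |
  only_doc
```

On the real log this would slot in after cat files.log | bro-cut mime_type filename, in place of (or after) the grep.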
Discovering the URL is now straightforward, as we can filter on the filename of the document. The Bro file http.log will have captured the URL, like so:
Q5: When did the URL happen? (date and time in UTC)
A: 2018-11-12 21:01:49
There’s a cool little feature of bro-cut which lets us mere mortals read Bro timestamps a little more easily. Run normally, it gives you the raw epoch timestamp that you saw in Question 4: 1542056509.337937.
If we use bro-cut -d ts, it’ll convert the timestamp to a human-readable format:
cat http.log | bro-cut -d ts method host uri resp_filenames resp_mime_types | grep "2018_11Details_zur_Transaktion.doc"
2018-11-12T21:01:49+0000 GET ifcingenieria.cl /QpX8It/BIZ/Firmenkunden/ 2018_11Details_zur_Transaktion.doc application/msword
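If you only have the raw epoch value to hand, the same conversion can be done without bro-cut. This sketch assumes GNU date (the one shipped with most Linux distributions); the BSD/macOS date takes different flags. The helper name to_utc is my own.

```shell
#!/bin/sh
# Convert a Bro epoch timestamp to UTC, dropping the fractional seconds.
to_utc() {
  date -u -d "@${1%.*}" +%Y-%m-%dT%H:%M:%S%z
}

to_utc 1542056509.337937   # 2018-11-12T21:01:49+0000
```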
Q6: How many bytes is the Word document returned from that URL?
Both files.log and http.log have a field that captures the number of bytes traversing the wire. These fields are seen_bytes and response_body_len respectively.
Via the files log:
cat files.log | bro-cut -d ts mime_type filename seen_bytes | grep "2018_11Details_zur_Transaktion.doc"
2018-11-12T21:01:49+0000 application/msword 2018_11Details_zur_Transaktion.doc 73088
Via the http log:
cat http.log | bro-cut -d ts method resp_filenames response_body_len | grep "2018_11Details_zur_Transaktion.doc"
2018-11-12T21:01:49+0000 GET 2018_11Details_zur_Transaktion.doc 73088
Q7: What is the SHA256 of the Word document returned from that URL?
A: 09ebe4229a74cdb1212671e6391742cc6bee387bf14da02974b07857b27f9223
Unfortunately, if you check files.log you’ll see that no associated hash was captured for this file. Instead I thought I’d provide a different method of obtaining this hash via Wireshark’s export objects functionality.
By going to Edit -> Find Packet, we’ll be able to search for the name of the file as a string within ‘Packet Details’, as shown below. To actually export this as an object, go to File -> Export Objects -> HTTP.
You’ll see that there is a file called Firmenkunden, which we discovered was part of the URL in Question 4.
Once you save this file, you can run the following command to determine the SHA256 hash:
Fun fact: if you need other hashes of files during a CTF, whether it be MD5, SHA1, SHA512 or otherwise, your Linux terminal will most likely have the tools installed already; just add ‘sum’ to the end of the algorithm name, i.e. md5sum, sha1sum.
On a Windows machine, you can use the PowerShell cmdlet ‘Get-FileHash’ with the -Algorithm parameter to specify the hash function.
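As a quick illustration of the Linux utilities, the sketch below hashes a throwaway file containing the three bytes abc, whose digests are well known. The temporary file stands in for the exported document, so swap in the path of the file you saved from Wireshark; sha256_of is just a convenience wrapper I've made up.

```shell
#!/bin/sh
# Print only the hex digest of a file (sha256sum also prints the filename).
sha256_of() {
  sha256sum "$1" | awk '{print $1}'
}

tmp=$(mktemp)
printf 'abc' > "$tmp"
sha256_of "$tmp"                   # ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
md5sum "$tmp" | awk '{print $1}'   # 900150983cd24fb0d6963f7d28e17f72
rm -f "$tmp"
```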
Q8: What URL in the pcap returned a Windows executable file?
Q9: How many bytes is the executable file returned from that URL?
A: timlinger.com/nmw/6169583.exe (Q8), 429056 bytes (Q9)
Having read most of this walkthrough by now, you can probably guess as to what fields we’d need to extract the answers to both of these questions using only one command:
cat http.log | bro-cut -d ts method host uri resp_filenames resp_mime_types response_body_len | awk '$6=="application/x-dosexec"'
2018-11-12T21:02:09+0000 GET timlinger.com /nmw/ 6169583.exe application/x-dosexec 429056
Using mime types is probably the quickest way for us to find the executable we’re looking for. If we run the following command, we can see what mime types have been captured within the pcap:
cat http.log | bro-cut resp_mime_types | sort | uniq
It’s obvious here that we should only consider files with the mime type of “application/x-dosexec”. We can use this keyword in an awk or grep command to display the executable file of interest, as I showed above.
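On a larger capture it can help to see how often each mime type occurs rather than just the distinct values, since the interesting files tend to be the rare ones. A sketch using uniq -c, with made-up counts standing in for the output of bro-cut resp_mime_types (count_values is a hypothetical helper name):

```shell
#!/bin/sh
# Count occurrences of each line and list the most frequent first.
count_values() {
  sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Hypothetical single-column input, one mime type per line.
printf '%s\n' text/html application/x-dosexec text/html \
  application/msword text/html application/x-dosexec |
  count_values
```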
Similarly to the previous questions, we can extract this information by analysing files.log as well.
Q10: What is the SHA256 of the Windows executable file returned from that URL?
A: 69e731afb5f27668b3a77e19a15e62cce84e623404077a8563fcf61450d8b741
We face the same problems here as Question 7, so this time I’ll demonstrate an alternative way of getting the hash for this file. Last time we used Export Objects from the Wireshark File tab. This time, we can navigate to the packet, and ‘Export Packet Bytes’ from the HTTP portion of the packet itself.
Search for the filename as we have previously, and within the ‘Data’ field of the packet, right click and select ‘Export Packet Bytes’.
Q11: What type of infection occurred in this pcap?
A: Phishing, Emotet Campaign
This is cheating, but if you run a Google search for ‘2018_11Details_zur_Transaktion.doc’, you’ll quickly find a link to this Hybrid Analysis report that flags it as an Emotet phishing document: https://www.hybrid-analysis.com/sample/f47f72f78711d3e125d5b81d5f433475a6c749aa724e19e25377ee3e83ff4a6c?environmentId=120.
You could have achieved the same result by searching for the domain, which was timlinger[.]com. Otherwise, I was able to quickly narrow it down by opening the document Firmenkunden.doc, within a virtual machine that had no network connectivity (I’ve used FLARE). From this view, I knew it was one of the many phishing campaigns out there today.
Running strings on the document (check out the image below) also showed the obfuscated command embedded within the document that allows for the second stage malware download to occur. Doing a Google search for any of those strings within the command will lead you to the Hybrid Analysis report I’ve linked above.
Q12: In addition to HTTP post-infection traffic, what other type of post-infection traffic is generated by the infected Windows host?
A: Encrypted SMTP traffic
Knowing that this is a phishing-based infection, a first point of investigation for me would be SMTP traffic. This quickly reveals to us that the traffic has been encrypted via TLS. The infected host is also communicating with numerous IP addresses over TCP ports 25 and 587.
That brings the CTF to an end! I hope you’ve learnt a new skill that you’re able to take with you during your next PCAP adventure, wherever that may be. Over and out.