The goal of this tutorial series is to show analysts a variety of methods to extract IOCs from malicious document samples as an alternative to a reliance on automated sandboxes. Sandboxes are valuable tools, but in many cases (with default settings) they may not provide full details and critical threat data could be missed. The following techniques may be slower than sandboxing, but they will provide a more complete picture of the malware’s operations. Here, I focus on the command and control (C&C) links for stage 1 downloads in documents that have been weaponized with encoded macros.
Emotet is used as the sample. SHA-256: 7cfe21d4f6b90c3ea7be27fef6dfc2f6f1cc5c41d5488ffbf1a14e9f43f19bc4
The Problem — Background
I was inspired to write this post a few weeks ago after chatting with a colleague about an incident response (IR) engagement they recently worked. The investigation was a pretty straightforward incident that involved a possible near-miss for an Emotet infection. One of the first questions an incident responder might try to answer in these situations is, “Did the loader connect with a web resource and download the payload?”
My friend explained how they were able to quickly close the case due to no evidence of execution. All they had to do was upload a sample of the dropper to their favorite sandbox and see what URL the loader tried to reach in order to download the next stage. So voilà: they got a nice, shiny IOC in the form of the malicious download link for the core Emotet executable.
They then searched host-based artifacts and proxy logs and found no traffic from the host to the URL where the payload was staged. Case closed, right?
Sadly, with malware like Emotet, relying on an automated sandbox alone is too easy and data could be missed. The specific problem with Emotet is that the encoded macro in the dropper file always includes FIVE download URLs.
The VBA macro includes a PowerShell script that iterates through each hardcoded URL until it gets a valid response and downloads the payload. To extract all five from a sandbox, you would need to get lucky and receive an invalid response or error from each of the first four URLs listed in the code (some tools such as any.run have a Fakenet feature, but it is not enabled in default configurations). Here is an example from any.run, an awesome sandbox tool that misses four of the five URLs with default settings because the first one in the code gets a good 200 OK HTTP response.
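To make the stop-at-first-success behavior concrete, here is a minimal Python sketch of the loop logic described above. It is a simulation only, not the actual Emotet PowerShell code: the URLs are placeholders, and `run_downloader` and the two fake `fetch` functions are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple


def run_downloader(urls: List[str], fetch: Callable[[str], int]) -> Tuple[List[str], bool]:
    """Try each URL in order and stop at the first HTTP 200.

    Returns the list of URLs that were actually contacted and whether
    a download succeeded.
    """
    contacted = []
    for url in urls:
        contacted.append(url)
        if fetch(url) == 200:
            return contacted, True
    return contacted, False


# Placeholder URLs (assumption: these are NOT taken from the sample).
URLS = [f"http://host{i}.example/payload" for i in range(1, 6)]


def live_server(url: str) -> int:
    """Simulates a sandbox with real internet: the first host answers 200."""
    return 200


def always_404(url: str) -> int:
    """Simulates a fake network that refuses to serve any file."""
    return 404


seen_live, _ = run_downloader(URLS, live_server)
seen_fake, _ = run_downloader(URLS, always_404)
print(len(seen_live), len(seen_fake))  # prints "1 5"
```

An observer watching the network only ever sees the contacted URLs, so a single successful response hides the remaining four.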
The Solution — Dynamic Analysis Techniques
We need a better way to get the data we seek. I don’t want to make it seem like I have a beef with sandboxes. I think they are great tools for the right situations with the right settings, and I use them frequently. However, one aspect of good threat intelligence is providing context so operational teams can make better decisions. With malware such as Emotet, the critical piece of situational awareness is that there are five URLs, not one, and that the IOCs should be extracted by alternative means. This is especially true in IR situations where specific host behavior must be ruled out before drawing conclusions, so more comprehensive solutions should be considered. In part 1 of this series, I will provide one solid option for quickly and easily grabbing the C&C download URLs we seek.
Option #1 — FakeNet-NG
FakeNet-NG is a network analysis tool developed by FireEye as a part of their FLARE reverse engineering project. A brief overview can be found here, and the tool itself can be downloaded from FLARE’s official GitHub repository here. FakeNet-NG is a dynamic analysis tool that will allow the researcher to run malware in a controlled environment. FakeNet-NG provides fake internet services so that the malware’s behavior can be observed without it actually reaching out to the real internet. This tricks the malware into believing it is connecting to the internet and it will reveal its network-based artifacts. This method does involve infecting the victim host.
The great advantage with a tool like FakeNet-NG is that we can essentially configure it to return HTTP 404 errors, which forces the PowerShell script to cycle through and exhaust all of the hardcoded C&C URLs.
Here’s how we do it:
FakeNet’s default HTTP listener configuration has a “Webroot” option, which by default points to the “\fakenet1.3\defaultFiles” directory. This folder contains several template files (exe, pdf, txt, html, etc.) that FakeNet will serve in response to GET requests. Normally this is desired, as it tricks the malware into thinking everything is O.K. so that it continues on its merry way with the next steps in the infection chain. In this specific instance with Emotet, though, we don’t want this. We would prefer that no document be served, forcing the full PowerShell loop and URL cycle. A quick edit to erase the “\defaultFiles” in the Webroot field in the config file will ensure that no document is served.
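If you rebuild your lab VM often, the Webroot edit can be scripted. The following is a rough sketch that assumes the listener section looks like the `SAMPLE_CONFIG` string below; the section name, key layout, and values are assumptions for illustration, so check them against your own FakeNet-NG config file before using it. The helper `blank_webroot` is a hypothetical name. It blanks the whole Webroot value, which is one way to make sure the field no longer points at the template files, and it works line by line so comments and other settings are left alone.

```python
import re

# Assumed layout of an HTTP listener section (illustrative only;
# the real file may use different names, values, and spacing).
SAMPLE_CONFIG = """\
[HTTPListener80]
Enabled:     True
Port:        80
Protocol:    TCP
# Directory of files served in response to requests
Webroot:     defaultFiles/
"""


def blank_webroot(config_text: str) -> str:
    """Return the config text with every Webroot value emptied.

    Only lines that start with a 'Webroot' key are changed; all other
    lines, including comments, are preserved exactly.
    """
    pattern = re.compile(r"(?im)^([ \t]*Webroot[ \t]*[:=]).*$")
    return pattern.sub(r"\1", config_text)


edited = blank_webroot(SAMPLE_CONFIG)
print(edited)
```

To apply it to a real file, read the file into a string, pass it through `blank_webroot`, and write the result back after saving a copy of the original.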
With the configuration all set, we’re ready to execute the maldoc to force the full PowerShell loop so we can grab all 5 URLs. Just run FakeNet-NG and then enable the macros (please understand that by doing this you are infecting your system and remediation will be necessary following analysis). This is the same document with the same hash as above in the sandbox example.
FakeNet-NG’s output window will show the network artifacts captured by the listeners. By appending the path from the GET request to the host, the full URL can be reconstructed. (Side note: the use of IP addresses instead of domain names is unusual for this malware variant.)
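The host-plus-path reconstruction can be sketched in a few lines of Python. This example works on the raw text of an HTTP request (request line plus headers) rather than on FakeNet-NG's own log format, which is not reproduced here. The function name `url_from_request`, the example IP address (from a documentation-reserved range), and the example path are all placeholders, not values from the sample.

```python
from typing import Optional


def url_from_request(raw_request: str, scheme: str = "http") -> Optional[str]:
    """Rebuild the requested URL from a raw HTTP request.

    Combines the path on the request line with the Host header.
    Returns None if either piece is missing.
    """
    lines = raw_request.replace("\r\n", "\n").split("\n")
    parts = lines[0].split()
    if len(parts) < 2:
        return None
    path = parts[1]

    for line in lines[1:]:
        if not line:  # blank line marks the end of the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return f"{scheme}://{value.strip()}{path}"
    return None


# Placeholder request (assumption: not captured from the real sample).
example = (
    "GET /wp-content/file/ HTTP/1.1\r\n"
    "Host: 203.0.113.10\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)
print(url_from_request(example))  # prints "http://203.0.113.10/wp-content/file/"
```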
In addition to the standard output, FakeNet-NG will also write to a pcap file which can be filtered (by http.request or whatever you prefer) and quickly searched in Wireshark to identify the indicators.
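If you prefer the command line to the Wireshark GUI, the same filter can be applied with tshark, Wireshark's terminal counterpart. The sketch below assumes tshark is installed and on the PATH, and that the capture holds plain HTTP traffic. The function names and the pcap path you pass in are hypothetical. `parse_fields` is kept separate from the tshark call so it can be checked without a capture file.

```python
import subprocess
from typing import List


def tshark_command(pcap_path: str) -> List[str]:
    """Build a tshark command that prints host and URI for each HTTP request."""
    return [
        "tshark", "-r", pcap_path,
        "-Y", "http.request",          # same display filter as in Wireshark
        "-T", "fields",
        "-e", "http.host",
        "-e", "http.request.uri",
    ]


def parse_fields(output: str) -> List[str]:
    """Turn tab-separated 'host<TAB>uri' lines into de-duplicated URLs."""
    urls: List[str] = []
    for line in output.splitlines():
        cols = line.split("\t")
        if len(cols) == 2 and cols[0]:
            url = "http://" + cols[0] + cols[1]
            if url not in urls:
                urls.append(url)
    return urls


def extract_urls(pcap_path: str) -> List[str]:
    """Run tshark on the capture and return the requested URLs in order."""
    result = subprocess.run(
        tshark_command(pcap_path), capture_output=True, text=True, check=True
    )
    return parse_fields(result.stdout)
```

Calling `extract_urls` with the path to the capture written by FakeNet-NG should return each requested URL once, in the order the requests were made.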
So that’s it for the FakeNet method. The primary benefits of this method are its relatively quick execution and comprehensive network artifact discovery. In this example with FakeNet-NG, we successfully recovered all 5 of the payload URLs. The fake network also offers a controlled environment and facilitates safe infection that does not allow the malware to escape the analysis environment and spread to any external hosts.
Unfortunately, the local host is infected in this option and a snapshot or backup will have to be restored post-analysis. This option also requires a bit of setup: a basic virtual or dedicated lab environment with preinstalled tools ready to go. A final drawback is that this method is inherently manual and does not scale well if the goal is bulk indicator extraction.
For the next post in this series, I will follow up with some alternative methods to extract the same data that are perhaps even more labor intensive. In part 2 we will take a look at using CMD Watcher and CyberChef in tandem to achieve similar results. In part 3, I will show you how to decode the macros manually to extract the artifacts the old-fashioned way.