In this article we are going to take a look at a malicious HTML file sent to us by our internet friends. While the presentation and delivery of said file were pretty suspect from the outset, I thought it would make for good practice to see what was actually happening under the covers. Join me in my search for what lies at the bottom…
Pre-requisites
To play it safe I am running the following to make sure I don’t inadvertently infect my machine:
- VMware Workstation Player – you can also use VirtualBox or whatever you like to use for running VMs
- Sandbox VM – I am using REMnux to kick the wheels on it. Some may see this as overkill, but you can download the OVA and start it without any pre-installation of tools. Easy wins.
- Text editor – VSCode is installed and I am familiar with it. This is cross-platform so it will work on Windows, Linux and Mac.
A garbled mess
On first inspection, the file I received, titled invoice_remit-6187.xls.HTML, looked to be a jumble of JavaScript. What is going on here?
The JavaScript in this document uses the unescape() function, which, simply stated, decodes percent-encoded text such as an encoded URL. It takes all of the percent-encoded text and decodes it. For example, if you have a URL with an ampersand (&) in it, the URL would look like this:
- Unencoded – sample-site.com/terms&conditions
- Encoded – sample-site.com/terms%26conditions
Where %26 represents the ampersand character.
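To make the ampersand example concrete, here is a minimal sketch of the round trip using JavaScript's built-in functions. The variable names are just for illustration. It uses the modern encodeURIComponent() and decodeURIComponent() alongside the legacy unescape(), which gives the same result for this input.

```javascript
// Percent-encoding round trip for the ampersand example.
const plain = "terms&conditions";

// encodeURIComponent replaces reserved characters such as & with %XX escapes.
const encoded = encodeURIComponent(plain);

// decodeURIComponent reverses the encoding.
const decoded = decodeURIComponent(encoded);

// The legacy unescape() function decodes the same %XX escapes.
const legacyDecoded = unescape(encoded);

console.log(encoded);       // terms%26conditions
console.log(decoded);       // terms&conditions
console.log(legacyDecoded); // terms&conditions
```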
Now, why would they want to hide something in the JavaScript? While my aspirations of being a web developer were short lived, what I do remember is that document.write takes the unescaped text and writes it to the HTML. This all happens at runtime in your browser; all you have to do is open the file.
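The pattern can be sketched with a harmless stand-in. The encoded string below is made up for illustration and is not taken from the actual file. Instead of calling document.write, which would render the markup in a browser, the sketch decodes to a string and prints it, which is the safer thing to do when you only want to read the hidden content.

```javascript
// Harmless stand-in for the obfuscated payload (hypothetical, not from the real file).
const hiddenMarkup = "%3Ch1%3EHello%3C%2Fh1%3E";

// A page using this trick would effectively run:
//   document.write(unescape(hiddenMarkup));
// For analysis, decode to a string and print it rather than render it.
const revealed = unescape(hiddenMarkup);

console.log(revealed); // <h1>Hello</h1>
```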
CyberChef Enters the Chat
There are lots of resources on CyberChef out there in the wilds of the internet and I urge you to read/watch as many as you can because this is a very useful (free) tool. Our use case is going to be to decode the JavaScript to see what our friends are trying to hide from us.
Since our VM isn’t connected to the internet, download the offline version of CyberChef (gchq.github.io). The download link is at the top of the page. Copy this onto a USB drive that you can pass through to your VM. (After doing this I realized REMnux comes with this pre-installed.)
Next, we drag URL Decode from the Favourites section over to the Recipe section.
Go back to the text editor, copy everything between the <Script type="text/javascript"> and </script> tags, and paste it into the Input section. If Auto-Bake is checked, it will decode automatically; otherwise hit Bake.
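If you would rather not copy and paste by hand, the same extraction can be sketched in a few lines. This is a rough approach, assuming a simple case-insensitive regular expression is good enough for a quick look. It is not a real HTML parser, and extractScriptBodies and sampleHtml are hypothetical names with a made-up sample.

```javascript
// Pull out the text between <script ...> and </script>, ignoring tag case.
function extractScriptBodies(html) {
  const pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  // Each match's first capture group is the script body.
  return Array.from(html.matchAll(pattern), (match) => match[1].trim());
}

// Made-up sample that mimics the mixed-case tag seen in the file.
const sampleHtml =
  '<html><Script type="text/javascript">\n' +
  'document.write(unescape("%3Cp%3Ehi%3C%2Fp%3E"));\n' +
  "</script></html>";

const bodies = extractScriptBodies(sampleHtml);
console.log(bodies[0]); // document.write(unescape("%3Cp%3Ehi%3C%2Fp%3E"));
```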
OK, this is weird. It looks like we are starting to see some structure in this file, in the form of CSS to format the page, but we still have a lot of obfuscation going on. Let’s take our Output and run it back through Input.
Maybe there is a better way. Instead of copying and pasting the Output back into the Input field, how about we add a second URL Decode step to the recipe? Let’s see what happens.
If you have more than one operation in your recipe, the output of the previous step is passed on to the next one as its input. Great, we can see the underlying code, but some of the embedded URLs aren’t quite decoded yet.
There it is. This is about as decoded as we can get using this recipe in CyberChef.
Stay tuned for Part 2, where we will look at what this site is doing.
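Chaining two URL Decode steps amounts to feeding the output of one decode pass into the next. Here is a minimal sketch of that idea, assuming the layers are plain percent-encoding. The helper name urlDecodeTimes and the sample string are hypothetical, and the legacy unescape() is used only because it is the function the file itself relies on.

```javascript
// Apply URL decoding a fixed number of times, passing each result to the next pass.
function urlDecodeTimes(input, passes) {
  let output = input;
  for (let i = 0; i < passes; i++) {
    output = unescape(output);
  }
  return output;
}

// Made-up double-encoded sample: each % of the inner layer was itself encoded as %25.
const doubleEncoded = "%253Cb%253Ehi%253C%252Fb%253E";

console.log(urlDecodeTimes(doubleEncoded, 1)); // %3Cb%3Ehi%3C%2Fb%3E
console.log(urlDecodeTimes(doubleEncoded, 2)); // <b>hi</b>
```

One pass only peels off the outer layer, which matches what we saw after the first Bake: some structure, but still plenty of percent signs.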