# Unpacking Banking Trojan Gozi

<figure><img src="/files/GxhlfHOJh25pky8aeqjR" alt=""><figcaption></figcaption></figure>

Hi all,\
this report provides a technical description of unpacking a <kbd>Gozi banking trojan</kbd> sample, which is part of the Zero2Automate course's biweekly challenges.

Basically, a banking trojan's goal is to steal sensitive information regarding (online) banking services, for example usernames and passwords. To reach this goal, banking trojans apply various techniques such as keylogging, screen capture, form grabbing or injecting malicious code into browser sessions to extract cleartext information before encryption and transfer take place.

This sample employs several techniques to thwart analysis, especially static analysis.\
For example, it has to be unpacked in multiple stages, it encrypts its strings with a custom algorithm and it loads additional APIs at runtime.

The analyzed sample has the SHA256 hash

> 0a66e8376fc6d9283e500c6e774dc0a109656fd457a0ce7dbf40419bc8d50936

and can be found on [Malware Bazaar](https://bazaar.abuse.ch/sample/0a66e8376fc6d9283e500c6e774dc0a109656fd457a0ce7dbf40419bc8d50936/).

{% hint style="info" %}
This analysis was conducted in a controlled environment using static and dynamic techniques to safely observe the malware’s behavior and dissect its components. To ensure safety during the analysis, a simulated internet connection was used instead of a real one, so nothing I did could connect back to the real world.
{% endhint %}

***

## Packed sample

### Triage

{% tabs %}
{% tab title="SHA 256" %}
0a66e8376fc6d9283e500c6e774dc0a109656fd457a0ce7dbf40419bc8d50936
{% endtab %}

{% tab title="Interesting Strings" %}
VirtualAlloc\
ntdll.dll\
kernel32.dll\
LdrGetProcedureAddress\
kernel32.dll\
VirtualAlloc\
ntdll.dll\
LdrGetProcedureAddress\
kernel32.dll\
duS2j\
ntdll.dll\
kernel32.dll\
VirtualAlloc\
ntdll.dll\
kernel32.dll\
overNitselfyears\
hSPseascalled,S5b\
beastyou.re.own\
fOyouwunderFitreedry\
F7dryWdcattle\
herbreplenishagl\
vUwingedCtogreaterA
{% endtab %}

{% tab title="Imports" %}

<figure><img src="/files/sJccNLKxHsW1Gj7PYaLF" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Packed executable" %}
High entropy sections\
![](/files/1fM3hm70LCrEet8HyU5t)

\
Packer signatures\
![](/files/2gayLiA6oDG8r3595o0n)
{% endtab %}

{% tab title="Conclusion" %}

* 32 bit DLL
* packed executable
  * <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd> to allocate memory for unpacked code?
* decoded APIs not visible in Import Address Table
  * dynamic API resolution via <kbd><mark style="color:orange;">LdrGetProcedureAddress<mark style="color:orange;"></kbd>?
    {% endtab %}
    {% endtabs %}
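
The "high entropy sections" finding from the triage can be reproduced with a few lines of Python. The following is a minimal sketch of the Shannon entropy calculation that such tools typically rely on; the sample inputs are hypothetical and only illustrate the difference between repetitive and random-looking data.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical inputs: padding-like data versus random (encrypted-looking) data
low = shannon_entropy(b"A" * 4096)
high = shannon_entropy(os.urandom(65536))
print(f"repetitive: {low:.2f}, random: {high:.2f}")
```

Values close to 8 bits per byte usually indicate compressed or encrypted content, which is why a packed section stands out.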

### String Decryption

During static analysis, we are able to recognize several calls to the same function, responsible for decrypting an array of bytes, moved onto the stack before these calls.

<figure><img src="/files/WNczNRyFwvavBi9k3Bwx" alt=""><figcaption><p>calls to string decryption function</p></figcaption></figure>

This function takes three parameters every time it is called: a destination buffer for the decrypted string, a source array with the encrypted bytes, and a length that determines how many times the loop in the decryption function is executed.

<figure><img src="/files/DYevqefJxJSauEQcDqe4" alt=""><figcaption><p>parameters for function call</p></figcaption></figure>

Before the actual decryption runs, this function does some bookkeeping: it loops over the input, increments a counter and uses it to select a specific byte in an array, which acts as the key.

<figure><img src="/files/6fdkq1zdAbiyPwoY9atX" alt=""><figcaption><p>decryption function, acting as a wrapper</p></figcaption></figure>

The decryption algorithm simply takes one encrypted byte, subtracts the key (another byte) from it and moves the result into a buffer.

<figure><img src="/files/WSF2AcixB0BMuigB5Ufj" alt=""><figcaption><p>decryption algorithm</p></figcaption></figure>
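
To make the algorithm tangible without the sample at hand, here is a minimal, self-contained sketch of the byte-wise subtraction. It assumes a 16 byte key that is indexed with `i & 0xF`, as in my re-implementation; the key used here is a placeholder and not the key of the sample.

```python
def decrypt(cipher: bytes, key: bytes) -> bytes:
    """Subtract the rolling key byte from every cipher byte (mod 256)."""
    return bytes((c - key[i & 0xF]) & 0xFF for i, c in enumerate(cipher))


def encrypt(plain: bytes, key: bytes) -> bytes:
    """Inverse operation, only needed to build test data for the sketch."""
    return bytes((p + key[i & 0xF]) & 0xFF for i, p in enumerate(plain))


# Placeholder key for illustration only, NOT extracted from the sample
HYPOTHETICAL_KEY = bytes(range(1, 17))

cipher = encrypt(b"LdrGetProcedureAddress", HYPOTHETICAL_KEY)
print(decrypt(cipher, HYPOTHETICAL_KEY))
```

The `& 0xFF` keeps the result within one byte, mirroring the wrap-around of an 8 bit register.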

This is the script I created (with some help from ChatGPT) to extract the encrypted bytes from the raw binary and decrypt them with a re-implementation of the algorithm.

{% code expandable="true" %}

```python
import pefile, struct

pe = pefile.PE("gozi.dll")

# find offset to key
key_va = 0x40DBF6
key_rva = key_va - pe.OPTIONAL_HEADER.ImageBase
key_offset = pe.get_offset_from_rva(key_rva)


# -------------- find calls to decryption function --------------
# find offset of decryption function
decryption_va = 0x40BDF0
decryption_rva = decryption_va - pe.OPTIONAL_HEADER.ImageBase
decryption_offset = pe.get_offset_from_rva(decryption_rva)

# locate .text section
for section in pe.sections:
    if b'.text' in section.Name:
        text_data = pe.__data__[section.PointerToRawData:section.PointerToRawData + section.SizeOfRawData]
        text_va = pe.OPTIONAL_HEADER.ImageBase + section.VirtualAddress
        text_rva = text_va - pe.OPTIONAL_HEADER.ImageBase
        text_offset = pe.get_offset_from_rva(text_rva) # = 0x1000


# find all calls to decryption routine
calls = []
for i in range(len(text_data) - 4): # stop early so the 4-byte rel32 operand can always be read
    if text_data[i] == 0xE8: # opcode call
        rel = struct.unpack_from("<i", text_data, i + 1)[0]
        call_offset = text_offset + i
        target = call_offset + 5 + rel

        if target == decryption_offset:
            calls.append(call_offset)

'''
use found calls to extract cipher and length (moved onto stack before call to decryption function)
cipher => ESP+0x4, but LEA'ed some instructions before
length => ESP+0x8 for every call to decryption function
'''
# find length of specific cipher text

lengths = []
for call_offset in calls:
    start = call_offset - 32
    for i in range(start - text_offset, call_offset - text_offset):
        if text_data[i:i+4] == b'\xC7\x44\x24\x08': # opcode mov [ESP+0x8]
            length = struct.unpack_from('<I', text_data, i+4)[0]
            lengths.append(length)


# find ciphers with corresponding lengths
# https://www-user.tu-chemnitz.de/~heha/hs/chm/x86.chm/x64.htm#Registers
# https://www-user.tu-chemnitz.de/~heha/hs/chm/x86.chm/x64.htm#ModRM
LEA_ABS_MODRM = {0x05, 0x0D, 0x15, 0x1D, 0x2D, 0x35, 0x3D}
cipher_offset_array = []
for call_offset in calls:
    start = call_offset - 96
    for i in range(start - text_offset, call_offset - text_offset):
         # opcode LEA and MODRM destination register
        if text_data[i] == 0x8D and text_data[i+1] in LEA_ABS_MODRM:
            cipher_va = struct.unpack_from('<I', text_data, i+2)[0]
            cipher_rva = cipher_va - pe.OPTIONAL_HEADER.ImageBase
            cipher_offset = pe.get_offset_from_rva(cipher_rva)
            cipher_offset_array.append(cipher_offset)
            

ciphers = []
for cipher_offset, length in zip(cipher_offset_array, lengths):
    cipher = pe.__data__[cipher_offset : cipher_offset + length]
    ciphers.append(cipher)           

        
def decrypt_string(cipher):
    plain = bytearray()
    for i in range(len(cipher)):
        key_marker = (i & 0xF) + key_offset
        key =  pe.__data__[key_marker]
        plain.append((cipher[i] - key) & 0xFF)
    print(cipher + b" ----- " + plain)


for cipher in ciphers:
    decrypt_string(cipher)
```

{% endcode %}

<figure><img src="/files/F03iwdZNXaqXtfMicWxn" alt=""><figcaption><p>decrypted strings</p></figcaption></figure>

The decrypted strings match the strings automatically decoded by <kbd><mark style="color:yellow;">floss<mark style="color:yellow;"></kbd>. Generally, these APIs can be used to load additional APIs during runtime (via <kbd><mark style="color:orange;">LdrGetProcedureAddress<mark style="color:orange;"></kbd>) and to allocate additional regions of memory (via <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd>) to store unpacked code.

{% code title="<http://undocumented.ntinternals.net/index.html?page=UserMode%2FUndocumented%20Functions%2FExecutable%20Images%2FLdrGetProcedureAddress.html>" %}

```c
NTSYSAPI 
NTSTATUS
NTAPI

LdrGetProcedureAddress(
  IN HMODULE              ModuleHandle,
  IN PANSI_STRING         FunctionName OPTIONAL,
  IN WORD                 Oridinal OPTIONAL,
  OUT PVOID               *FunctionAddress );
```

{% endcode %}

After the API name, for example <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd>, has been decrypted, several additional functions are called to decrypt the name of the corresponding DLL (in the case of <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd>, <kbd><mark style="color:orange;">kernel32.dll<mark style="color:orange;"></kbd> is decrypted) and to dynamically load the API via <kbd><mark style="color:orange;">LdrGetProcedureAddress<mark style="color:orange;"></kbd>.

<figure><img src="/files/f8KsFIyrkfFiCN8lYTgr" alt=""><figcaption><p>procedure to load additional APIs</p></figcaption></figure>

### Unpacking

Since the usage of <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd> has been discovered, a debugger like <kbd><mark style="color:yellow;">x32dbg<mark style="color:yellow;"></kbd> can be used to let the malware perform the unpacking procedure. Afterwards, the mapped next stage's executable can be dumped, rebased and analyzed statically.

The goal is to place a strategic breakpoint on calls to <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd>. This allows us to observe the allocated memory regions during execution for signs of executable memory.

#### First hit

The first hit on <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd> happens right after the decryption of this API's name, since the API is called immediately after it has been loaded.

<figure><img src="/files/7gI2tPkbtjIZmQjFPuaH" alt=""><figcaption><p>usage of allocated space</p></figcaption></figure>

The bytes written into this location are then decrypted via the <kbd><mark style="color:$warning;">RC4<mark style="color:$warning;"></kbd> algorithm. The key used for the algorithm is <kbd><mark style="color:$warning;">26 8A 11 8F 27 D4 F6 E0 70 A7 64 0E AA 4A EB 01 F1 A0 83 65 EA 23 B9 2F 01 9B DA 5A EF CD 0B 0C<mark style="color:$warning;"></kbd>.
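
For offline decryption of a dumped blob, RC4 is easy to re-implement. The following is a minimal sketch of the standard RC4 algorithm (key scheduling plus keystream generation) together with the key shown above. How the malware feeds the data into the routine is not covered here, so the round trip at the end uses a hypothetical buffer.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; encryption and decryption are the same operation."""
    # key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # pseudo-random generation algorithm (PRGA), XORed with the input
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


# key observed during the first VirtualAlloc hit
KEY = bytes.fromhex(
    "26 8A 11 8F 27 D4 F6 E0 70 A7 64 0E AA 4A EB 01 "
    "F1 A0 83 65 EA 23 B9 2F 01 9B DA 5A EF CD 0B 0C"
)

# hypothetical buffer, stands in for the bytes dumped from the allocated region
blob = rc4(KEY, b"MZ placeholder")
print(rc4(KEY, blob))
```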

After the decryption, the malware continues until it hits <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd> a second time. The memory region allocated during this first hit plays a significant role later (see [Third hit](#third-hit)).

#### Second hit

Again, the API is decrypted and loaded via the above described procedure. The following call allocates a size of <kbd>0x6000</kbd> bytes and fills it immediately with bytes.

<figure><img src="/files/pMofYYMpMGh4qMaLBdhf" alt=""><figcaption><p>allocation and write into the 2nd memory region</p></figcaption></figure>

It becomes clear that the region now contains content that looks like a PE file. After the next function call the content looks even more like a valid PE file, because the correct magic bytes for PE files have been applied.

{% columns %}
{% column %}

<figure><img src="/files/4KHugMOnGw44gbe2uvg7" alt=""><figcaption></figcaption></figure>
{% endcolumn %}

{% column %}

<figure><img src="/files/rg3B4X4UkPpWiUQ357vd" alt=""><figcaption></figcaption></figure>
{% endcolumn %}
{% endcolumns %}
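
Whether a dumped region already "looks like" a PE file can also be checked programmatically. The sketch below assumes the usual layout (the `MZ` magic at offset 0, the `e_lfanew` field at offset 0x3C and the `PE\0\0` signature it points to); the buffer is synthetic and only mimics the two states shown in the screenshots.

```python
import struct


def looks_like_pe(buf: bytes) -> bool:
    """Check the DOS magic and the PE signature referenced by e_lfanew."""
    if len(buf) < 0x40 or buf[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    return buf[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


# synthetic buffer: header layout is present, magic bytes are still missing
region = bytearray(0x200)
struct.pack_into("<I", region, 0x3C, 0x80)
before = looks_like_pe(bytes(region))

# "apply" the magic bytes, as the malware does in a later step
region[0:2] = b"MZ"
region[0x80:0x84] = b"PE\x00\x00"
after = looks_like_pe(bytes(region))
print(before, after)
```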

Later, the malware leaves the function via a return-oriented jump in order to transfer execution to the allocated memory region via an indirect call.

<figure><img src="/files/yVLvWez6v1TWeYca51ip" alt=""><figcaption><p>indirect call to allocated memory region</p></figcaption></figure>

#### Third hit

The third hit of <kbd><mark style="color:orange;">VirtualAlloc<mark style="color:orange;"></kbd> occurs in the scope of the unpacked PE file. Again, a large region of memory has been allocated. To ensure that I don't miss any access to this region, I decided to set a hardware breakpoint at the beginning of this region, which triggers when accessed, regardless of whether it is read or write access.

<figure><img src="/files/PKLHPOQK9UKtoNYQlFfh" alt=""><figcaption><p>HW breakpoint hit, when writing a byte</p></figcaption></figure>

The bytes copied into this region come from the allocated region from the first hit (see above).

<figure><img src="/files/z10wQqa48IygQdEZ9Kha" alt=""><figcaption><p>procedure to copy bytes</p></figcaption></figure>

From here, I decided to let the malware do all the work and executed until return.

<figure><img src="/files/HSFcr7MI6Ee50ip3xM80" alt=""><figcaption><p>fully mapped executable</p></figcaption></figure>

#### Dump and rebase

To analyze the unpacked PE file statically, I dumped the whole region and rebased it with <kbd><mark style="color:yellow;">peBear<mark style="color:yellow;"></kbd>.

Afterwards we get a clean and readable PE file for further analysis.

{% columns %}
{% column %}

<figure><img src="/files/1zS5pmhwREYI0vnRt1Pz" alt=""><figcaption><p>before rebasing</p></figcaption></figure>
{% endcolumn %}

{% column %}

<figure><img src="/files/zgNXpRGmdEYBhw0kpVpn" alt=""><figcaption><p>after rebasing</p></figcaption></figure>
{% endcolumn %}
{% endcolumns %}

{% columns %}
{% column %}

<figure><img src="/files/Mgi0n6i6cSAy0ffb1ida" alt=""><figcaption><p>broken IAT</p></figcaption></figure>
{% endcolumn %}

{% column %}

<figure><img src="/files/TRR3Fl6Z0wEbcCQpxpjz" alt=""><figcaption><p>repaired IAT</p></figcaption></figure>
{% endcolumn %}
{% endcolumns %}

{% columns %}
{% column %}

<figure><img src="/files/STPwtiiov4ieEJlixcIy" alt=""><figcaption><p>broken Exports</p></figcaption></figure>
{% endcolumn %}

{% column %}

<figure><img src="/files/b9s97EdlFaRWqATq7ZUP" alt=""><figcaption><p>repaired Exports</p></figcaption></figure>
{% endcolumn %}
{% endcolumns %}

After unpacking, the malware proceeds to delete content from memory and to load additional DLLs (e.g. <kbd><mark style="color:orange;">advapi32.dll<mark style="color:orange;"></kbd>) via calls to <kbd><mark style="color:orange;">LdrLoadDll<mark style="color:orange;"></kbd>.

Highly notable are multiple calls to <kbd><mark style="color:orange;">VirtualProtect<mark style="color:orange;"></kbd>:\
the unpacked code copies itself into the allocated memory region and then takes a jump via <kbd><mark style="color:$success;">jmp eax<mark style="color:$success;"></kbd> to the entry point of the unpacked DLL (see screenshots below).

{% columns %}
{% column %}

<figure><img src="/files/1hB383GzTCgQ9j844Si4" alt=""><figcaption><p>jumping to...</p></figcaption></figure>
{% endcolumn %}

{% column %}

<figure><img src="/files/VpM87EDDETCE4p0cMcjg" alt=""><figcaption><p>...DLL entry function of unpacked DLL</p></figcaption></figure>
{% endcolumn %}
{% endcolumns %}

***

## Unpacked 2nd stage

### Triage

{% tabs %}
{% tab title="Imphash" %}
7c62ab7d5f2ed68e4989689e898c43c4
{% endtab %}

{% tab title="Imports" %}

<figure><img src="/files/OrUfWoqHroY9Bcy26oZO" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Exports" %}

<figure><img src="/files/YCWSWFwyBL4OgAglGtUb" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Interesting Strings" %}
era.dll\
SHLWAPI.dl\
USER32\
ABC\x00DEFGHIJK\x00LMNOPQRS\[T\xb1\xe0\`\xf6Z\x00abcdefgh\x00ijklmnop\x00qrstuvwx\x00yz012345\x026789+/
{% endtab %}

{% tab title="Capa" %}

<figure><img src="/files/yHhEWtfVczlYbcD7paYN" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Conclusion" %}

* 32 bit DLL
* high entropy section
* Dynamic API resolution?
  * run time linking
  * <kbd>USER32</kbd> and <kbd>SHLWAPI.dl</kbd> in strings
    {% endtab %}
    {% endtabs %}

### Analysis

From the unpacking phase we know which function is called after mapping the unpacked code into the memory region of the original, packed DLL.

#### First APC Injection

Shortly after the transfer of the execution flow to the unpacked code, the malware performs user space <kbd>APC Injection</kbd>.

<details>

<summary><a href="https://attack.mitre.org/techniques/T1055/004/">APC Injection</a></summary>

Threat actors use APC Injection to execute code within the process space of another process with the goal to evade detection or escalate privileges. Execution may be masked under a legitimate process.\
To do so, malicious code is put into the APC queue of the targeted thread and is executed when this thread enters an alertable state.

</details>

```c
DWORD QueueUserAPC(
[in] PAPCFUNC pfnAPC,
[in] HANDLE hThread,
[in] ULONG_PTR dwData
);
```

<figure><img src="/files/A2qh5D5IkoXRW3UU3syz" alt=""><figcaption><p>preparing APC injection</p></figcaption></figure>

At first, the malware creates a new thread for the <kbd><mark style="color:orange;">SleepEx<mark style="color:orange;"></kbd> function. Next, a function is put into the APC queue of the newly created thread. The parameters for the call to <kbd><mark style="color:orange;">SleepEx<mark style="color:orange;"></kbd>

```c
DWORD SleepEx(
  [in] DWORD dwMilliseconds, // 0
  [in] BOOL  bAlertable // 0 => FALSE
);
```

don't set the thread into an alertable state. However, <kbd><mark style="color:orange;">SleepEx<mark style="color:orange;"></kbd> [causes the thread to run in a suspended state](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-sleepex), which allows the malware to inject a malicious function via <kbd><mark style="color:orange;">QueueUserAPC<mark style="color:orange;"></kbd>. As soon as the thread starts, the queued APC is executed first, allowing the malware to execute additional procedures in a stealthy way. This procedure is similar to the <kbd>Early Bird</kbd> injection technique described [here](https://www.cyberbit.com/endpoint-security/new-early-bird-code-injection-technique-discovered/).

<figure><img src="/files/7NLAk5MY5N5Lg5kvgsNd" alt=""><figcaption><p>call stack of created SleepEx thread after APC injection</p></figcaption></figure>

#### Injected function

The malware uses the API <kbd><mark style="color:orange;">NtQuerySystemInformation<mark style="color:orange;"></kbd>. Notable here is the parameter <kbd>SystemProcessorPerformanceInformation</kbd>, which fills several <kbd>SYSTEM\_PROCESSOR\_PERFORMANCE\_INFORMATION</kbd> structs, based on the number of processors.

```c
typedef struct
_SYSTEM_PROCESSOR_PERFORMANCE_INFORMATION {
    LARGE_INTEGER IdleTime;
    LARGE_INTEGER KernelTime;
    LARGE_INTEGER UserTime;
    LARGE_INTEGER Reserved1[2];
    ULONG Reserved2;
} SYSTEM_PROCESSOR_PERFORMANCE_INFORMATION;
```

The <kbd>IdleTime</kbd> member of this struct is used to brute force a key for further decryption.

<figure><img src="/files/8SonvCtHnOj4kstP8Ltr" alt=""><figcaption><p>IdleTime member</p></figcaption></figure>

<figure><img src="/files/HYQzyNgbrLp76Bbg2W9U" alt=""><figcaption><p>brute forcing the key</p></figcaption></figure>

It doesn't take the malware long to brute force all 19 possible keys.

<details>

<summary>Explanation</summary>

Calculation modulo 0x13 (19 in decimal) returns 19 possible remainders, namely 0-18. Afterwards the status of the call to <kbd><mark style="color:orange;">NtQuerySystemInformation<mark style="color:orange;"></kbd> is added to this result, which is usually 0 after a successful call. Then 1 is added.

So, all values from 1 to 19 (0x01 - 0x13) are possible key candidates.

</details>
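
The key space described in the explanation can be verified with a short sketch. The `IdleTime` values are simulated here; which part of the `LARGE_INTEGER` the malware actually feeds into the modulo is not modelled.

```python
MODULUS = 0x13        # 19 in decimal
STATUS_SUCCESS = 0    # NTSTATUS of a successful NtQuerySystemInformation call


def key_candidate(idle_time: int, status: int = STATUS_SUCCESS) -> int:
    """remainder (0-18) + NTSTATUS (usually 0) + 1"""
    return (idle_time % MODULUS) + status + 1


# simulated IdleTime values, enough to hit every remainder
candidates = sorted({key_candidate(t) for t in range(1000)})
print([hex(c) for c in candidates])
```

Since the idle time changes constantly, the malware simply retries until the remainder produces the one value that decrypts the data correctly.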

<figure><img src="/files/BFAKqv5duSPEVEmVYMSk" alt=""><figcaption><p>hit, when key = 0x13</p></figcaption></figure>

Python script to decrypt the strings.

{% code expandable="true" %}

```python
import pefile
import struct

# extract encrypted bytes and virtual address of .bss section
def extract_BSS_section(file):
    for section in file.sections:
            name = section.Name.decode().rstrip('\x00')
            if name == ".bss":
                return section.get_data(), section.VirtualAddress
                
'''
slices the initial key (hard coded date) into two dwords and calculates the key used for the first decryption step.
The key generated uses the number from the modulo calculation and the VA of the .bss section
'''
def generate_key(data, key, va, number):
    key_first_dword = struct.unpack("<I", key[0:4])[0]
    key_second_dword = struct.unpack("<I", key[4:8])[0]
    key = key_first_dword + key_second_dword + va + number - 1
    return key

'''
decryption routine. Uses the encrypted bytes and the generated key for decryption of the first DWORD, extracted from the .bss section.
Subsequent decryption steps use the encrypted DWORD of the previous step to calculate the key.
'''
def decryption(encr, key):
    dword_prev = 0
    decrypted = b""
    # walk the buffer DWORD by DWORD; trailing bytes that do not fill a DWORD are ignored
    for i in range(0, len(encr) - 3, 4):
        
        dword = struct.unpack("<I", encr[i:i+4])[0]
        manipulator = (dword_prev - key) & 0xffffffff
        dword_prev = dword
        decrypted += struct.pack("<I", (dword + manipulator) & 0xffffffff)

    # pretty print output
    for s in decrypted.split(b'\x00'):
        clean = s.replace(b'\x00', b'').decode('utf-8', errors='ignore').strip()
        if len(clean) > 2 and clean.isprintable():
            print(clean)



key_init = b"Apr 26 2022"
filename = pefile.PE("dump2_rebased.bin")

# save return values into variables
encrypted_Bytes, bss_VA = extract_BSS_section(filename)

# loop through all possible number and brute force decryption
for number in range (1, 20):
    print("\n\nNumber: ", hex(number), "\n---------------------")
    key_new = generate_key(encrypted_Bytes, key_init, bss_VA, number)
    decryption(encrypted_Bytes, key_new)


```

{% endcode %}

<figure><img src="/files/LzMjILxiJwTh0OehJuPT" alt=""><figcaption><p>decryption script in action</p></figcaption></figure>
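
The chaining of the DWORDs is easier to follow with a round trip on known data. The following sketch re-implements the decryption step from the script and adds the inverse operation, which I derived from it purely for testing. The key is a hypothetical value, not the one generated from the sample.

```python
import struct

MASK = 0xFFFFFFFF


def decrypt_dwords(data: bytes, key: int) -> bytes:
    """plain[i] = cipher[i] + cipher[i-1] - key (mod 2^32), with cipher[-1] = 0"""
    out = bytearray()
    prev = 0
    for (c,) in struct.iter_unpack("<I", data):
        out += struct.pack("<I", (c + prev - key) & MASK)
        prev = c
    return bytes(out)


def encrypt_dwords(data: bytes, key: int) -> bytes:
    """Inverse of decrypt_dwords, only used to create test data."""
    out = bytearray()
    prev = 0
    for (p,) in struct.iter_unpack("<I", data):
        c = (p - prev + key) & MASK
        out += struct.pack("<I", c)
        prev = c
    return bytes(out)


HYPOTHETICAL_KEY = 0x12345678            # placeholder, not the real key
plain = b"NTDLL.DLL\x00\x00\x00"         # padded to a multiple of 4 bytes
cipher = encrypt_dwords(plain, HYPOTHETICAL_KEY)
print(decrypt_dwords(cipher, HYPOTHETICAL_KEY))
```

Because every step depends on the previous encrypted DWORD, a wrong key garbles the first DWORD and shifts every following one by the same error, which is why the brute force produces readable output for exactly one candidate.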

After the successful decryption has been verified by the malware itself, the malware copies the decrypted strings back into the <kbd>.bss</kbd> section, to make them accessible for the further course of events (e.g. dynamic API resolution during [second APC injection](#second-apc-injection)).

{% columns %}
{% column %}

<figure><img src="/files/azKgm9mkrlchc5M7lyUh" alt=""><figcaption><p>.bss section with encrypted strings</p></figcaption></figure>
{% endcolumn %}

{% column %}

<figure><img src="/files/qaoKsTskIf3vmv2xWZFH" alt=""><figcaption><p>.bss section with decrypted strings</p></figcaption></figure>
{% endcolumn %}
{% endcolumns %}

#### Second APC Injection

{% hint style="info" %}
At this stage of the analysis, I decided to focus on a more efficient approach and go for the low-hanging fruit.
{% endhint %}

After decrypting the strings, the malware sets up a second APC injection, identical to the one performed before. However, a different function is executed this time.

The goal of this injected function and the subsequent function calls is to decrypt a next stage DLL in order to [load it reflectively](https://attack.mitre.org/techniques/T1620/).

To do so, the malware accesses the <kbd>.reloc</kbd> section and the huge blob of encrypted bytes stored there. Later, the malware uses the decrypted strings stored in the <kbd>.bss</kbd> section to resolve several APIs during runtime. These are used to map the decrypted DLL into allocated memory.

<figure><img src="/files/PSTaezPLBeeOnoDY2Fjh" alt=""><figcaption><p>decrypted strings used for dynamic API resolution</p></figcaption></figure>

<figure><img src="/files/TCh3o6BHsSkwrARF2uqR" alt=""><figcaption><p>creating memory section for DLL</p></figcaption></figure>

The malware continues to access strings within this memory section with the goal to load specific libraries (e.g. <kbd><mark style="color:orange;">ntdll.dll<mark style="color:orange;"></kbd>, <kbd><mark style="color:orange;">oleaut32.dll<mark style="color:orange;"></kbd> and <kbd><mark style="color:orange;">kernel32.dll<mark style="color:orange;"></kbd>) and APIs from these libraries as preparation before transferring execution to the loaded, malicious DLL.

<figure><img src="/files/fWen1bZkN9F78bYW6G21" alt=""><figcaption><p>accessing strings within mapped DLL</p></figcaption></figure>

<figure><img src="/files/K0BGYqZs6n23j41G2p9D" alt=""><figcaption><p>transferring execution to mapped DLL</p></figcaption></figure>

***

## Final payload DLL

{% hint style="info" %}
Although not requested by the initial exercise, I decided to take a brief look at the supposed final payload, as far as this was possible in a strictly isolated environment with a running internet simulation.

This chapter does not claim to be comprehensive!
{% endhint %}

### Triage

{% tabs %}
{% tab title="Imphash" %}
0d41e840891676bdaee3e54973cf5a69
{% endtab %}

{% tab title="Interesting strings" %}
\<Various APIs and DLLs>\
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/\
Apr 26 2022
{% endtab %}

{% tab title="Imports" %}

<figure><img src="/files/k6TH7jvZoDt0iPoEcs9J" alt=""><figcaption></figcaption></figure>

A lot of imports are tagged <kbd>delay-loaded</kbd>, meaning they will be loaded during runtime.
{% endtab %}

{% tab title="Crypto" %}

<figure><img src="/files/D1DqpKJRol1XvKye21O2" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Entropy" %}

<figure><img src="/files/cKVpI8pWwRx6ET09Fnwk" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Communication" %}

<figure><img src="/files/TKS53Ky871m7QASDQ8pM" alt=""><figcaption></figcaption></figure>
{% endtab %}

{% tab title="Conclusion" %}

* Huge IAT
  * delay-loaded APIs
* high entropy sections
* multiple Crypto APIs and algorithms
* APIs from <kbd>wininet.dll</kbd> ⇒ network connections possible
  * receiving files from the internet
* a lot of interesting capabilities discovered by <kbd><mark style="color:yellow;">capa<mark style="color:yellow;"></kbd>
  {% endtab %}
  {% endtabs %}

I dumped, rebased and resized the DLL loaded reflectively multiple times. This obviously resulted in different SHA256 hashes for each approach. On the other hand, the <kbd>Imphash</kbd> is the same for every approach I took. It can be found on [Malware Bazaar](https://bazaar.abuse.ch/sample/413cf6a694eef7a4f1725a11938f1ab2df1957bfb3bf20cf6a47017bebbad2a9/).
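
The reason why the <kbd>Imphash</kbd> survives different dumps is that it only depends on the imported libraries and function names in import table order, not on the surrounding bytes. The following is a simplified sketch of the calculation; real implementations such as `get_imphash()` from `pefile` additionally resolve ordinal imports, which is omitted here. The import lists are hypothetical.

```python
import hashlib


def simple_imphash(imports):
    """MD5 over 'lib.func' pairs, lowercased and in import table order."""
    parts = []
    for dll, func in imports:
        name = dll.lower()
        for ext in (".dll", ".ocx", ".sys"):
            if name.endswith(ext):
                name = name[: -len(ext)]
                break
        parts.append(f"{name}.{func.lower()}")
    return hashlib.md5(",".join(parts).encode()).hexdigest()


# hypothetical import lists of two dumps of the same DLL
dump_a = [("KERNEL32.dll", "VirtualAlloc"), ("WININET.dll", "HttpSendRequestA")]
dump_b = [("kernel32.DLL", "virtualalloc"), ("wininet.dll", "httpsendrequesta")]
print(simple_imphash(dump_a) == simple_imphash(dump_b))
```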

***

### Analysis

#### String Decryption

Shortly after transferring execution to the DLL, the malware calls a function responsible for decrypting the strings stored in the <kbd>.bss</kbd> section at <kbd>offset+0x4b89</kbd>. As before, the malware brute forces the decryption, but this time it generates a semi-random number via a call to <kbd>GetSystemTimeAsFileTime</kbd> and subsequent calculations on the members of the <kbd>FILETIME</kbd> structure. The <kbd>\_aullrem</kbd> call is basically a modulo calculation: it is a runtime helper function that returns the remainder of an unsigned 64 bit division, not a CPU instruction. Afterwards, the malware searches for the <kbd>.bss</kbd> section and decrypts its content.

<figure><img src="/files/Owo8VGNqw9anmE7RH6vZ" alt=""><figcaption><p>calculation of a semi-random number</p></figcaption></figure>

<figure><img src="/files/WSN4ZFgbTEuk7gQEQ56y" alt=""><figcaption><p>decryption routine</p></figcaption></figure>
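
As a rough illustration of the building blocks involved, the sketch below converts a timestamp into the two DWORD members of a `FILETIME` structure (100 ns ticks since 1601-01-01 UTC) and mimics `_aullrem` as an unsigned 64 bit remainder. The modulus and the way both members are combined are assumptions for demonstration purposes; the exact calculation of the sample is not reproduced.

```python
from datetime import datetime, timezone

U64 = 0xFFFFFFFFFFFFFFFF
EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)


def to_filetime(dt: datetime):
    """Return (dwLowDateTime, dwHighDateTime) for a timezone-aware datetime."""
    delta = dt - EPOCH_1601
    ticks = (delta.days * 86400 + delta.seconds) * 10_000_000 + delta.microseconds * 10
    return ticks & 0xFFFFFFFF, ticks >> 32


def filetime_to_u64(low: int, high: int) -> int:
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


def aullrem(dividend: int, divisor: int) -> int:
    """Unsigned 64 bit remainder, the operation _aullrem provides."""
    return (dividend & U64) % (divisor & U64)


HYPOTHETICAL_MODULUS = 0x13   # assumption, borrowed from the previous stage
low, high = to_filetime(datetime.now(timezone.utc))
number = aullrem(filetime_to_u64(low, high), HYPOTHETICAL_MODULUS)
print(hex(number))
```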

The decrypted strings can be found below. Some of them are quite interesting, like:

* various format strings (possibly to fill in gathered system info)
* Registry keys (possibly used for Persistence)
* DLLs and APIs to be loaded dynamically
* commands (e.g. cmd.exe)
* execution through [<kbd>mshta.exe</kbd>](https://attack.mitre.org/techniques/T1218/005/) (possibly downloaded next stage?)

{% code lineNumbers="true" expandable="true" %}

```
invalidcert
overridelink
%08X-%04X-%04X-%04X-%08X%04X
KERNEL32.DLL
uValue
NTDLL.DLL
GetStringValue
ZwWriteVirtualMemory
LoadLibraryA
GetDWORDValue
CreateKey
hDefKey
ReturnValue
root\default
ZwWow64QueryInformationProcess64
.bin
StdRegProv
LdrUnregisterDll
Notificationform
SetStringValue
/images/
version=%u&soft=%u&user=%08x%08x%08x%08x&server=%u&id=%u&type=%u&name=%s
SetDWORDValue
GetBinaryValue
&action=%08x
SetBinaryValue
Content-Disposition: form-data; name="upload_file"; filename="%s"
DeleteKeys
SubKey
Names
ValueName--%s%ssValue--%s--
https://__ProviderArchitecture
POST
%systemroot%\system32\control.exe
{%08X-%04X-%04X-%04X-%08X%04X}
ZwSetContextThread
RtlNtStatusToDosErroropen
ZwWow64ReadVirtualMemory6464
ZwProtectVirtualMemory
%02u-%02u-%02u %02u:%02u:%02u
ZwGetContextThread
Mozilla/40 (compatible; MSIE 80; Windows NT %u%u%s)
kernelbase
%S%x
NTDSAPI.DLL
LdrRegisterDllNotification
S:(ML;;NW;;;LW)D:(A;;0x1fffff;;;WD)(A;;0x1fffff;;;S-1-15-2-1)(A;;0x1fffff;;;S-1-15-3-1)
%c%02X%s=%s&soft=%u&version=%u&user=%08x%08x%08x%08x&server=%u&id=%u&crc=%x&uptime=%u&%ssize=%u&hash=0x%08x
http://%u%u%u
Content-Type: multipart/form-data; boundary=%sContent-Disposition: form-data; name="upload_file"; 
filename="%4u%lu
"Content-Type: application/octet-stream
GET
CreateProcessA
ZwMapViewOfSection
ZwCreateSection
ZwUnmapViewOfSection
ZwClose
jpeg
gif
bmp
avi
Software\AppData
Low\Software\Microsoft\Â© 
2020 Microsoft Corporation All rights reserved
%systemroot%\system32\c_1252nls\*dll0123456789ABCDEF.exe
Software\Microsoft\Windows\CurrentVersion\
Rundll
Local\
Global\
&ip=%s
rundll32 "%s",
%S&os=%s%u%u_%u_%u_x%u&tor=1&dns=%s&whoami=%s%08x%08x%08x%08x
runas
cmd.exe
/C "copy "%s" "%s" /y && rundll32 "%s",%S"/C "copy "%s" "%s" /y && "%s" "%s"
"Low\MicrosoftWow64
EnableWow64FsRedirection
IsWow64Process
D:(D;OICI;GA;;;BG)(D;OICI;GA;;;AN)(A;OICI;GA;;;AU)(A;OICI;GA;;;BA)
@CODE@
HKCU
HKLM
IE10RunOnceLastShown_TIMESTAMP
/C ping localhost -n %u && del "%s"
SOFTWARE\Microsoft\Windows NT\CurrentVersion
InstallDate
%S=new ActiveXObject('WScript.Shell');%SRun('powershell new-alias -name %S -value gp; new-alias -name %S -value iex; %S ([SystemTextEncoding]::ASCIIGetString((%S "%S:\%S")%s))',0,0);mshta "about:<hta:application><script>%S='wscriptshell';resizeTo(0,2);eval(new ActiveXObject(%S)regread('%S\\\%S\\\%s'));if(!windowflag)close()</script>"
IE8RunOnceLastShown_TIMESTAMP
SOFTWARE\Microsoft\Internet Explorer\Main
VersionCheck_AssociationsnoHost:
%APPDATA%
avast.
```

{% endcode %}

#### Network Connections

With tools like <kbd><mark style="color:yellow;">capa<mark style="color:yellow;"></kbd> I was able to narrow down interesting behaviour and API calls performed by the malware. For example, the malware tries to reach different C2 addresses using APIs from <kbd>wininet.dll</kbd>.

The malware tries to connect to three different hosts (see [IOCs](#iocs)) with an encrypted URI via <kbd>HttpSendRequestA</kbd>.

<figure><img src="/files/wFk4NeC4FTmXKGBCLjUp" alt=""><figcaption><p>HTTP connection</p></figcaption></figure>

<figure><img src="/files/racfDQ08ue9HP4dnfaDY" alt=""><figcaption></figcaption></figure>

Due to a strictly isolated environment, further execution after the connection attempts could not be observed.

***

## Detection

### Yara

Because the whole infection chain happened in memory, I decided to focus on in-memory decrypted strings to hunt for ongoing infections. These rules are not designed to scan files on disk!

{% code expandable="true" %}

```json
rule Gozi_refl_DLL_loading
{
	meta:
		author = "txc"
		description = "Looks for signs of Gozi infection in memory. This stage is responsible for loading the next stage DLL reflectively."
		date = "2016-01-04"
		reliability = 90
		imphash = "7c62ab7d5f2ed68e4989689e898c43c4"
		original_packed_DLL = "0a66e8376fc6d9283e500c6e774dc0a109656fd457a0ce7dbf40419bc8d50936"
		reference = "https://txc.gitbook.io/documentation/writeups/unpacking-banking-trojan-gozi"
	strings:
		$format_string1 = "%08X-%04X-%04X-%04X-%08X%04X" nocase ascii wide
		$format_string2 = "%02u-%02u-%02u %02u:%02u:%02u" nocase ascii wide
		$sd1 = "D:(D;OICI;GA;;;BG)(D;OICI;GA;;;AN)(A;OICI;GA;;;AU)(A;OICI;GA;;;BA)" nocase ascii wide
		$sd2 = "S:(ML;;NW;;;LW)D:(A;;0x1fffff;;;WD)(A;;0x1fffff;;;S-1-15-2-1)(A;;0x1fffff;;;S-1-15-3-1)" nocase ascii wide
	
	condition:
		all of them
}


rule Gozi_DLL_loader
{
	meta:
		author = "txc"
		description = "DLL loaded and executed by previous stage. Possibly downloads a next stage from C2."
		date = "2016-01-04"
		reliability = 50
		imphash = "0d41e840891676bdaee3e54973cf5a69"
		original_packed_DLL = "0a66e8376fc6d9283e500c6e774dc0a109656fd457a0ce7dbf40419bc8d50936"
		reference = "https://txc.gitbook.io/documentation/writeups/unpacking-banking-trojan-gozi"
	strings:
		$format_string1 = "version=%u&soft=%u&user=%08x%08x%08x%08x&server=%u&id=%u&type=%u&name=%s" nocase ascii wide
		$format_string2 = "&action=%08x" nocase ascii wide
		$format_string3 = "filename=\"%s\"" ascii wide
		$format_string4 = "soft=%u&version=%u&user=%08x%08x%08x%08x&server=%u&id=%u&crc=%x\x00&uptime=%u\x00&%s\x00size=%u&hash=0x%08x" nocase ascii wide

		$activeX = "new ActiveXObject('WScript.Shell')" nocase ascii wide
		$UA = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT %u.%u%s)" nocase ascii wide
		$mshta_execution = "mshta \"about:<hta:application><script>%S='wscript.shell'" nocase ascii wide
	
	condition:
		all of them
}

```

{% endcode %}

<figure><img src="/files/VD7p2qMG2CXhuIxOPcHP" alt=""><figcaption></figcaption></figure>
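
For readers not familiar with the string modifiers used in the rules, the following sketch approximates in plain Python what `nocase ascii wide` combined with `all of them` means. It is not a replacement for YARA; the memory buffer is a hypothetical stand-in for a process dump.

```python
def matches(buf: bytes, needle: str, nocase: bool = True) -> bool:
    """Search for needle as ASCII or as UTF-16LE ("wide") byte sequence."""
    haystack = buf.lower() if nocase else buf
    text = needle.lower() if nocase else needle
    return text.encode("ascii") in haystack or text.encode("utf-16-le") in haystack


RULE_STRINGS = ["&action=%08x", "new ActiveXObject('WScript.Shell')"]

# hypothetical memory: one string stored wide and upper case, one in ASCII
memory = (
    b"\x90\x90"
    + "&ACTION=%08X".encode("utf-16-le")
    + b"\x00\x00"
    + b"new activexobject('wscript.shell')"
)

all_of_them = all(matches(memory, s) for s in RULE_STRINGS)
print(all_of_them)
```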

### IOCs

| Type           | Value                                                                       |
| -------------- | --------------------------------------------------------------------------- |
| IP:Port        | 185\[.]189\[.]151\[.]28:80                                                  |
| IP:Port        | 185\[.]189\[.]151\[.]70:80                                                  |
| Domain         | config.edge.skype.com                                                       |
| Request Method | GET                                                                         |
| User Agent     | Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 10.0)                         |
| URI            | /drew/\<encryptedSystemInfo>/\*<mark style="color:$danger;">**.jlk**</mark> |
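
The network IOCs can be turned into a simple filter for proxy or web logs. The following is a minimal sketch: the log entries and the function name are hypothetical, and the URI pattern is derived only from the single request format observed above, so it may be too strict or too loose for other campaigns.

```python
import re

# URI format from the table: /drew/<encryptedSystemInfo>/*.jlk
URI_PATTERN = re.compile(r"^/drew/.+\.jlk$")


def refang(value: str) -> str:
    """Turn a defanged indicator like 1[.]2[.]3[.]4 back into 1.2.3.4"""
    return value.replace("[.]", ".")


C2_IPS = {refang("185[.]189[.]151[.]28"), refang("185[.]189[.]151[.]70")}


def is_suspicious(host: str, uri: str) -> bool:
    return host in C2_IPS and bool(URI_PATTERN.match(uri))


# hypothetical log entries as (host, uri) tuples
log = [
    ("185.189.151.28", "/drew/aGVsbG8/d29ybGQ.jlk"),
    ("185.189.151.28", "/index.html"),
    ("example.org", "/drew/aGVsbG8/d29ybGQ.jlk"),
]
hits = [entry for entry in log if is_suspicious(*entry)]
print(hits)
```

<kbd>config.edge.skype.com</kbd> is left out of the host list on purpose in this sketch, because matching on that domain alone would very likely produce false positives.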

***

## Additional resources

* [Any.run task](https://app.any.run/tasks/e2b6e892-04b2-4d52-a38d-03776ab3e24c?p=676dd10fc87ff4bdf7f3b2e4)
* [About banking trojans](https://www.huntress.com/threat-library/malware/gozi-malware)
* Kyle Cucci - Evasive Malware
* Michael Sikorski, Andrew Honig - Practical Malware Analysis


