Saturday, 20 October 2012

Analysis of TDL4

The Dropper

Our lab has recently got its hands on a new sample of TDL4, also known as TDSS.

The sample is likely distributed as a dropper file named outlkupd.exe; its file size is 1,224 KB. Some of the components that it drops were compiled in July 2012, and some were compiled in September 2012 - so it is a relatively 'fresh' one.

The dropper is packed with an interesting packer that disguises the protected executable underneath as normal code, with a normal flow and innocent API calls. The 'normal' code produced by the protector is designed to fool AV engines: its entropy is fine, so it does not 'ring any bells' within AV heuristics; the APIs it imports are from kernel32.dll and the C++ run-time library only; and the algorithm itself displays the current time around the world, so no alarms from the AV emulators either:

push offset aPhoenixAzU_s_2 ; "Phoenix AZ (U.S.) : %2d:%02d\n%d"
sub word ptr [ebp-0Ch], 8A51h
call printf
push offset aBeijingMa2d02d ; "Beijing (MA) : %2d:%02d\n"
cmp dword ptr [ebp-20h], 0F001E96Fh
jz loc_4027D9
push offset aCurrentTimeAro ; "Current time around the World:"
call printf
The resources of the dropper indicate it's a 'shell extension library', having a dialog window in its GUI:

The only discrepancy is that while the sample has resources, their size is just 5 KB, while the data section is around 1.2 MB - so what does it carry in its protected luggage?

When run, the dropper allocates heap memory, unpacks the code into it and then jumps to it:

From there, it will reconstruct the code in its data section and then pass control back to it:

The code reconstructed in the data section now has an interesting characteristic - instead of a traditional flow, it assembles the pointers of its functions into a vector. Then, it enumerates the vector and calls each function by its pointer. If a function returns FALSE, the code quits. Some of the functions that it calls contain nested code that also assembles the pointers of other functions into a vector and then calls the functions from that vector. Thus, the code flow resembles a tree where all the nodes located on one level are called sequentially, and all of them have to return TRUE.

For example:

.data:0040A830 mov dword ptr [esi], offset run_query_ANTI_VM
.data:0040A841 mov [esp+20h+var_14], offset _check_usernames
.data:0040A849 cmp eax, esi
.data:0040A84B jnb short next_function_pointer
.data:0040A860 lea eax, [esp+20h+Memory]
.data:0040A864 call add_to_function_vector
.data:0040AB80 call_next:
.data:0040AB80 call dword ptr [ebx] ; CALL function
.data:0040AB82 test eax, eax
.data:0040AB84 jz short exit_loop
.data:0040AB86 add ebx, 4 ; advance index
.data:0040AB89 cmp ebx, esi ; check against the limit
.data:0040AB8B jnz short call_next ; CALL function
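
The dispatch pattern in the listing above can be sketched in a few lines of C++. This is only an illustration of the technique, not the dropper's code: the type alias Stage, the function run_stages() and the stand-in stages are hypothetical names introduced here.

```cpp
#include <vector>

// Hypothetical stage signature: each stage reports success (true) or failure (false).
using Stage = bool (*)();

// Calls every stage in order and stops at the first one that fails,
// mirroring the call / test eax,eax / jz loop in the listing.
bool run_stages(const std::vector<Stage>& stages) {
    for (Stage stage : stages) {
        if (!stage()) {
            return false;
        }
    }
    return true;
}

// Illustrative stand-ins for the real checks (names are made up).
static int g_calls = 0;
bool stage_ok()   { ++g_calls; return true; }
bool stage_fail() { ++g_calls; return false; }

// A stage may build and run its own nested vector, which gives the tree shape.
bool stage_nested() {
    ++g_calls;
    return run_stages({stage_ok, stage_ok});
}
```

Calling run_stages({stage_ok, stage_nested, stage_ok}) walks the whole tree, while a single failing stage anywhere on a level stops the walk at that point.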
The dropper then starts extracting resources from the encrypted stubs of its data section.

First, it extracts the resource "affid" of type "FILE", and decodes it into the string "540". Next, it extracts the "subid" resource of type "FILE", and decodes it into "direc47". Then it concatenates both strings into "540-direc47".

The string "outlkupd.exe" is then extracted from the resource "name" of type "PAIR".

To make sure there is only one instance of the dropper running, the code takes the string "ba1039e8cdae53e44ac3e6185b0871f3d031a476" (treated here as hard-coded; see the update at the end of this post) and appends "1010" to it to create a mutex, then appends "1011" to create an event:

  • creates mutex: Global\ba1039e8cdae53e44ac3e6185b0871f3d031a4761010

  • creates event: Global\ba1039e8cdae53e44ac3e6185b0871f3d031a4761011
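
The naming scheme above amounts to a simple concatenation. A minimal sketch follows; make_object_name() is a hypothetical helper, and how the dropper reacts to an already existing object is not shown in the analysis and is left out here.

```cpp
#include <string>

// Builds a kernel object name in the Global\ namespace from a base string
// and a numeric suffix ("1010" for the mutex, "1011" for the event).
std::string make_object_name(const std::string& base, const std::string& suffix) {
    return "Global\\" + base + suffix;
}
```

On Windows, the resulting strings would be the names handed to the mutex and event creation APIs.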

Following that, the dropper creates a copy of itself under the following names:
  • %TEMP%\outlkupd.exe

  • %TEMP%\[rnd].tmp

where %TEMP% is a temporary directory.

The dropper then checks the version of the operating system, and acts accordingly. If the OS is Windows 2000, Windows XP, Windows Server 2003, or Windows Server 2003 R2, it will run outlkupd.exe.

If the OS is Windows Vista or Windows Server 2008, it will extract the resources appverif.exe and vrfcore.dll, then run appverif.exe and wait for 10 seconds before continuing (the user is supposed to accept the UAC prompt during that time).

If the OS is Windows 7, Windows 8, Windows Server 2012 or Windows Server 2008 R2, the next step depends on the bitness of the OS, which the dropper determines by calling IsWow64Process() on its own process. On a 32-bit OS, it will extract the resource "stclient.dll" of type PROXY32 as %TEMP%\stclient.dll. On a 64-bit OS, it will instead extract the resource sqmapi.dll of type PROXY64 as %TEMP%\sqmapi.dll. Next, it constructs a long string that consists of the following lines:


where %SYSTEM% may look like "C:\WINDOWS\System32" on a 32-bit Windows or "C:\WINDOWS\SysWOW64" on a 64-bit OS.

The dropper will then attempt to run a process with the name composed from the lines above, with the CreateProcess() API.
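
The version-dependent branching described above can be summarised as a small decision function. This is a sketch under stated assumptions: the enum LaunchPath and both function names are hypothetical, and the version numbers are the standard Windows major/minor pairs (5.x for Windows 2000, XP and Server 2003; 6.0 for Vista and Server 2008; 6.1 for Windows 7 and Server 2008 R2; 6.2 for Windows 8 and Server 2012), not values read from the sample.

```cpp
#include <string>

// Hypothetical labels for the three branches described in the text.
enum class LaunchPath { RunDirectly, AppVerifProxy, DllProxyInjection, Unknown };

LaunchPath choose_launch_path(unsigned major, unsigned minor) {
    if (major == 5) {
        return LaunchPath::RunDirectly;        // 2000, XP, Server 2003 / R2
    }
    if (major == 6 && minor == 0) {
        return LaunchPath::AppVerifProxy;      // Vista, Server 2008
    }
    if (major == 6 && (minor == 1 || minor == 2)) {
        return LaunchPath::DllProxyInjection;  // 7 / 2008 R2, 8 / 2012
    }
    return LaunchPath::Unknown;                // anything else is not covered by the analysis
}

// Picks the proxy DLL dropped into %TEMP% in the injection branch.
std::string choose_proxy_dll(bool is64BitOs) {
    return is64BitOs ? "sqmapi.dll" : "stclient.dll";
}
```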

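The dropper then attempts to inject the dropped DLL into the launched process and run it there. The path of the dropped DLL is first written into the target process with VirtualAllocEx() and WriteProcessMemory(). The injection is then implemented by calling CreateRemoteThread() with the address of the LoadLibrary() API in kernel32.dll (obtained with GetProcAddress()) as the thread start routine, and the address of the written path as its parameter: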

.data:0040FAB0 call VirtualAllocEx
.data:0040FAB6 mov edi, eax
.data:0040FAB8 test edi, edi
.data:0040FABA jz short loc_40FA5D
.data:0040FABC push 0 ; lpNumberOfBytesWritten
.data:0040FABE push ebx ; nSize
.data:0040FABF lea ecx, [esp+14h+FileName]
.data:0040FAC3 push ecx ; lpBuffer
.data:0040FAC4 push edi ; lpBaseAddress
.data:0040FAC5 push esi ; hProcess
.data:0040FAC6 call WriteProcessMemory
.data:0040FACC test eax, eax
.data:0040FACE jz short loc_40FA5D
.data:0040FAD0 mov al, 65h
.data:0040FAD2 push offset dword_4261B4 ; lpModuleName
.data:0040FAD7 mov dword_4261B4, 'nrek'
.data:0040FAE1 mov byte_4261B8, al
.data:0040FAE6 mov dword_4261B9, '.23l'
.data:0040FAF0 mov dword_4261BD, 'lld'
.data:0040FAFA call GetModuleHandleA ; get handle of kernel32.dll
.data:0040FB00 push offset dword_4261C4 ; lpProcName
.data:0040FB05 push eax ; hModule
.data:0040FB06 mov dword_4261C4, 'daoL'
.data:0040FB10 mov dword_4261C8, 'rbiL'
.data:0040FB1A mov dword_4261CC, 'Wyra'
.data:0040FB24 mov byte_4261D0, 0
.data:0040FB2B call GetProcAddress ; get ptr to LoadLibrary
.data:0040FB31 mov ecx, eax ; lpStartAddress
.data:0040FB33 mov eax, edi
.data:0040FB35 mov edx, esi ; hProcess
.data:0040FB37 call _CreateRemoteThread ; run LoadLibrary(path to DLL)
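
The listing above never references the API names as plain strings; it writes them into memory a few bytes at a time. The sketch below replays those stores to show what ends up in the buffers. The helper names pack(), store_dword(), rebuild_module_name() and rebuild_proc_name() are made up for this illustration; the offsets follow the addresses in the listing (4261B4, 4261B8, 4261B9, 4261BD and 4261C4 to 4261D0).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Packs four characters the way the disassembler prints them: 'nrek' has 'n'
// in the most significant byte and 'k' in the least significant one.
constexpr std::uint32_t pack(char b3, char b2, char b1, char b0) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(b3)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b2)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b1)) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(b0));
}

// Emulates an x86 `mov dword ptr [buf+offset], value`: least significant byte first.
void store_dword(unsigned char* buf, std::size_t offset, std::uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        buf[offset + i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
    }
}

std::string rebuild_module_name() {
    unsigned char buf[16] = {0};
    store_dword(buf, 0, pack('n', 'r', 'e', 'k'));   // dword_4261B4
    buf[4] = 0x65;                                   // byte_4261B8, the letter 'e'
    store_dword(buf, 5, pack('.', '2', '3', 'l'));   // dword_4261B9
    store_dword(buf, 9, pack('\0', 'l', 'l', 'd'));  // dword_4261BD, includes the terminator
    return std::string(reinterpret_cast<const char*>(buf));
}

std::string rebuild_proc_name() {
    unsigned char buf[16] = {0};
    store_dword(buf, 0, pack('d', 'a', 'o', 'L'));   // dword_4261C4
    store_dword(buf, 4, pack('r', 'b', 'i', 'L'));   // dword_4261C8
    store_dword(buf, 8, pack('W', 'y', 'r', 'a'));   // dword_4261CC
    buf[12] = 0;                                     // byte_4261D0
    return std::string(reinterpret_cast<const char*>(buf));
}
```

The two buffers come out as "kernel32.dll" and "LoadLibraryW", so the remote thread starts at the wide-character variant of LoadLibrary().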
Following the steps above, the dropper starts checking if it is running under a virtual machine. To achieve that, it uses Windows Management Instrumentation (WMI) to query a number of key system parameters.

For example, to see if it is running under the QEMU emulator, it runs the following WMI queries:

  • SELECT * FROM Win32_Processor WHERE Name LIKE "%QEMU%"

  • SELECT * FROM Win32_BIOS WHERE Manufacturer LIKE "%QEMU%"

  • SELECT * FROM Win32_DiskDrive WHERE Model LIKE "%QEMU%"

Other performed queries are:
  • SELECT * FROM Win32_Processor WHERE Name LIKE "%Bochs%"

  • SELECT * FROM Win32_DiskDrive WHERE Model LIKE "%Red Hat%"

  • SELECT * FROM Win32_SCSIController WHERE Manufacturer LIKE "%Xen%"

  • SELECT * FROM Win32_ComputerSystem WHERE Manufacturer LIKE "%Parallels%"

  • SELECT * FROM Win32_DiskDrive WHERE Model LIKE "%Virtual HDD%"

  • SELECT * FROM Win32_DiskDrive WHERE Model LIKE "%VMware%"

  • SELECT * FROM Win32_ComputerSystem WHERE Manufacturer LIKE "%Microsoft%"

  • SELECT * FROM Win32_BIOS WHERE Manufacturer LIKE "%innotek%"

  • SELECT * FROM Win32_DiskDrive WHERE Model LIKE "%VBOX%"

Apart from the queries above, the dropper also checks the existence of several pre-defined hashes of the user names, system drivers, OS serial numbers and install dates to see if it is running under a known controlled environment or not, by using the WMI objects:

  • Win32_Process (Name)

  • Win32_UserAccount (Name)

  • Win32_SystemDriver (Name)

  • Win32_OperatingSystem (SerialNumber, InstallDate)

If the dropper detects it is running under a sandbox, it quits.
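
The WMI probes listed above all share one shape, so they can be expressed as a table plus a query builder. The sketch below is an assumption about structure only: VmProbe, build_query(), vm_probes() and looks_virtual() are hypothetical names, and the callback stands in for whatever code actually executes a WQL query and reports whether it returned any rows.

```cpp
#include <functional>
#include <string>
#include <vector>

// One probe: WMI class, property to test, and the marker substring.
struct VmProbe {
    const char* wmiClass;
    const char* property;
    const char* marker;
};

std::string build_query(const VmProbe& p) {
    return std::string("SELECT * FROM ") + p.wmiClass + " WHERE " +
           p.property + " LIKE \"%" + p.marker + "%\"";
}

// The twelve probes quoted in the text.
const std::vector<VmProbe>& vm_probes() {
    static const std::vector<VmProbe> probes = {
        {"Win32_Processor", "Name", "QEMU"},
        {"Win32_BIOS", "Manufacturer", "QEMU"},
        {"Win32_DiskDrive", "Model", "QEMU"},
        {"Win32_Processor", "Name", "Bochs"},
        {"Win32_DiskDrive", "Model", "Red Hat"},
        {"Win32_SCSIController", "Manufacturer", "Xen"},
        {"Win32_ComputerSystem", "Manufacturer", "Parallels"},
        {"Win32_DiskDrive", "Model", "Virtual HDD"},
        {"Win32_DiskDrive", "Model", "VMware"},
        {"Win32_ComputerSystem", "Manufacturer", "Microsoft"},
        {"Win32_BIOS", "Manufacturer", "innotek"},
        {"Win32_DiskDrive", "Model", "VBOX"},
    };
    return probes;
}

// Returns true as soon as any probe query yields at least one row.
// hasRows is a hypothetical callback that runs the query against WMI.
bool looks_virtual(const std::function<bool(const std::string&)>& hasRows) {
    for (const VmProbe& p : vm_probes()) {
        if (hasRows(build_query(p))) {
            return true;
        }
    }
    return false;
}
```

The same table is handy on the defensive side: it documents exactly which strings a sandbox image would have to hide to get past these checks.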

Finally, the dropper extracts a malicious Master Boot Record and Volume Boot Record, and installs them along with other components in the system.

In order to extract all the resources from the dropper, the first task is to find out what resource names and types the dropper has. These names can be seen in the memory dump below:

Next, the dropper can be debugged and forcefully directed into the flow where the resource extraction takes place:

Then, the parameters of the function that extracts the resources are patched accordingly:

After the extraction, the ECX will point into the extracted data, as shown below in case of the MBR resource:

By following the trick above, all the other resources can now be extracted from the dropper for further analysis:

  • "BIN"
    • MBR (440 bytes)

    • VBR (512 bytes)

  • "FILE"

    • BOOT (1,515 bytes)

    • DBG32 (6,656 bytes)

    • DBG64 (9,088 bytes)

    • DRV32 (37,888 bytes)

    • DRV64 (42,496 bytes)

    • CMD32 (25,088 bytes)

    • CMD64 (43,520 bytes)

    • LDR32 (6,144 bytes)

    • LDR64 (5,632 bytes)

    • MAIN (3,809 bytes)

    • AFFID - "540"

    • SUBID - "direc47"

    • TDI32 (12,800 bytes)

    • TDI64 (16,384 bytes)

  • "PAIR"

    • NAME - "outlkupd.exe"

    • BUILD (37 bytes)

  • "PROXY32"

    • APPVERIF.EXE (173,504 bytes)

    • VRFCORE.DLL (43,520 bytes)

    • STCLIENT.DLL (241,664 bytes)

    • SQMAPI.DLL (43,520 bytes)

  • "PROXY64"

    • SQMAPI.DLL (49,664 bytes)

Once all the checks have been made and all the resources installed, along with the infected MBR and VBR, the dropper reboots the system.


Apart from several malicious components, the dropper contains some legitimate resources as well (the PROXY32 and PROXY64 ones). These 'proxy' resources are likely dropped and run in order to trick behavioural analysis systems into believing that the sample in question does not only look nice statically (as explained above), but the files that it drops dynamically are also legitimate, some with valid digital signatures.

This stealthiness makes the extremely vicious TDL4 also very crafty, as it manages to bypass many AV solutions on the spot.


The MAIN resource of TDL4 is a text configuration file. Among other sections, it contains a list of the command and control (C&C) servers that are used for communications:


The server names are in base-64 form, but the unpacked strings are encrypted. How to decrypt them? Let's analyse the following resource first.


The 32-bit CMD32 component, as well as its 64-bit counterpart CMD64, is a DLL that is injected into the memory of a system process, according to the configuration of TDL4. For example, if the MAIN configuration file specifies the section below, then CMD32 can be found within the system process svchost.exe:


The infected system indeed has svchost.exe with a memory area where CMD32 runs from, along with the mapped configuration file itself.

When the CMD32 DLL is run, it checks to make sure it is loaded either into services.exe or svchost.exe. Next, it makes sure that the explorer.exe process is also running. If explorer.exe is not running, the code sleeps for 3 minutes, then checks again.

The DLL then retrieves the image base of its host process and starts looking for the APIs called Init() and Uninit(). If found, the pointers of these functions are retrieved and remembered.

Through static code analysis, it is assumed that the Init() and Uninit() functions are implemented within one of the missing modules that the DLL can download from the command and control servers. The functions are responsible for extracting a configuration file (similar to MAIN) from the JPEG images hosted on the blog sites specified in the MAIN configuration file. The JPEG images conceal the configuration data by means of steganography.

The DLL deletes the registry key:


Next, it deletes a file (presumably a legitimate driver file used by MalwareBytes):


Following that, it applies an XOR 81 operation to four 256-byte hard-coded seed values. These seeds are then used to initialise four RC4 keys. These keys are then used for encryption/decryption purposes.

The code locates the MAIN section mapped in the process memory and extracts the server list enclosed within [servers_begin] and [servers_end] tags. These names are then unpacked from base-64 and decrypted with RC4 algorithm. Key #3 is used to decrypt the server names below:




At the time of writing, the first domain does not resolve to an IP, the second one resolves to an IP located in Ukraine, and the third domain resolves to an IP located in the Netherlands.
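
The tag-delimited extraction step described above can be sketched as follows. The exact layout of the MAIN file is not reproduced in this post, so the sketch assumes one base-64 entry per line between the two tags; extract_block() is a hypothetical helper name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the non-empty, trimmed lines found between beginTag and endTag.
// Assumption: entries are separated by line breaks (LF or CRLF).
std::vector<std::string> extract_block(const std::string& config,
                                       const std::string& beginTag,
                                       const std::string& endTag) {
    std::vector<std::string> entries;
    std::size_t start = config.find(beginTag);
    if (start == std::string::npos) {
        return entries;
    }
    start += beginTag.size();
    const std::size_t end = config.find(endTag, start);
    if (end == std::string::npos) {
        return entries;
    }
    std::istringstream lines(config.substr(start, end - start));
    std::string line;
    while (std::getline(lines, line)) {
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            continue;  // blank line
        }
        const std::size_t last = line.find_last_not_of(" \t\r");
        entries.push_back(line.substr(first, last - first + 1));
    }
    return entries;
}
```

Each returned entry would then be unpacked from base-64 and decrypted with RC4. The same helper works for any other tag pair, such as [modules_begin] and [modules_end].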

CMD32 then requests a new configuration file from the C&C servers above, by constructing a line with several parameters, as shown below:


The affid/subid parameters are used to identify the client of the TDL4 network, while the mode parameter specifies that a new configuration is needed.

The parameters above are mixed with the random value parameters in order to avoid cached server responses.

The URL parameters line is encrypted with RC4, by using the key #2, then packed into base-64 format.
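
The encrypt-then-encode step can be sketched portably, without any Windows APIs. This is a minimal illustration of the two building blocks (textbook RC4 followed by standard base-64 with padding); the function names are hypothetical and the key used in any experiment is a placeholder, not TDL4's key #2.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

Bytes to_bytes(const std::string& s) { return Bytes(s.begin(), s.end()); }

// Textbook RC4: key scheduling, then keystream XOR. The key must not be empty.
// Encryption and decryption are the same call.
Bytes rc4(const Bytes& key, Bytes data) {
    std::uint8_t s[256];
    for (int i = 0; i < 256; ++i) s[i] = static_cast<std::uint8_t>(i);
    std::uint8_t j = 0;
    for (int i = 0; i < 256; ++i) {
        j = static_cast<std::uint8_t>(j + s[i] + key[i % key.size()]);
        std::swap(s[i], s[j]);
    }
    std::uint8_t x = 0, y = 0;
    for (std::uint8_t& b : data) {
        x = static_cast<std::uint8_t>(x + 1);
        y = static_cast<std::uint8_t>(y + s[x]);
        std::swap(s[x], s[y]);
        b ^= s[static_cast<std::uint8_t>(s[x] + s[y])];
    }
    return data;
}

// Standard base-64 alphabet with '=' padding, no line breaks.
std::string base64_encode(const Bytes& in) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        std::uint32_t chunk = static_cast<std::uint32_t>(in[i]) << 16;
        if (i + 1 < in.size()) chunk |= static_cast<std::uint32_t>(in[i + 1]) << 8;
        if (i + 2 < in.size()) chunk |= static_cast<std::uint32_t>(in[i + 2]);
        out += tbl[(chunk >> 18) & 63];
        out += tbl[(chunk >> 12) & 63];
        out += (i + 1 < in.size()) ? tbl[(chunk >> 6) & 63] : '=';
        out += (i + 2 < in.size()) ? tbl[chunk & 63] : '=';
    }
    return out;
}

// RC4-encrypts a parameter line and packs the ciphertext into base-64.
std::string encode_parameters(const Bytes& key, const std::string& line) {
    return base64_encode(rc4(key, to_bytes(line)));
}
```

Because RC4 is symmetric, running rc4() again with the same key over the decoded bytes restores the original line.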

Once the configuration file is returned by the server, it is decrypted with the RC4 key #1, then scanned for the presence of the tags [SCRIPT_SIGNATURE_CHECK] and [SCRIPT_SIGNATURE_CHECK_END], then mapped back into the host process memory.

The DLL then requests an updated CMD32/CMD64 module from the servers with the following parameter line (also mixed with the random value parameters):


The line above is then paranoidly mixed with 5 more random value parameters to avoid server response caching.

In order to be able to decrypt configurations from the back-up servers (rogue blog posts), the DLL downloads the COM32/COM64 module with the URL parameter:


The same parameter is used to request additional modules specified in the configuration section enclosed with the tags [modules_begin] and [modules_end].

To request an updated DRV32/DRV64 module, the DLL constructs the line:


If the host process is terminated, it is immediately restarted, with CMD32 and the configuration data re-injected into it.

Replicating TDL4 Encryptor Logic

By replicating the TDL4 encryptor (and also the decryptor, as RC4 is a symmetric algorithm), it is possible to build a tool that allows quick reading of its configuration data - mainly, the encrypted C&C servers.

Apart from that, in case there are NXDomains (non-registered domains) specified in TDL4 configuration, it is possible to build a sinkhole that will 'talk nerdy' to the connected victims (do you happen to know what 'GeckaSeka' means?). That is, it will understand their language and talk to them in their own protocol.

Knowing how to encrypt the URL parameters will also allow requesting additional components from C&C (such as COM32 module, required for decrypting configuration data concealed within the blog posts' JPEG images).

Let's build one.

Our decryptor will decrypt the C&C server name specified in the configuration, then we'll encrypt it back to see if the same original string is produced (results 1-2).

Next, we'll decrypt the URL parameter found in the TDL4 traffic below, and try encrypting our own URL parameter that requests the COM32 module from C&C (results 3-4).

The decryptor will first 'prepare' the seeds, then make the following calls:

prepare_seed(seed2); // key #2: URL parameters
prepare_seed(seed3); // key #3: config data (C&C server names)

char szResult1[MAX_PATH];
decrypt("7qV2SXF7gv9aKUlN8xMNwdd+nRXbjQ==", szResult1, seed3);

char szResult2[MAX_PATH];
encrypt(szResult1, szResult2, seed3); // re-encrypt the decrypted server name

char szResult3[MAX_PATH];
decrypt("JsVoFkw7PZy+4vo3+ou9d1kWdiq24NpkflIKQDuOUT9+EkJJ2iGaADle3jviKC4VYu/y6B7FyXXOk2EKT...", szResult3, seed2);

char szResult4[MAX_PATH];
encrypt("mode=mod&filename=com32", szResult4, seed2);

where prepare_seed() XORs each seed byte with its index plus 81:

void prepare_seed(LPBYTE seed)
{
    for (int i = 0; i < 256; i++)
        seed[i] ^= (BYTE)(i + 81); // XOR value depends on the byte index
}

The seeds themselves are hard-coded within the CMD32 binary; they can be copy-pasted from it as:

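BYTE seed2[256] = { /* 256 bytes copied from the CMD32 binary (not reproduced here) */ };

BYTE seed3[256] = { /* 256 bytes copied from the CMD32 binary (not reproduced here) */ };
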
The RC4 algorithm itself can shamelessly be 'borrowed' from the ZeuS source code:

typedef struct
{
    BYTE state[256];
    BYTE x;
    BYTE y;
} RC4KEY;

#define swap_byte(a, b) {swapByte = a; a = b; b = swapByte;}

// Key scheduling: fills the state table and mixes the key into it.
void rc4_init(const void *binKey, WORD binKeySize, RC4KEY *key)
{
    register BYTE swapByte;
    register BYTE index1 = 0, index2 = 0;
    LPBYTE state = &key->state[0];
    register WORD i;

    key->x = 0;
    key->y = 0;

    for (i = 0; i < 256; i++)
        state[i] = (BYTE)i;
    for (i = 0; i < 256; i++)
    {
        index2 = (((LPBYTE)binKey)[index1] + state[i] + index2) & 0xFF;
        swap_byte(state[i], state[index2]);
        if (++index1 == binKeySize)
            index1 = 0;
    }
}

// Keystream generation and XOR; the same call encrypts and decrypts.
void rc4_crypt(void *buffer, DWORD size, RC4KEY *key)
{
    register BYTE swapByte;
    register BYTE x = key->x;
    register BYTE y = key->y;
    LPBYTE state = &key->state[0];

    for (register DWORD i = 0; i < size; i++)
    {
        x = (x + 1) & 0xFF;
        y = (state[x] + y) & 0xFF;
        swap_byte(state[x], state[y]);
        ((LPBYTE)buffer)[i] ^= state[(state[x] + state[y]) & 0xFF];
    }

    key->x = x;
    key->y = y;
}

The base-64 packer/unpacker along with the decrypt/encrypt functions can then be declared as:

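#include <windows.h>
#include <wincrypt.h>
#include <string.h>
#pragma comment (lib, "Crypt32.lib")

// Note: the argument lists of the two Crypt* calls were cut off in this listing;
// they are restored here from the documented API signatures (ANSI build assumed).
int ToBase64Crypto(const BYTE* pSrc, int nLenSrc, char* pDst, int nLenDst)
{
    DWORD nLenOut = nLenDst;
    BOOL fRet = CryptBinaryToString((const BYTE*)pSrc, nLenSrc,
                                    CRYPT_STRING_BASE64 | CRYPT_STRING_NOCRLF,
                                    pDst, &nLenOut);
    if (!fRet)
        nLenOut = 0; // failed
    return nLenOut;
}

int FromBase64Crypto(const BYTE* pSrc, int nLenSrc, char* pDst, int nLenDst)
{
    DWORD nLenOut = nLenDst;
    BOOL fRet = CryptStringToBinary((LPCSTR)pSrc, nLenSrc,
                                    CRYPT_STRING_BASE64,
                                    (BYTE*)pDst, &nLenOut,
                                    NULL, NULL); // pdwSkip and pdwFlags are not needed
    if (!fRet)
        nLenOut = 0; // failed
    return nLenOut;
}

void decrypt(char *szBase64, char *szResult, LPBYTE seed)
{
    memset(szResult, 0, MAX_PATH);
    int iResultLength = FromBase64Crypto((LPBYTE)szBase64, (int)strlen(szBase64),
                                         szResult, MAX_PATH - 1);

    RC4KEY rc4_key;
    rc4_init(seed, 256, &rc4_key);
    rc4_crypt((LPBYTE)szResult, iResultLength, &rc4_key);
}

void encrypt(char *szStringToEncrypt, char *szResult, LPBYTE seed)
{
    char szTemp[MAX_PATH];
    strcpy_s(szTemp, MAX_PATH, szStringToEncrypt);
    int iLength = (int)strlen(szTemp); // measured before encrypting: ciphertext may contain zero bytes

    RC4KEY rc4_key;
    rc4_init(seed, 256, &rc4_key);
    rc4_crypt((LPBYTE)szTemp, iLength, &rc4_key);

    memset(szResult, 0, MAX_PATH);
    int iResultLength = ToBase64Crypto((LPBYTE)szTemp, iLength, szResult, MAX_PATH);
}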

Running the tool produces the expected results:

Update: the string ba1039e8cdae53e44ac3e6185b0871f3d031a476 is not hard-coded, it's a SHA-1 hash derived from the system footprint (Windows OS, product ID); thanks to the user 0x16/7ton for correction!