Thursday, 5 September 2013

The Merchant of Venice Marches on Italy

As if to prove its name, the latest variant of Shylock has now extended its geography to cover Italian banks. Quite an ironic twist, isn't it?

Armed with an improved protection layer, it is now harder to detect too, fetching only 2 detections out of 45.

The anti-VM tricks employed by Shylock can fortunately be defeated by StrongOD, a handy plugin for OllyDbg. So let's roll up our sleeves and give it a closer look.

The previous post on Shylock provided an overview of its operation and its encryption schemes. This time we aim to reconstruct the entire encryption/decryption algorithm in a stand-alone tool, one that will allow us to fetch and then decrypt Shylock configuration files along with the so-called 'Inject Packs'.

Shylock configuration files normally list the current command-and-control (C&C) servers along with the location of the 'Inject Packs' - larger configurations that define browser injection logic, that is, what banks to target and how. By downloading and decrypting configuration files from the known live C&C servers, we'll be able to find out what the newly registered C&C servers are. By fetching 'Inject Packs' from these servers, we'll know what new tricks are implanted there and what new regions are being targeted.

The C&C domains that we thus detect will be handy in monitoring the traffic. Since all communications are SSL, they can't be sniffed, but the presence of the Shylock domains in the traffic is a sure sign of 'Houston, we have a problem'.

The sample we've analysed contains a built-in configuration stub that lists 3 hard-coded C&C servers followed by 2 backup C&C servers:

  • uphebuch.su

  • oonucoog.cc

  • ahthuvuz.cc

  • wsysinfonet.su

  • statinfo.cc

The first 3 C&C servers are now down, but the backup ones both point to the same IP address (217.172.170.220) in Germany.
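For traffic monitoring, the five hard-coded domains above can be turned into a simple hostname check. The following is only a minimal sketch: the function names are hypothetical, and it assumes the hostnames have already been extracted from the traffic (for instance from DNS queries), which is outside the scope of this example. It matches a listed domain or any subdomain of it, ignoring case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* C&C domains hard-coded in the analysed sample. */
static const char *const kShylockDomains[] = {
    "uphebuch.su", "oonucoog.cc", "ahthuvuz.cc",
    "wsysinfonet.su", "statinfo.cc",
};

/* Case-insensitive comparison of two ASCII runs of length n. */
int equal_nocase(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* Returns 1 if host is a listed domain or a subdomain of one, else 0. */
int is_shylock_host(const char *host)
{
    assert(host != NULL);
    size_t hlen = strlen(host);
    size_t count = sizeof kShylockDomains / sizeof kShylockDomains[0];

    for (size_t i = 0; i < count; ++i) {
        size_t dlen = strlen(kShylockDomains[i]);
        if (hlen < dlen)
            continue;
        const char *tail = host + (hlen - dlen);
        if (!equal_nocase(tail, kShylockDomains[i], dlen))
            continue;
        /* Exact match, or the match starts right after a label dot. */
        if (hlen == dlen || tail[-1] == '.')
            return 1;
    }
    return 0;
}
```

The label-boundary check keeps an unrelated name that merely ends in the same characters from matching.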

Sending this server a packet encrypted the same way as we did last time no longer works: the server just returns a string containing our own IP address. So clearly something has changed. To find out what, we need to reconstruct the entire communication logic of Shylock step by step, combining dynamic and static analysis of the sample. First, we dumped the memory heap pages where the Shylock executable had unpacked itself. Next, we decrypted all the strings in that dump (753 strings) and built a table of hashes of all APIs from all modules loaded by Shylock (28,500 hashes). After that, we were able to reverse engineer its new logic, and this is what we found:

The Shylock request now needs to be submitted via 'POST'. In addition, the C&C server now requires that the User-Agent header provided be formatted as:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.<NUMBER>)

where <NUMBER> is a 4-digit number composed of the digits collected from the bot ID string, from left to right.

For example, if the bot ID was "6A3B21C...", then the <NUMBER> field in the User-Agent string above should be "6321". If it's not, the server replies with an IP address of the connected client (our own IP), an indication that the server has rejected the connection as unauthorised.
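The rule above can be sketched in a few lines of C. This is an illustration, not the code of the released tool: the function names are hypothetical, and it assumes that only the first four decimal digits of the bot ID are used and that an ID with fewer than four digits is simply rejected (the sample's behaviour in that case was not examined here).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Collect the first four decimal digits of bot_id, left to right.
 * Returns 0 on success, -1 if the ID holds fewer than four digits. */
int bot_id_digits(const char *bot_id, char out[5])
{
    size_t n = 0;
    assert(bot_id != NULL);
    for (const char *p = bot_id; *p != '\0' && n < 4; ++p) {
        if (isdigit((unsigned char)*p))
            out[n++] = *p;
    }
    out[n] = '\0';
    return n == 4 ? 0 : -1;
}

/* Format the User-Agent value described above into buf.
 * Returns 0 on success, -1 on a short bot ID or a too-small buffer. */
int build_user_agent(const char *bot_id, char *buf, size_t buf_len)
{
    char digits[5];
    if (bot_id_digits(bot_id, digits) != 0)
        return -1;
    int written = snprintf(buf, buf_len,
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; "
        ".NET CLR 1.0.%s)", digits);
    return (written < 0 || (size_t)written >= buf_len) ? -1 : 0;
}
```

For a bot ID beginning with "6A3B21C", build_user_agent() produces a header value ending in ".NET CLR 1.0.6321)", matching the example above.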

Fetching the configuration file from statinfo.cc returns a new list of C&C servers:
  • eevootii.su

  • queiries.su

  • wahemah.cc

The geography of the destination IP addresses is also wider (US, Netherlands, Ukraine).

The 'Inject Pack' lives at /files/hidden7710777.jpg. This location is different from the one hard-coded within the malware mini-stub: /files/hidden7770777.jpg, so let's fetch both.

The downloaded file is an encrypted/compressed binary file that starts with the signature 0x11223344. Following the signature, the next DWORD is a set of flags that specifies whether the file is encrypted (flag 1) and/or compressed (flag 2). In our case the file is both encrypted and compressed, as the 4-byte field is set to 0x00000003 (1 + 2).
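A small sketch of how the first two header fields might be checked before attempting decryption. The names are hypothetical, and the sketch assumes both DWORDs are stored little-endian, as is usual for x86 code; the byte order was not stated explicitly above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACK_SIGNATURE   0x11223344u
#define PACK_ENCRYPTED   0x1u   /* flag 1: payload is encrypted  */
#define PACK_COMPRESSED  0x2u   /* flag 2: payload is compressed */

/* Read a 32-bit value stored least significant byte first (assumed order). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and stores the flags DWORD if data starts with the signature;
 * returns -1 if the buffer is too short or the signature does not match. */
int parse_pack_flags(const uint8_t *data, size_t len, uint32_t *flags)
{
    assert(data != NULL && flags != NULL);
    if (len < 8 || read_le32(data) != PACK_SIGNATURE)
        return -1;
    *flags = read_le32(data + 4);
    return 0;
}
```

A flags value of 3 has both PACK_ENCRYPTED and PACK_COMPRESSED set, so such a file needs decrypting first and decompressing second.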

A WORD at the file offset 0x0C contains a checksum of the file (we won't replicate the hash calculation logic as we trust the file we download is authentic and not corrupted), and the next DWORD specifies an encryption key that the file is encrypted with.

The decryption function can be reconstructed as:
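#include <windows.h>

/* XOR-decodes dwSize bytes of abyBuffer in place, advancing *lpdwKey.
   Returns the quotient of the last key update (0 if nothing was decoded). */
unsigned int DecodeBuffer(LPDWORD lpdwKey, BYTE *abyBuffer, unsigned int dwSize)
{
   unsigned int i;
   unsigned int result = 0;   /* defined even when the loop does not run */

   if (!lpdwKey || !abyBuffer)
      return result;

   for (i = 0; i < dwSize; ++i)
   {
      /* XOR the next byte with the first byte of the key
         (the low-order byte on little-endian x86) */
      abyBuffer[i] ^= *(BYTE *)lpdwKey;

      /* step the key: key = (845 * key + 577) mod 0xFFFFFFFF */
      result = (845 * *lpdwKey + 577) / 0xFFFFFFFFu;
      *lpdwKey = (845 * *lpdwKey + 577) % 0xFFFFFFFFu;
   }
   return result;
}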

The actual encrypted bytes start at offset 0x1a within the file.
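Since the routine is a plain XOR stream driven by a stepping key, applying it twice with the same starting key restores the original bytes. The portable sketch below illustrates this with fixed-width types. The name xor_stream is hypothetical, and the key update mirrors the listing as written, on the assumption that the multiplication is carried out in 32-bit unsigned arithmetic before the reduction modulo 0xFFFFFFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR each byte with the low byte of the key, then step the key.
 * Assumption: 32-bit unsigned arithmetic, as in the listing above. */
void xor_stream(uint32_t *key, uint8_t *buf, size_t len)
{
    assert(key != NULL);
    if (buf == NULL)
        return;
    for (size_t i = 0; i < len; ++i) {
        buf[i] ^= (uint8_t)(*key & 0xFFu);
        *key = (845u * *key + 577u) % 0xFFFFFFFFu;
    }
}
```

To decrypt a downloaded file, such a function would be called on the bytes from offset 0x1a onwards with the key taken from the header. Because the key is updated in place, a second pass over the same data must start again from a fresh copy of the original key.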

After the 'Inject Pack' is decrypted, it is decompressed with zlib v1.2.3. Shylock evidently reuses the zlib source code, as we see a 100% match with the zlib open source project. One fast way to decompress the decrypted file at this point is to save the decrypted buffer as a .gz file and then extract it with the 7-Zip utility.

The decompressed 'Inject Pack' has a binary header that specifies other text files underneath, such as az_sooba.txt, az.txt, cc.txt, chat_chagas.txt, chat_phone_replace.txt, chat_sooba.txt, but at this point it is perfectly readable with a text editor.

The inclusion of a number of Italian banks in its logic does not look good.

Putting it all together

The entire encryption/decryption logic of Shylock used during its communication with the command-and-control server was fully reverse-engineered and then closely replicated in a stand-alone tool.

We are releasing the tool along with its source code in the hope that it will help researchers to query Shylock's command-and-control servers both for configuration files and for 'Inject Packs', in order to learn what new servers are being added, and what new banks are being targeted. We are hoping that such early discovery will help both security researchers and the banks to be better prepared for the new tricks that must surely be up Shylock's sleeves. Early identification of new C&C domains will also help network administrators to detect Shylock traffic within their networks and act to block access from any infected hosts.