Decoding Ebury Malware SSH Commands

Summary

The Ebury malware family is one component of a larger multi-malware operation that has been running for more than 10 years. ESET has researched it for over a decade and code-named it "Operation Windigo". The operation involves multiple malware families: Linux/Ebury, Linux/Cdorked, and Perl/Calfbot.

At its core, the endgame is to compromise and rootkit SSH servers in order to steal credentials, cryptocurrency wallets, and credit card details.

There are many published whitepapers and blog articles documenting the myriad moving parts that make up this highly sophisticated infrastructure. My focus on Ebury was prompted by ESET Research’s latest whitepaper, “Ebury is alive but unseen: 400k Linux servers compromised for cryptocurrency theft and financial gain”.

References

To learn more about Ebury, you can read several of ESET’s posts from their research that I have linked below:

The big picture - What is Ebury all about?

The Ebury malware is an OpenSSH backdoor that is implanted on compromised SSH servers as a shared library to take control of the SSH server. This allows the adversary to maintain access to the server and capture credentials for all authentications to that server, as well as any outgoing SSH authentications performed from the server.

source: https://web-assets.esetstatic.com/wls/en/papers/white-papers/ebury-is-alive-but-unseen.pdf

The focus of this post

This analysis focuses on how Ebury decodes the hexadecimal values that the Ebury operator sends to infected SSH servers.

Two examples of what this would look like in an SSH request on the wire would be:

SSH-2.0-e549a94da47eeb7fa02ebf50f27eaa (earlier than version 1.7)
SSH-2.0-HZs1/gwk9n2hou48Z280Hg== (version 1.7+)

My goal was to understand how this value, which is the command sent to the infected server, is decoded, using the earlier version as the starting point. ESET’s published papers do not appear to provide a way to decode it.
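To make the two wire formats concrete, here is a minimal sketch that sorts an SSH identification string into the hex variant, the base64 variant, or neither. The function name classify_banner and the matching rules are my own assumptions for illustration, not something taken from the Ebury binary, and the heuristic will flag some legitimate clients whose version string happens to be valid hex or base64.

```python
import base64
import binascii
import re

PREFIX = "SSH-2.0-"


def classify_banner(banner: str) -> str:
    """Return 'hex', 'base64', or 'other' for an SSH identification string.

    Heuristic only (an assumption for illustration): it checks the shape of
    the text after the 'SSH-2.0-' prefix and nothing else.
    """
    if not banner.startswith(PREFIX):
        return "other"
    payload = banner[len(PREFIX):].strip()
    if not payload:
        return "other"
    # Check hex first, because a hex string can also parse as base64
    if len(payload) % 2 == 0 and re.fullmatch(r"[0-9a-f]+", payload):
        return "hex"
    try:
        # validate=True rejects characters outside the base64 alphabet
        base64.b64decode(payload, validate=True)
        return "base64"
    except (binascii.Error, ValueError):
        return "other"


for sample in (
    "SSH-2.0-e549a94da47eeb7fa02ebf50f27eaa",
    "SSH-2.0-HZs1/gwk9n2hou48Z280Hg==",
    "SSH-2.0-OpenSSH_8.9p1",
):
    print(sample, "->", classify_banner(sample))
```

A filter like this is only a first pass for triaging logs; a match still needs the operator IP before the hex variant can be decoded.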

BLUF

Ebury uses the operator’s source IP address as the seed for an XOR key that decrypts the C2 commands
How the Ebury malware resolves its encrypted strings is now understood
A decoder for the encoded command strings (hex variant) has been created

Statically decoding strings for the 1.6 version of the Ebury shared library

I turned to the older 1.6 version, which predates the switch to an obfuscation technique that decrypts Elf64_Dyn structures at runtime. This version instead uses a multi-key XOR scheme to decrypt strings, which are then passed to dlsym to resolve functions dynamically.

Sample

Ebury_v1_6_2fp.zip (MD5: 470A3F33603DDFFFE3FE21488A2AA5D2)

In this version we can quickly locate where string decryption starts by following the very beginning of the initialization, that is, the code executed when the shared library is loaded.

After analyzing this, I wrote some IDAPython code to XOR-decode the memory range specified by the bytes_start and bytes_end variables that I marked up above.

  
from ida_bytes import get_dword, patch_dword

xorkey1 = 0x151AC5FA
xorkey2 = 0xF9E1DCDD

bytes_start = 0x207898  # Starting address
bytes_end = 0x208468  # Ending address

current_address = bytes_start

while current_address < bytes_end:
    # Read the current dword from the address
    current_dword = get_dword(current_address)
    
    # XOR with the first key and patch the memory
    patched_dword = current_dword ^ xorkey1
    patch_dword(current_address, patched_dword)
    
    # Move to the next dword and check if we reached the end
    current_address += 4
    if current_address >= bytes_end:
        break
    
    # Read next dword for XOR operation with the second key
    next_dword = get_dword(current_address)
    
    # XOR with the second key and patch the memory
    patched_next_dword = next_dword ^ xorkey2
    patch_dword(current_address, patched_next_dword)
    
    # Move to the next dword for the next iteration
    current_address += 4

print('XOR decoding completed.')

This decodes the bytes in place within the IDA Pro database, revealing the hidden strings so that we can dig deeper into the binary.
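The same alternating-key XOR can be sketched outside of IDA for anyone who wants to decode the region from a raw byte dump. This is a minimal sketch under a few assumptions: the bytes are read as little-endian dwords (as on a little-endian target), the caller has already extracted the bytes that correspond to the encoded range, and the function name xor_decode_dwords is hypothetical. Mapping the virtual addresses to file offsets is not shown.

```python
import struct

XORKEY1 = 0x151AC5FA
XORKEY2 = 0xF9E1DCDD


def xor_decode_dwords(blob: bytes, keys=(XORKEY1, XORKEY2)) -> bytes:
    """XOR consecutive little-endian dwords with alternating keys."""
    if len(blob) % 4:
        raise ValueError("blob length must be a multiple of 4")
    out = bytearray()
    for index, (dword,) in enumerate(struct.iter_unpack("<I", blob)):
        # Even-indexed dwords use the first key, odd-indexed use the second
        out += struct.pack("<I", dword ^ keys[index % len(keys)])
    return bytes(out)


# Build a tiny encoded sample, then decode it again
encoded = struct.pack("<II", 0x41414141 ^ XORKEY1, 0x42424242 ^ XORKEY2)
print(xor_decode_dwords(encoded))  # b'AAAABBBB'
```

Because XOR is its own inverse, running the function twice over the same buffer returns the original bytes, which is a quick sanity check on the extracted range.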

Before

After

How is the hex value decoded?

The example value used here is SSH-2.0-e549a94da47eeb7fa02ebf50f27eaa, paired with the Ebury operator IP 141.255.166[.]187 (listed in ESET’s 2024 whitepaper as a current operator IP).

Now that the strings have been resolved, I located the SSH client version string pattern to further understand how this is being handled by the backdoor.

The backdoor is doing two things here:

It will capture the version string presented by the client (variable s)
It will capture the source IP address of the client that is connecting to the infected server

The connecting client’s IP address is transformed into the decryption key. Once decoded, the hex bytes in the SSH version string reveal which C2 command the operator is issuing to the backdoor running on the infected server.

Decoding script

After analyzing the assembly, I was able to reproduce the way the backdoor performs the decryption. The script I wrote is below.

  
import socket
import struct

def add_to_byte(number, byte_index, add_value):
    # Isolate the byte to be modified
    byte_to_modify = (number >> (byte_index * 8)) & 0xFF
    
    # Add the value to the byte
    modified_byte = (byte_to_modify + add_value) & 0xFF
    
    # Clear the original byte in the number and insert the modified byte
    number = number & ~(0xFF << (byte_index * 8))  # Clear the byte
    number = number | (modified_byte << (byte_index * 8))  # Insert the modified byte
    
    return number

def reverse_hex_bytes(hex_number):
    # Convert to hex string, remove the '0x' prefix, and ensure it's padded to represent full bytes
    hex_str = hex(hex_number)[2:].zfill(8)
    
    # Reverse the hex string by bytes (2 characters represent 1 byte in hex)
    reversed_hex_str = ''.join(reversed([hex_str[i:i+2] for i in range(0, len(hex_str), 2)]))
    
    # Convert the reversed hex string back to a number
    return int(reversed_hex_str, 16)

def decode(ip_address, data):
    binary_ip = socket.inet_aton(ip_address)
    numerical_value = struct.unpack("!I", binary_ip)[0]

    # swap endianness
    key = struct.unpack("<I", struct.pack(">I", numerical_value))[0]
    key ^= 0x1010101

    # Apply the operations to each byte
    key = add_to_byte(key, 0, 5)  # Add 5 to the least significant byte
    key = add_to_byte(key, 1, 33)  # Add 33 to the next byte
    key = add_to_byte(key, 2, 55)  # Add 55 to the next byte
    key = add_to_byte(key, 3, 78)  # Add 78 to the most significant byte
    key = reverse_hex_bytes(key)
    
    # Fixed-width conversion keeps any leading zero bytes in the 4-byte key
    key_bytes = key.to_bytes(4, 'big')
    data_bytes = bytes.fromhex(data)
    # XOR each data byte with the repeating 4-byte key
    decoded_bytes = bytes(d ^ key_bytes[i % len(key_bytes)] for i, d in enumerate(data_bytes))
    
    return decoded_bytes

# decode the SSH client string
result = decode('141.255.166.187', 'e549a94da47eeb7fa02ebf50f27eaa')
print(result.decode('utf-8'))


OUTPUT:
tVwE5a5w11aXcat

With this information, we can now reveal what instructions the Ebury operators are invoking against victims. The command seen above is Xcat and we can look at a known table of commands below to know what operation it is performing.
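The byte swapping in the script above cancels out, so the key derivation can also be read as a per-octet operation: XOR each octet of the IP with 0x01, then add 5, 33, 55, and 78 respectively (modulo 256). The sketch below is my own condensed restatement of that script rather than a translation of the assembly. The names derive_key and xor_with_key are hypothetical.

```python
import socket

ADDENDS = (5, 33, 55, 78)  # per-octet additions taken from the script above


def derive_key(ip_address: str) -> bytes:
    """Derive the 4-byte XOR key from a dotted-quad IPv4 address."""
    octets = socket.inet_aton(ip_address)
    return bytes(((o ^ 0x01) + a) & 0xFF for o, a in zip(octets, ADDENDS))


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; encodes and decodes alike."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = derive_key("141.255.166.187")
decoded = xor_with_key(bytes.fromhex("e549a94da47eeb7fa02ebf50f27eaa"), key)
print(key.hex())         # 911fde08
print(decoded.decode())  # tVwE5a5w11aXcat

# XOR is symmetric, so the same routine re-encodes the plaintext
print(xor_with_key(decoded, key).hex())  # e549a94da47eeb7fa02ebf50f27eaa
```

Since the key depends only on the source IP, the same encoded command sent from a different operator address would decode to garbage, which is why knowing the operator IP is a precondition for reading these commands.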

The base64 variant

The 1.7+ versions of Ebury add more complexity to the backdoor shared library. They decode a table of Elf64_Dyn structures at runtime, which needs to be reversed before we can look at how the base64 value is decoded and build a matching decoder. What was learned from the version 1.6 sample will be compared against version 1.7 so that both variants can be decoded. A further goal is to target the DGA used by the backdoor.

Challenges

Through this process I developed some tooling to better understand what an Ebury operator is attempting to perform against a victim (e.g. checking Ebury versions, extracting credentials, etc.). The progress here also brings us closer to understanding the DGA, which could be used to monitor all possible domains that may get registered.

This could allow us to identify victims when we see traffic reaching out to those domains. We could also potentially find a way to send a specially crafted request to an SSH server to elicit a response that indicates whether it has been infected or not (in the event we create an “Ebury scanner” of sorts).

The hardest problem to crack continues to be the identification of Ebury operator IP addresses. Finding active ones is very difficult and requires control of, and visibility into, the traffic of infected SSH servers. I suspect the ESET team has put in work over the years to set up lures or otherwise gain access to compromised servers.