44CON 2014 - Meterpreter Internals, OJ Reeves
Everyone has heard of Metasploit, the Open Source exploitation framework, and most have probably come into contact with it on the attacking and/or receiving end. Meterpreter, Metasploit’s most frequently used payload for Windows systems, enables a tester who has gained control of one machine to perform further exploitation, pivoting and penetration with relative ease. But how does Meterpreter work? What goes on ‘under the hood’ when certain commands are executed? How does it avoid touching the disk and survive happily in memory? How does it hide from the operating system, and how could you locate it if it’s running? Let’s dive into the plumbing that makes Meterpreter tick. I will explain in relative detail its lifecycle, along with some of the ins and outs of topics such as Reflective DLL Injection and Migration. Bring your low-level knowledge and interest in technical details as we pop the hood of one of the most loved parts of Metasploit.
2. Goals
Dispel some Meterpreter myths …
… expose the innards …
… encourage you to dive in!
3. Agenda
• What is Meterpreter?
o VERY brief overview and history
• What is it made of?
o Components, code, communications
• How does migration work?
• Questions
4. What is Meterpreter?
Shells are great, but we need more …
… enter the “Meta-Interpreter” …
… a payload, RAT, and post-exploitation tool.
5. What is Meterpreter?
• Multi-platform
o POSIX, Win32, Win64, Python, PHP, Java,
Android … OSX!
• Forensics “friendly”
o In memory
o Encrypted communications
• Much more control
o Stacks of commands
o Dynamically loadable extensions
o Post modules
6. It’s huge!
Can’t possibly cover it all …
… implementations vary across platforms …
… we’ll focus on Windows x86 Native Meterpreter.
19. Stage Construction
• Load metsrv.x86.dll from disk
• Generate a bootstrapper
• Patch metsrv:
o Bootstrapper DOS header
o Comms config (for http/https)
22. Reflective DLL Injection
• Stephen Fewer (legend!)
o Harmony Security
• Mini PE loader
• No host process registration
o Sorry sysinternals!
• Doesn’t touch disk
• Slightly adjusted in MSF
o “Asks” not to be paged to disk
o Extra attach/detach
23. RDI Steps
1. Locate the image in memory
2. Find helpful libraries/functions
o Needed to do more work
3. Prepare memory for new image
4. Process sections
5. Process imported libs/functions
6. Process relocations
7. Call DllMain()
30. Why hash?
• Can’t put strings in PIC
o We don’t know where we are, and we don’t know where the strings are either
• Strings bloat payload size
o Not as important here, but it is elsewhere
• Contain NULLs
o Not important here, but important elsewhere
• Consistent with block_api (later)
41. Step 5
Handle the lack of PIC support
and support relocations
42. Relocation
• For each relocation block entry …
• … for each relocation entry in the block …
• … figure out the relocation offset …
• … patch in the library address value:
o Add DWORD
o Add HIWORD
o Add LOWORD
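The loop on this slide can be sketched in a few lines of Python. This is a rough illustration rather than the loader's actual C: the function name `apply_relocations` and the flat `bytearray` image are assumptions made for the example, while the block layout (an 8-byte header followed by 16-bit entries holding a 4-bit type and a 12-bit offset) is the standard PE base relocation format. Here `delta` stands for the difference between where the image actually landed and its preferred base.

```python
import struct

# Standard PE base relocation types.
IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, nothing to patch
IMAGE_REL_BASED_HIGH = 1      # add the high 16 bits of the delta (HIWORD)
IMAGE_REL_BASED_LOW = 2       # add the low 16 bits of the delta (LOWORD)
IMAGE_REL_BASED_HIGHLOW = 3   # add the full 32-bit delta (DWORD)


def apply_relocations(image: bytearray, reloc: bytes, delta: int) -> None:
    """Patch `image` in place using the relocation blocks in `reloc`."""
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break  # malformed or terminating block
        entry_count = (block_size - 8) // 2
        for i in range(entry_count):
            (entry,) = struct.unpack_from("<H", reloc, pos + 8 + 2 * i)
            kind, offset = entry >> 12, entry & 0x0FFF
            target = page_rva + offset
            if kind == IMAGE_REL_BASED_HIGHLOW:
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
            elif kind == IMAGE_REL_BASED_HIGH:
                (value,) = struct.unpack_from("<H", image, target)
                struct.pack_into("<H", image, target, (value + (delta >> 16)) & 0xFFFF)
            elif kind == IMAGE_REL_BASED_LOW:
                (value,) = struct.unpack_from("<H", image, target)
                struct.pack_into("<H", image, target, (value + delta) & 0xFFFF)
            # IMAGE_REL_BASED_ABSOLUTE entries are simply skipped.
        pos += block_size


# Toy image: one DWORD at offset 4 holding an absolute address.
image = bytearray(16)
struct.pack_into("<I", image, 4, 0x10001234)
# One block for page 0: header (8 bytes), a HIGHLOW entry and a padding entry.
reloc = struct.pack("<IIHH", 0, 12, (IMAGE_REL_BASED_HIGHLOW << 12) | 4, 0)
apply_relocations(image, reloc, delta=0x00200000)
patched = struct.unpack_from("<I", image, 4)[0]
print(hex(patched))  # 0x10201234
```

The toy image and the delta value are invented for the example; a real loader reads the relocation directory out of the image headers instead of being handed it directly.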
47. Metsrv Startup
Finally… DllMain!
• Server thread created
• Comms taken over & encrypted
• Scheduler initialised
• Dispatch loop executes
49. Not quite!
But we’re really close!
Metsrv is running, but we
have no commands!
52. stdapi and priv
• Extensions to meterpreter
• Stdapi provides the “guts”
o Execution, shells, uploads/downloads, etc
• Priv gives us the ability to elevate
o Getsystem
• Both immediately uploaded &
reflectively loaded
57. Yes!
We have a fully functional
Meterpreter session!
58. How does it feel?
http://securityreactions.tumblr.com/post/93792005074/how-i-felt-when-i-got-my-first-meterpreter-session
59. Migration
• My favourite feature
• “Jumping” across process boundaries
• Doesn’t drop connectivity
• Helps avoid processes that:
o Are likely to crash
o Are likely to be closed
• Helps maintain sessions!
69. Migration in Metasploit
1. Check process exists, isn’t “me” and
we have permissions to touch
2. Get target process architecture
3. Generate a new migration payload
4. Send command to Meterpreter
5. Wait for migration to finish
6. Reload previously loaded extensions
74. Type, Length, Value
• Type – actually both type and identifier
o String, integer, binary, etc
o ID which says “which integer” (eg. PID)
• Length – size of the data
o Integer – 4 bytes
o String – ASCII string length
• Value – the data itself
o Byte blob of “Length” bytes
• Packet = Header + TLV + TLV + TLV …
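To make the slide concrete, here is a minimal Python sketch of encoding and decoding TLVs. It follows the slide's description literally (type, then length, then value, with the length covering only the value), and that is an assumption: the real wire format may order the header fields differently or count the header in the length, so check the source before relying on it. The meta-type bits, the two identifiers and the big-endian byte order are also placeholders picked for the example.

```python
import struct

# Meta types live in the high bits, the identifier in the low bits.
# These particular numbers are placeholders chosen for the example.
META_STRING = 1 << 16
META_UINT = 1 << 17

TLV_EXAMPLE_PID = META_UINT | 1     # hypothetical "which integer" id
TLV_EXAMPLE_NAME = META_STRING | 2  # hypothetical "which string" id

HEADER = struct.Struct(">II")  # type, then length (assumed order and endianness)


def encode_tlv(tlv_type: int, value) -> bytes:
    """Encode one TLV; the meta type decides how the value is serialised."""
    if tlv_type & META_UINT:
        data = struct.pack(">I", value)   # integers are always 4 bytes
    elif tlv_type & META_STRING:
        data = value.encode("ascii")      # length is the ASCII string length
    else:
        data = bytes(value)
    return HEADER.pack(tlv_type, len(data)) + data


def decode_tlvs(blob: bytes):
    """Walk a run of back-to-back TLVs and return (type, value) pairs."""
    out = []
    pos = 0
    while pos + HEADER.size <= len(blob):
        tlv_type, length = HEADER.unpack_from(blob, pos)
        start = pos + HEADER.size
        data = blob[start:start + length]
        if tlv_type & META_UINT:
            out.append((tlv_type, struct.unpack(">I", data)[0]))
        elif tlv_type & META_STRING:
            out.append((tlv_type, data.decode("ascii")))
        else:
            out.append((tlv_type, data))
        pos = start + length
    return out


body = encode_tlv(TLV_EXAMPLE_PID, 3548) + encode_tlv(TLV_EXAMPLE_NAME, "explorer.exe")
decoded = decode_tlvs(body)
```

The packet header that precedes the TLVs is left out here, since the slide does not describe its layout.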
77. Migration in Meterpreter
1. Read all the data from the TLVs
2. Create synchronisation primitive
3. Prepare the target process memory
4. Hand over control
o Thread creation/hijacking and RDI
5. Shut down current Meterpreter
79. Migrate Context
Used for synchronisation
Force 8-byte size
Pointer to metsrv payload
Duplicated socket info
91. Migration Completes!
• The RDI stub is invoked
• Metsrv is reflectively loaded
• The rest is history …
92. The “links” Slide
• https://github.com/rapid7/meterpreter
• https://github.com/rapid7/metasploit-framework
• http://buffered.io/
• #metasploit on Freenode
• http://rapid7.com/ (No, I don’t work for them!)
• http://beyondbinary.io/
I look forward to your PRs!
The native Meterpreter is often treated like it’s the ugly stepchild of MSF, when in fact everyone knows that’s the Java version. It’s not that bad and it’s getting better.
I’m going to dive into some of the innards (probably the “good” bits to avoid putting you off) and show you that it could be a whole lot worse.
Point you at the repo, and hope you feel like contributing.
I guarantee that you’ll be sick of Leonardo DiCaprio’s face by the time we’re done.
Most people know what it is and where it came from so I’ll keep this nice and brief.
We’ll see the major bits and pieces of what it is, how it lives and dies and what the comms is like with MSF
Dive into migration (fav feature). I wanted to cover others, but I just don’t have the time to do it.
Hopefully time for lots of questions at the end.
We all love shells. Shells are awesome. However, while awesome, they’re often not enough. It’s possible to do an immense number of things with standard shells, but it can be cumbersome, slow and error-prone. We want the ability to hide in memory and run inside other processes instead of launching a /bin/sh or cmd.exe. We want to add extensibility. We want the means to do post exploitation tasks more easily without the manual steps of copying and pasting data/files/etc. We want easy priv esc. We want persistence. We want pivoting, routing, port forwarding, and lots, lots more.
This is where Meterpreter, the “Meta interpreter” comes in. It aims to solve these problems while being as forensically silent as it can be. (except when it’s not).
Meterpreter comes in a stack of flavours. <list the ones above> OSX is coming soon, we’re just ironing out the last of the issues with the contribution and we should see it land in master hopefully in the coming weeks.
Meterpreter does its best to avoid stuff like writing to disk, or doing communications in the clear. It tries to live in memory and takes steps to prevent itself from being seen, which we will dive into a bit later on.
This thing lets you do all those things we want, and more. There are a plethora of commands available, plenty of very handy extensions and bucket loads of post modules that make use of a meterpreter session.
The examples we’ll see in this talk will all revolve around X86 native meterpreter on Windows. The reasons:
ETOOMANYMETERPRETERS
X86 is simpler to read and talk about
The most feature-rich
It might have something to do with me not knowing much about the others
Most things that I talk about translate directly to x64 land, just with different shellcode/binaries.
Some stuff maps to POSIX.
As far as function goes, the other Meterpreters provide a subset of functionality but are very different beasts.
C and C++ make up the core of the Windows meterpreter binary. Posix contains just C code at the moment though there’s nothing stopping that from changing down the track.
ASM lives in MSF and in Meterpreter. It’s used in migration stubs, thread transition stubs and in the payloads (such as reverse_tcp).
To be fair, MSF is the Ruby part. There’s no Ruby in Meterpreter. However, given that we’re looking at the entire Meterpreter ecosystem it’s worth mentioning Ruby.
To aid in understanding I didn’t just want to throw code in your faces without having any context. For the next few slides we’ll be walking through a scenario that we are all probably familiar with, and that’s getting a Meterpreter session using a reverse_tcp payload. Along the way, we’ll dig into the process flow and some bits of code to see how it all fits together and how the individual pieces work.
We start with the stock standard configuration for a reverse_tcp payload inside the much loved MS08-067. I felt ok running with this exploit because the focus here is on Meterpreter and not exploitation. If I had been here talking about exploiting stuff in 2014 there’s no way I’d have had the gall to use this exploit.
Once our exploit is set up and ready to go…
We kick it off. The first thing that happens is that MSF launches a listener on the IP:port that we configured in our payload. This is obviously in place to wait for the connect-back that will happen when our exploit is successful.
The next part shows this exploit fingerprinting the target service so that it can select the right shellcode. This doesn’t necessarily happen with every exploit though.
This is where the fun starts. MSF has generated a full payload and has sent it to the target and is sitting here waiting for a connect back.
To understand where we’re at, and what is about to happen..
With a meterpreter payload, the stage that is sent to the target is a copy of metsrv.x86.dll. This component is the “core” of meterpreter. It provides a minimal shell of functionality that has a few simple goals: get loaded into memory without hitting disk, take over communication with MSF, and encrypt all communications from this point onwards.
The first thing to do is load metsrv from disk into memory. Once loaded, the binary is scanned for a function called “ReflectiveLoader” (which we’ll go into soon) and the offset to this function is retained.
A bootstrapper is then generated. This bootstrapper is responsible for making sure that metsrv kicks off properly.
Metsrv is then patched in two ways. First, the DOS header is overwritten with the bootstrapper code. Then, in the case of HTTP and HTTPS payloads, details of the callback URI are updated as well. Given that we’re using TCP, this doesn’t apply in our scenario.
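As a rough sketch of the DOS header patching step, the Python below overwrites the start of a DLL with a stub. Everything named here is hypothetical: `build_bootstrap`, `patch_stage`, the `????` placeholder and the filler bytes are stand-ins, and the real bootstrapper generated by MSF is not reproduced. The sketch also assumes two constraints: the stub keeps the leading “MZ” bytes, because the loader searches for the DOS signature, and it stops short of offset 0x3C, where the DOS header stores the pointer to the NT headers (e_lfanew).

```python
import struct

E_LFANEW_OFFSET = 0x3C  # last field of the 64-byte DOS header


def build_bootstrap(template: bytes, loader_offset: int) -> bytes:
    """Fill the 4-byte '????' placeholder with the ReflectiveLoader offset."""
    marker = template.find(b"????")
    if marker < 0:
        raise ValueError("template has no offset placeholder")
    return template[:marker] + struct.pack("<I", loader_offset) + template[marker + 4:]


def patch_stage(dll: bytes, bootstrap: bytes) -> bytes:
    """Return a copy of `dll` with `bootstrap` written over the DOS header."""
    if dll[:2] != b"MZ" or bootstrap[:2] != b"MZ":
        raise ValueError("both the image and the stub must start with MZ")
    if len(bootstrap) > E_LFANEW_OFFSET:
        raise ValueError("bootstrap would clobber e_lfanew")
    stage = bytearray(dll)
    stage[:len(bootstrap)] = bootstrap
    return bytes(stage)


# Fake 256-byte "DLL": just a DOS signature and an e_lfanew value.
dll = bytearray(0x100)
dll[0:2] = b"MZ"
struct.pack_into("<I", dll, E_LFANEW_OFFSET, 0x80)

# Placeholder stub: NOP filler (0x90) around the offset slot, not real code.
template = b"MZ" + b"\x90" * 6 + b"????" + b"\x90" * 4
stage = patch_stage(bytes(dll), build_bootstrap(template, 0x1234))
```

The offset 0x1234 is made up; in the talk's flow it is the value found when the binary is scanned for the ReflectiveLoader function.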
The bootstrapper code is actually quite simple and is worth looking at. Here’s what it looks like in ASM (I thought I’d save you, and me, the pain of hex).
The first thing the bootstrapper does is figure out where it lives in memory so that it has a reference point from which to call/jump.
The next thing it does is locate the address of the ReflectiveLoader function based on the current location in memory and the offset that was specified via MSF.
Here we can see question marks, however this value will have been modified by MSF prior to being sent to the target. Once that location has been calculated, the function is invoked. But what is this ReflectiveLoader function and what does it do?
ReflectiveLoader is a function that is part of the Reflective DLL Injection functionality invented by Stephen Fewer.
MSF has its own fork.
RDI needs to make the current “image” a functioning one. That means effectively loading the PE manually. This means that it needs to:
Find out where the image actually lives. Without this we have no base from which to start.
Then locate some libraries and functions that will already be loaded into the address space of the currently running process. Without these functions we can’t really do much at all.
Move the image to a new area of memory which is RWX, big enough for each section’s requirements. Copy over image headers, etc.
Then wire those sections into the right locations.
Find all the required libraries and functions and wire them into the IAT.
Go through the image and fix up relocations. PE files don’t have PIC, instead they have a preferred base and fixed function calls based on that. Windows will fix up these addresses when the image is loaded in the case where the image isn’t loaded at the preferred address. Given that the OS isn’t involved here, RDI needs to do this manually.
Finally, we invoke the DLL entry point to kick it off.
Brace yourself, you’re about to be bombarded with C.
I won’t go line by line here, but I will go block by block. RDI is pretty amazing and is worth a good look.
It all starts with a call to an intrinsic called _ReturnAddress, which returns the address that execution will return to once the current function finishes. This is wrapped up in a small helper called caller(), so what we get back is an address inside ReflectiveLoader itself, and therefore somewhere inside our image.
From here the code iterates backwards through memory, looking first for a DOS signature, and then based on that location searching for a valid NT signature. If both of these values are found, the code assumes the current base library address value points to the correct image base.
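The backwards search can be sketched in Python over a plain byte buffer standing in for process memory. The function name `find_image_base` and the sanity bounds on e_lfanew are assumptions for illustration; the signatures themselves (“MZ” at the start of the DOS header, the e_lfanew field at offset 0x3C, and “PE\0\0” at the start of the NT headers) are standard PE layout.

```python
import struct

DOS_HEADER_SIZE = 0x40
E_LFANEW_OFFSET = 0x3C


def find_image_base(memory: bytes, start: int) -> int:
    """Walk backwards from `start` until a plausible DOS + NT header pair is found."""
    pos = min(start, len(memory) - 2)
    while pos >= 0:
        if memory[pos:pos + 2] == b"MZ" and pos + DOS_HEADER_SIZE <= len(memory):
            (e_lfanew,) = struct.unpack_from("<I", memory, pos + E_LFANEW_OFFSET)
            nt = pos + e_lfanew
            # Assumed sanity check: the NT headers sit shortly after the DOS header.
            if DOS_HEADER_SIZE <= e_lfanew < 1024 and memory[nt:nt + 4] == b"PE\x00\x00":
                return pos
        pos -= 1
    return -1


# Fake image: DOS signature, e_lfanew = 0x80, NT signature at 0x80.
image = bytearray(0x200)
image[0:2] = b"MZ"
struct.pack_into("<I", image, E_LFANEW_OFFSET, 0x80)
image[0x80:0x84] = b"PE\x00\x00"

memory = b"\x90" * 16 + bytes(image)  # 16 bytes of unrelated data first
base = find_image_base(memory, start=16 + 0x150)
print(base)  # 16
```

Checking both signatures matters because two stray bytes that happen to spell “MZ” are easy to come across, whereas a matching NT signature at the recorded offset is much less likely to be a coincidence.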
To get access to functions that are already loaded, such as those in Kernel32, we need to jump directly to the process’ PEB. We can find this at …
offset 0x30 from the FS segment register. Inside the PEB we can locate the loaded modules from the PEB_LDR_DATA structure which is located at a known offset from the start of the PEB. This contains a list of modules currently loaded into the process. From this point we can start iterating through them, searching for modules we’re interested in..
The first part is just a preamble that prepares for hashing the module name. We get a pointer to the string and a count of characters.
The second is a simple hashing algorithm that calculates the hash. The “ror” macro is a helper for an intrinsic function call that rotates a DWORD (or QWORD on x64) value right by 13 bits. My understanding is that this algorithm, while simple, doesn’t result in module name hash clashes for “important” DLLs such as kernel32, ntdll, user32 etc. nor does it result in a clash for functions within those modules.
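A small Python sketch of the rotate-and-add idea is below. `hash_name` rotates the running 32-bit hash right by 13 bits and then adds the next byte. `hash_module_name` is an assumption layered on top: it treats the module name as a counted UTF-16 buffer and folds lowercase letters to uppercase so that differently cased names give the same hash. The function names are made up for the example.

```python
def ror13(value: int) -> int:
    """Rotate a 32-bit value right by 13 bits."""
    value &= 0xFFFFFFFF
    return ((value >> 13) | (value << 19)) & 0xFFFFFFFF


def hash_name(name: bytes) -> int:
    """Hash an ASCII function name: rotate first, then add each byte."""
    h = 0
    for byte in name:
        h = (ror13(h) + byte) & 0xFFFFFFFF
    return h


def hash_module_name(name: str) -> int:
    """Hash a module name byte by byte over its UTF-16 form (assumed variant)."""
    h = 0
    for byte in name.encode("utf-16-le"):
        if byte >= ord("a"):
            byte -= 0x20  # fold lowercase to uppercase
        h = (ror13(h) + byte) & 0xFFFFFFFF
    return h


print(hex(hash_name(b"LoadLibraryA")))  # 0xec0e4e8e
```

The loader then only has to carry a handful of 32-bit constants and compare against them, instead of carrying the names themselves.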
Hashing is way better than using inline strings for a few reasons.
Position Independent Code doesn’t like strings. Strings appear in the .data section, which means we would need to locate it, and we’d also need to figure out if we have the right string in the right context. How do we do that? Probably by hashing or something similar. But wait, this is what we’re doing!
Strings are huge.
Strings need NULL termination.
The hashing is consistent with block API. But the biggest point really is the first one where we just want to avoid the extra pain of finding the strings in memory before being able to use them.
The process we’re about to see here is done for NTDLL as well, but we won’t dive into that because it’s just a repeat of this. We check to see if the module’s hash is the same as the pre-determined one for KERNEL32..
.. If it is, we prepare to start parsing the module to find functions within it. We need access to the table of functions exported via the DataDirectory block, so that we can then use it to figure out the location of the array of function names in memory (uiNameArray) as well as the array of function ordinals (uiNameOrdinals).
Why “if” instead of “switch”? Because if we used a switch we’d end up with a jump table, which isn’t PIC!
As part of the loop over the function names..
The first step involves the calculation of the hash of the function name. Here we use an inline function which calculates hashes for NULL-terminated strings. We can use this function here but not on the module name because the module name is a unicode string, and not a NULL-terminated ASCII string.
If the function hash is one that we’re interested in (list has been shortened for clarity)…
.. We figure out the location of the function pointer array for this module, and then index into that array to find the function pointer for the function name that we have just hashed. Then…
.. Based on the function we’ve found, we cast that function to the appropriate type and store that in a local var for later use. The two examples here show LoadLibrary and GetProcAddress getting loaded. There are others that are required as well.
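The lookup just described, where we hash each exported name, use the matching slot in the ordinals array and then index into the function pointer array, can be sketched as follows. Plain Python lists stand in for the three export tables, `find_export` is a hypothetical name, and the addresses are invented.

```python
def ror13_hash(name: bytes) -> int:
    """Rotate-right-by-13 hash over an ASCII name."""
    h = 0
    for byte in name:
        h = (((h >> 13) | (h << 19)) & 0xFFFFFFFF) + byte
        h &= 0xFFFFFFFF
    return h


def find_export(names, ordinals, functions, wanted_hash):
    """Return the function address whose exported name hashes to `wanted_hash`."""
    for i, name in enumerate(names):
        if ror13_hash(name) == wanted_hash:
            # The i-th name maps to the i-th ordinal, which indexes the function array.
            return functions[ordinals[i]]
    return None


# Toy export tables: the names array is parallel to the ordinals array.
names = [b"GetProcAddress", b"LoadLibraryA"]
ordinals = [2, 0]
functions = [0x1000, 0x2000, 0x3000]  # invented addresses

load_library = find_export(names, ordinals, functions, ror13_hash(b"LoadLibraryA"))
get_proc = find_export(names, ordinals, functions, ror13_hash(b"GetProcAddress"))
```

The extra hop through the ordinals array is the part that is easy to get wrong: the position of a name in the names array is not, in general, its position in the function array.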
The current image needs to be properly loaded as a functional PE, which means it needs “room to breathe”. As a result, we need to copy it to a new location which has room for the sections.
First we get a reference to the start of the header for this image.
We then allocate enough memory to store the required image size (not just the number of bytes that make up the image). This will allow us to set up the sections with the space that they need. We also make sure the memory is RWX so that we don’t have any issues.
Next we do something that the original implementation of RDI doesn’t do, and that is “request” that the OS doesn’t page our new area of memory to disk in an effort to avoid appearing in the pagefile during forensic examination.
The next step involves just doing a byte-by-byte copy of the headers over to the new location.
We’ll go over this quickly coz it’s pretty standard stuff… we get a pointer to the start of the first section…
A count of sections… and then for each section…
.. Calculate offsets of both source and destination based on the section information, and byte-by-byte copy to the new location.
This caters for the cases where the sections need more space in memory than they take up on disk.
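A minimal Python sketch of the header and section copy follows. `Section` and `map_image` are hypothetical names, and the tuple fields loosely mirror the virtual address, raw data offset and raw data size found in a PE section header. The destination buffer is allocated at the full image size and starts out zeroed, which is what gives sections their “room to breathe”.

```python
from typing import List, NamedTuple


class Section(NamedTuple):
    virtual_address: int  # where the section lives once mapped
    raw_offset: int       # where its bytes sit in the raw image
    raw_size: int         # how many bytes are stored in the raw image


def map_image(raw: bytes, size_of_headers: int, size_of_image: int,
              sections: List[Section]) -> bytearray:
    """Lay out a raw image the way a loader would, one section at a time."""
    mapped = bytearray(size_of_image)  # zero-filled, like freshly allocated memory
    headers = raw[:size_of_headers]
    mapped[:len(headers)] = headers
    for section in sections:
        data = raw[section.raw_offset:section.raw_offset + section.raw_size]
        end = section.virtual_address + len(data)
        if end > size_of_image:
            raise ValueError("section does not fit in the image")
        mapped[section.virtual_address:end] = data
    return mapped


# Toy raw image: 0x20 bytes of "headers" followed by two 4-byte sections.
raw = b"H" * 0x20 + b"CODE" + b"DATA"
sections = [Section(0x1000, 0x20, 4), Section(0x2000, 0x24, 4)]
mapped = map_image(raw, size_of_headers=0x20, size_of_image=0x3000, sections=sections)
```

In the toy example the raw image is 40 bytes but the mapped one is 0x3000 bytes, with the gaps between sections left as zeros.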
The module will no doubt have a bunch of imports, so we need to wire those up ourselves otherwise calling those functions will result in crashes.
Again using the image header, we get access to the import directory entry…
… and then offset from there to get the virtual address of the import descriptor.
For each import, we attempt to use the LoadLibrary function pointer to load the module into memory based on the name of the import.
If that doesn’t work, we skip to the next one.
Stay with me, we’re nearly there!
For each import we check to see if the import contains a name. If not (i.e. the IMAGE_ORDINAL_FLAG is set) then we…
… do some pointer jiggery pokery to figure out where the function lies and then
… patch that address directly into the import location.
If the name is present, then we can easily get the IMAGE_IMPORT_BY_NAME structure…
.. And use GetProcAddress to load it by name, patching directly into the import.
Done!
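The by-ordinal versus by-name decision can be sketched like this. `FakeLibrary`, `resolve_thunk` and the dictionary of names keyed by RVA are stand-ins invented for the example, with the dictionaries playing the roles of the loaded module's exports and of the IMAGE_IMPORT_BY_NAME entries. The flag value is the standard 32-bit IMAGE_ORDINAL_FLAG, and all addresses are made up.

```python
IMAGE_ORDINAL_FLAG32 = 0x80000000  # top bit set means "import by ordinal"


class FakeLibrary:
    """Stand-in for a loaded module: lookups by name and by ordinal."""

    def __init__(self, by_name, by_ordinal):
        self.by_name = by_name
        self.by_ordinal = by_ordinal


def resolve_thunk(library, thunk, names_by_rva):
    """Return the address that should be patched into the import slot."""
    if thunk & IMAGE_ORDINAL_FLAG32:
        ordinal = thunk & 0xFFFF      # low 16 bits carry the ordinal
        return library.by_ordinal[ordinal]
    name = names_by_rva[thunk]        # otherwise the thunk points at a name entry
    return library.by_name[name]      # the GetProcAddress-style path


library = FakeLibrary(by_name={"ExampleFunctionA": 0x2222},
                      by_ordinal={5: 0x1111})
names_by_rva = {0x3000: "ExampleFunctionA"}
thunks = [IMAGE_ORDINAL_FLAG32 | 5, 0x3000]

# "Patching" here is just building the list of resolved addresses.
import_table = [resolve_thunk(library, thunk, names_by_rva) for thunk in thunks]
```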
Relocations now need to be updated, as there’s a pretty remote chance that we managed to load the binary at the “preferred” address.
There’s quite a bit of code in here. Too much to show on slides.. So instead.. This time I’m just going to summarise the steps.
We’ve finally done the hard work, the last thing we need to do is actually invoke the library…
So, our DLL has finally been loaded, patched, relocated, and invoked. It’s actually running. Once this call has finished, the return value is a pointer to DllMain, and that’s returned in EAX.
So the first thing we do is store this value in EBX to make sure that nothing overwrites it from here.
We then kick off the metasploit specific functionality passing in the socket handle that has been used for communications thus far. This call to DllMain is a BLOCKING call, and it will not return until the underlying function returns. A LOT more stuff happens in this call, and yes, we WILL dive into this a bit. Upon return, this is an indicator that the application should be shut down, and so …
DllMain is called again passing in the value for …
ExitFunc. This value is populated by MSF prior to the payload being sent to the target. This value can be PROCESS/THREAD/SEH and often defaults to PROCESS.
So when making the first call to DllMain with DLL_METASPLOIT_ATTACH, metsrv actually kicks in…
DllMain is finally invoked on a functioning DLL. In metsrv this means that real C code that isn’t PIC can execute as per normal.
Metsrv begins by kicking off a new thread. This thread is responsible for handling commands that are sent to it from MSF. Once that thread is up and running…
… communications are then taken over (by that I mean the socket handle becomes managed by metsrv, not shellcode) and SSL is initialised. With the SSL handshake out of the way, all comms between MSF and Meterpreter are now encrypted with SSL. The OpenSSL version has been updated recently by Todb, so no heartbleeding!
The Meterpreter scheduler is then set up and initialised. The scheduler is responsible for handling multithreaded or async tasks, such as interactive shells.
Finally the dispatch loop executes. In the case of reverse_tcp, this is what listens on the socket for messages coming from MSF. When a message is received, it checks to see whether it should be invoked directly on the dispatch thread (which is required in the case of some important commands such as shutdown and migrate), or whether it should be invoked on the server thread. Once it’s taken appropriate action and the command is executed/handled, the loop continues, and meterpreter waits for the next command.
Other bits and pieces are done that are a little nondescript so I’m just going to ignore them for now. Code for this isn’t particularly interesting so again we’re going to skip this for brevity.
But … METERPRETER IS RUNNING!
When metsrv is running MSF tells us that we have a session ready to go. Yet, this is not quite true. While the session has been established, metsrv hasn’t actually got any commands registered that are useful to the user.
The reason for this is that metsrv is intended to be as small as possible so that secure communications can be established as quickly as possible. From there, extra functionality is pulled in over an encrypted socket and that is wired in.
So, let’s take a look at how this happens.
Each extension needs to define the set of commands that it supports…
Functions are registered with this COMMAND_REQ macro, where the name/id of the message/command is associated with the function that handles the request.
When an extension is loaded, metsrv invokes the InitServerExtension method. In most cases, the extensions simply register all the custom commands that they support with the host metsrv instance.
The command_register_all function takes that list of commands and adds each of them to an internal list of extended commands. This list is the one that is referenced when messages appear from MSF. The message identifier is checked against the registered commands, and if found, the associated function pointer is called so that it can handle that command.
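In spirit, the registration and lookup described above boil down to a table keyed by command name. The Python below is a loose sketch and not the C implementation: `CommandTable`, `register_all`, `dispatch`, `init_extension` and the two `example_*` commands are all hypothetical names.

```python
class CommandTable:
    """Maps command names to the functions that handle them."""

    def __init__(self):
        self._handlers = {}

    def register_all(self, commands):
        """Add every (name, handler) pair an extension supplies."""
        for name, handler in commands:
            self._handlers[name] = handler

    def dispatch(self, name, packet):
        """Call the handler registered for `name`, if there is one."""
        handler = self._handlers.get(name)
        if handler is None:
            return None  # unknown command: nothing registered under that name
        return handler(packet)


# A made-up extension: its init step just registers its commands.
def example_echo(packet):
    return packet


def example_upper(packet):
    return packet.upper()


def init_extension(table):
    table.register_all([("example_echo", example_echo),
                        ("example_upper", example_upper)])


table = CommandTable()
before = table.dispatch("example_upper", "hello")  # nothing loaded yet
init_extension(table)
after = table.dispatch("example_upper", "hello")
```

This also shows why a freshly started metsrv has so little to offer: until an extension registers its commands, lookups for them find nothing.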
Thanks for sticking with me. I hope that’s given you a good idea of the kind of magic that goes on behind the scenes just to get a functioning Meterpreter session.
But we’re not stopping here!
I love migration. It’s amazing. I first stumbled on it during my time in the Offensive Security PWK labs and was rather blown away by the idea that something could leap across process boundaries without severing the connection. Months later I dived into the source and was taken aback by the simple genius of it.
This is why I want to share the details of it with you. I think it’s a feature that we all love, but one that many don’t necessarily understand. Windows is currently the only platform that migration is supported on, however the unrelenting brains of Juan Vazquez (Rapid7) have come through with a method of migrating on Linux using some ptrace trickery. That’s yet to land in the master repo, but watch this space as it should be in there soon.
OK, so what does it do? As I mentioned it allows movement across process boundaries …
… without losing the connectivity. Why is this useful? Well it..
… allows you to avoid certain conditions that might result in a lost session. For example, you might have phished a user and exploited their browser. Exploitation can make browsers unhappy and unstable. The browser might crash. The user might close the browser after realising that they’ve been phished. Either way, when you’re running in the browser you don’t want it to be shut down. Migration lets you move laterally so that when that browser dies, you still have a functioning session on the box.
Sample scenario…
Here we have a meterpreter session running on an XP machine.
We’re currently running under the context of user ‘bob’…
And the process we’re running under is # 3548, which is an instance of internet explorer. As proof…
.. We can look at the process listing using the ‘ps’ command to see that it is indeed internet explorer.
As mentioned before, living in the browser isn’t really a great thing (certainly not IE6 as in this case), so we’d like to move somewhere else. Somewhere that is going to be quite stable, and that will stay alive for the duration of the logged in user’s session. Something like..
… explorer.exe. This is a good choice, because it’s quite likely that the user will have this running the whole time they’re signed in. We can see the PID listed on the left, and so that’s the PID that we want to migrate to.
We ask meterpreter to migrate to this pid using the ‘migrate’ command and passing it the process id. After a small wait, we get greeted with…
… a message that tells us the process of migration has successfully completed. And we can easily verify this by …
… taking a look at the PID of the host process. Fantastic, we’re now living inside explorer.exe and if Internet Explorer crashes, or is shut down, then our session remains intact.
OK, so how does this work. To find out ….
Metasploit gets the easy part of this deal. All it has to do is perform some basic checks to make sure we’re not doing anything stupid such as migrating to something that doesn’t exist or
After attempting to get a reference to the process via the existing meterpreter session, MSF checks to make sure that the process exists (point at it)
That we actually have permissions to move to it (point at it)
And that we aren’t attempting to migrate into the process we already live in. Pretty simple stuff.
To be able to kick off a new instance of meterpreter we need to generate a new metsrv payload with the right “bits” in it. We can’t simply copy the contents of what’s running in memory because:
It might have been modified by an external process
The target architecture might not be the same as the source
Metsrv will probably be running at a new address and require relocation
Imports will probably be loaded into different areas of memory and will need to be fixed up
Easier to just kick off the RDI process again.
MSF checks the architecture of the target and creates a new stager class that has the appropriate RDI included based on architecture. With that set up it…
… creates a new instance of the stager class, points that class’s datastore at the location of the metsrv binary on disk, and then generates the stage payload. This is a block of bytes that’s effectively ready to be sent down the wire.
Next it creates a request packet instance. When the request is created we have to tell it the name of the command that we want Meterpreter to execute. Here it’s ‘core_migrate’. We then add all of the parameters we’re interested in:
Process ID
Size of the migrate payload
The migrate payload itself
And the target process’s architecture
Technically we don’t need to pass that last one in, but we already know it here, so it’s helpful to send that along as well so that Meterpreter doesn’t have to go and figure it out for itself.
But what are all these TLV things? They’re actually pervasive in the communications with Meterpreter, so …
Here are the definitions for the TLV identifiers that we use in migration. It shows that we expect the PID, Length and Architecture to be unsigned integer values and the payload to be a string of chars/bytes.
These definitions MUST MATCH on the Meterpreter side. If they don’t, then bad things happen. So if you look at the source of meterpreter you’ll see a bunch of defines that match these. I won’t show them here, as there’s already enough code on screen.
Back to Ruby land, and now the calls to “add_tlv” should be quite self explanatory. This function knows how to encode those values thanks to the meta types, and it writes each of them to a binary blob behind the scenes. Now that we’re ready…
… we can just send the message to Meterpreter.
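To make the TLV idea concrete, here is a minimal Python sketch of what an “add_tlv”-style encoder might do. It is not the MSF implementation: the numeric identifiers, the 8-byte header (length then type, both big-endian) and the null terminator on the string value are all assumptions made for illustration, and the function names are hypothetical.

```python
import struct

# Meta-type flags live in the high bits of the type field (illustrative values).
TLV_META_TYPE_STRING = 1 << 16
TLV_META_TYPE_UINT = 1 << 17

# Hypothetical identifiers for the four migrate parameters.
TLV_TYPE_MIGRATE_PID = TLV_META_TYPE_UINT | 402
TLV_TYPE_MIGRATE_LEN = TLV_META_TYPE_UINT | 403
TLV_TYPE_MIGRATE_PAYLOAD = TLV_META_TYPE_STRING | 404
TLV_TYPE_MIGRATE_ARCH = TLV_META_TYPE_UINT | 405


def add_tlv(blob: bytearray, tlv_type: int, value) -> None:
    """Append one TLV to blob, picking the encoding from the meta type."""
    if tlv_type & TLV_META_TYPE_UINT:
        data = struct.pack(">I", value)      # 32-bit unsigned, big-endian
    elif tlv_type & TLV_META_TYPE_STRING:
        data = bytes(value) + b"\x00"        # bytes plus a terminator (assumed)
    else:
        raise ValueError("unsupported meta type")
    # Header: total length (header included), then the type.
    blob += struct.pack(">II", 8 + len(data), tlv_type) + data


def build_migrate_request(pid: int, payload: bytes, arch: int) -> bytes:
    """Pack the four migrate parameters in the order the talk lists them."""
    blob = bytearray()
    add_tlv(blob, TLV_TYPE_MIGRATE_PID, pid)
    add_tlv(blob, TLV_TYPE_MIGRATE_LEN, len(payload))
    add_tlv(blob, TLV_TYPE_MIGRATE_PAYLOAD, payload)
    add_tlv(blob, TLV_TYPE_MIGRATE_ARCH, arch)
    return bytes(blob)
```

The point of the meta type is visible in add_tlv: the caller never says how to encode a value, the type identifier carries that information with it.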
Given that we’re producing handles and information that is to be shared in the remote process, we need to be able to copy that over. To do this, we define a structure that contains the …
… event handle and a pointer to the start of the patched metsrv payload (as we saw when we first exploited). We also have a location to store the duplicated socket information that will allow us to take over comms. Notice how these are part of a UNION. For those who don’t know, UNIONS allow us to define variables that actually share the same memory location as other variables. So here, hEvent is at the same location in memory as bPadding1 and lpPayload shares the same memory location as bPadding2. Why are we doing this? Well this allows us …
… to force 8-byte alignment for both of these variables. The compiler has to make sure there is enough space to store the biggest of all the variables in the union, and hence each variable will have 8 bytes of space. If we didn’t do this, HANDLE and LPVOID types would be just 4 bytes on the 32 bit version and 8 bytes on the 64 bit version. This means that the whole structure has a consistent size across both 64 and 32 bit processes, which helps us avoid problems when migrating between these architectures.
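The union trick can be tried out without a C compiler. The sketch below mirrors the idea with Python’s ctypes; the field names follow the ones mentioned above, but the class names and the size of the socket information block (INFO_SIZE) are placeholders I’ve made up, not the real definitions.

```python
import ctypes

INFO_SIZE = 64  # placeholder size for the duplicated socket info (assumption)


class EventSlot(ctypes.Union):
    # The pointer-sized handle shares storage with an 8-byte array,
    # so the union is 8 bytes wide on both 32 and 64 bit builds.
    _fields_ = [("hEvent", ctypes.c_void_p),
                ("bPadding1", ctypes.c_ubyte * 8)]


class PayloadSlot(ctypes.Union):
    _fields_ = [("lpPayload", ctypes.c_void_p),
                ("bPadding2", ctypes.c_ubyte * 8)]


class MigrateContext(ctypes.Structure):
    # Anonymous unions let us write ctx.hEvent rather than ctx.event.hEvent.
    _anonymous_ = ("event", "payload")
    _fields_ = [("event", EventSlot),
                ("payload", PayloadSlot),
                ("info", ctypes.c_ubyte * INFO_SIZE)]


if __name__ == "__main__":
    print("event slot size:", ctypes.sizeof(EventSlot))
    print("payload offset:", MigrateContext.payload.offset)
    print("context size:", ctypes.sizeof(MigrateContext))
```

Whatever the pointer width of the interpreter running this, the two slots stay 8 bytes each, which is the property the migration code relies on.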
As I mentioned before, there are normal commands that run in the context of the server thread and special commands that run in the context of the dispatch thread. Migrate is one of those special cases which needs to be run under the dispatch thread because it needs to be able to tell the dispatcher to stop executing. This is why we use the COMMAND_INLINE_REQ macro to declare the binding between the “core_migrate” id and the function which handles it.
While the function header isn’t particularly interesting I wanted to show it because of the …
… packet parameter. This is the structure that contains all of the TLV values that were passed to the message from MSF and we will be referencing it to extract the migration parameters.
Here’s where Meterpreter pulls all of the values out of the packet ready for use. The C helper functions used to extract the values have to explicitly define the return types (Ruby doesn’t have to do this) and given that C doesn’t support function overloading at all, we have to have a different function for each type. This isn’t so much of a problem though.
We grab the PID, target architecture, payload and its length out of the packet, and move on.
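The “one getter per type” pattern looks roughly like this when sketched in Python. As before, this is an illustration and not the Meterpreter source: the header layout (4-byte length including the header, then 4-byte type, both big-endian), the identifiers and the helper names get_uint and get_bytes are assumptions.

```python
import struct

TLV_META_TYPE_STRING = 1 << 16
TLV_META_TYPE_UINT = 1 << 17

# Hypothetical identifiers, only for this example.
PID_TYPE = TLV_META_TYPE_UINT | 402
PAYLOAD_TYPE = TLV_META_TYPE_STRING | 404


def iter_tlvs(blob: bytes):
    """Yield (type, value bytes) for each TLV found in blob."""
    offset = 0
    while offset + 8 <= len(blob):
        length, tlv_type = struct.unpack_from(">II", blob, offset)
        if length < 8 or offset + length > len(blob):
            raise ValueError("malformed TLV")
        yield tlv_type, blob[offset + 8:offset + length]
        offset += length


def get_uint(blob: bytes, tlv_type: int):
    """Typed getter: return the value as an unsigned int, or None."""
    for found, data in iter_tlvs(blob):
        if found == tlv_type:
            return struct.unpack(">I", data)[0]
    return None


def get_bytes(blob: bytes, tlv_type: int):
    """Typed getter: return the raw value bytes, or None."""
    for found, data in iter_tlvs(blob):
        if found == tlv_type:
            return data
    return None
```

Each getter returns a different type, which is exactly why the C side needs separately named functions.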
Moving onto some actual functionality we can see we make use of the …
… OpenProcess call to get a handle to the process based on its ID. Note the excessive list of flags we have to pass in. We need each of these permissions to migrate, so if the call fails it’s an indication that one of those permissions is not possible to acquire.
If this succeeds we then check to see if Meterpreter is running with …
… a plain TCP connection. If so, this means that we have to do some extra work to keep communications going. Given that HTTP/HTTPS is stateless, a migrated HTTP/HTTPS Meterpreter doesn’t need to do anything special to carry over comms; the new instance just makes the next request in place of the old one. With TCP we want to retain the socket, so we …
… duplicate the socket information so that we can share it with the new process. Note that we are storing this result in the ctx.info member, which is part of the MIGRATE CONTEXT we looked at before.
With the socket stuff out of the way we move on to creating sync primitives. This begins with the …
… creation of an event handle using CreateEvent. This collection of parameters means that we want a manual-reset event that is initially unset. With the handle created, we need to …
… duplicate it. The call to DuplicateHandle lets us specify the source process/handle and target process/handle. Note that the target handle is being stored in the migration context as well.
Next we need to figure out which migration stub we need. This is easily done thanks to already knowing the destination architecture and so we know, in our case, we …
… need to use the 32 bit version. But what is a migration stub? Why do we need it? We’ll dive into that in a bit. Here all we need to know is that there’s a code stub that needs to execute to kick off migration, and this is it.
We attempt to allocate some memory in the target process…
… while making it RWX (point at page_execute_readwrite). This is important because we need to be able to write the code to the process, but we also need to be able to execute the payloads that we write to it as well. Note that the size of the payload is made up of:
Migrate stub length
Size of the migrate context
Payload length
All of these things need to be written so we allocate mem for all in this order. Remember how our migrate context contains a pointer to the start of the metsrv payload that has the reflective loader stub in it? Well, we need to …
… point the lpPayload parameter at this location. We know that this location is going to be offset from the start of the allocated memory. So we start there, add the size of the stub (which gets written first) and the size of the migrate context (which is written second). Then we actually write …
… the contents of each to the allocated memory block. Starting with the migrate stub …
… then with the migrate context…
… and finally the patched metsrv payload that was given to us by MSF.
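The address arithmetic for those three writes can be sketched in a few lines of Python. Nothing here touches a real process: the base address is just a number standing in for whatever the remote allocation returned, and the names RemoteLayout, plan_layout and build_image are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteLayout:
    stub_addr: int       # migrate stub is written first, at the base
    context_addr: int    # migrate context follows the stub
    payload_addr: int    # patched metsrv payload comes last
    total_size: int      # how much memory to allocate in the target


def plan_layout(base: int, stub_len: int, context_size: int,
                payload_len: int) -> RemoteLayout:
    """Work out where each block lands inside one contiguous allocation."""
    context_addr = base + stub_len
    payload_addr = context_addr + context_size
    total = stub_len + context_size + payload_len
    return RemoteLayout(base, context_addr, payload_addr, total)


def build_image(stub: bytes, context: bytes, payload: bytes) -> bytes:
    """Concatenate the blocks in the same order they are written remotely."""
    return stub + context + payload


if __name__ == "__main__":
    layout = plan_layout(0x10000, 32, 80, 1000)
    # payload_addr is the value the context's lpPayload would be set to.
    print(hex(layout.payload_addr), layout.total_size)
```

Note that payload_addr has to be computed before the context is written, because the context itself carries that pointer.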
We’re almost there! All we need to do is kick off a new thread in the target process. Seems simple right? Unfortunately not.
There we go. Clear? …
… Kidding. Let’s take a look at the ASM.
Don’t be scared or put off by the ASM. This is a little easier to digest than the stuff we looked at before. The first thing we do is …
… store the migrate context pointer in the esi register, as this gives us the ability to offset easily into the structure to access the handles/pointers we need. We then …
… jump (via a call) to the ‘start’ label which is past the block API. Now, the …
… block API include is a stack of assembler that allows us to invoke API functions without knowing where they are, nor using null-terminated strings to find them. This block API is another gem from Stephen Fewer. It means that we can push parameters onto the stack, then push a HASH value which identifies the function and the library it lives in, then call the API and it will resolve the function for us and invoke it. It also sets up the stack so that the function call returns control directly back to us. Super clever stuff. So once the CALL instruction has been executed, we land …
… after the start label and pop the address of the block API “API call” functionality into the base pointer register. This allows us to use ebp to call the block api whenever we need it. Next we …
… invoke the block API and tell it to call the LoadLibrary function from Kernel32, giving it the string ‘ws2_32’ which is the name of the Windows socket library. We then call the …
… block API again and this time tell it to initialise the windows socket library by calling the WSAStartup function that lives in the library we loaded in the previous step. Windows sockets need to be initialised before they can be used, and given that we don’t know if the target process has initialised the socket lib already, we do it again to be sure. Doing it twice doesn’t cause any problems.
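For the curious, here is a rough Python sketch of how a hash identifying both a library and a function could be computed. As far as I understand the block API, it uses a rotate-right-by-13 additive hash over the upper-cased UTF-16 module name and the ASCII function name, each including its terminator, and adds the two together. Treat those details as assumptions; the function names here are my own.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by the given number of bits."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(data: bytes) -> int:
    """Rotate the running hash right by 13, then add the next byte."""
    result = 0
    for byte in data:
        result = (ror32(result, 13) + byte) & 0xFFFFFFFF
    return result


def block_api_hash(module: str, function: str) -> int:
    """Combine a module-name hash and a function-name hash into one value."""
    # Module name: upper-cased, UTF-16LE, with a terminator (assumed).
    module_bytes = (module.upper() + "\x00").encode("utf-16-le")
    # Function name: plain ASCII, with a terminator (assumed).
    function_bytes = (function + "\x00").encode("ascii")
    return (ror13_hash(module_bytes) + ror13_hash(function_bytes)) & 0xFFFFFFFF


if __name__ == "__main__":
    for module, function in (("kernel32.dll", "LoadLibraryA"),
                             ("ws2_32.dll", "WSAStartup")):
        print(f"{module}!{function}: 0x{block_api_hash(module, function):08X}")
```

The stub only ever carries the resulting 32-bit number, which is why no function name strings need to appear in the payload.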