I recently thought of an extreme way of implementing security by obscurity and wanted to ask you guys if it's possible.
Would a person with no access to special processor documentation be able to change the CPU's microcode in order to obfuscate the machine's instruction set?
What else would need to be changed for a machine to boot with such a processor - would BIOS manipulation be enough?
You're very much getting into the realm of "Here be dragons"
when you look into hardware manipulation like this. I don't know of any
research or in-the-wild attack that has done any practical
experimentation with this, so my answer will be purely academic.
First, it's probably best if I explain a bit about how microcode works. If you're already clued up on this stuff, feel free to skip ahead, but I'd rather include the details for those who don't know. A microprocessor consists of a huge array of transistors on a silicon die, interconnected in a way that provides a set of useful basic functions. These transistors change state in response to internal voltage levels, or to transitions between voltage levels. The transitions are triggered by a clock signal, which is a square wave that switches between high and low voltage at a high frequency - this is where we get "speed" measurements for CPUs, e.g. 2 GHz. Every time the clock signal switches from low to high voltage, a single internal change is made. This is called a clock tick. In the simplest devices a single clock tick might constitute a whole programmed operation, but these devices are extremely limited in terms of what they're capable of doing.
As processors have gotten more complex, the amount of work that needs to be done at the hardware level to provide even the most basic operations (e.g. an addition of two 32-bit integers) has increased. A single native assembly instruction (e.g. add eax, ebx) might involve quite a lot of internal work, and microcode is what defines that work. In this simplified picture, each clock tick performs a single microcode instruction, and a single native instruction might involve hundreds of microcode instructions.

Let's look at an extremely simplistic version of a memory read, for the instruction mov eax, [01234000], i.e. move a 32-bit integer from memory at address 01234000 into an internal register. First, the processor has to read the instruction from its internal instruction cache, which is a complicated task in itself. Let's ignore the details for now, but it involves a lot of operations inside the control unit (CU) that parse the instruction and prime various other internal units.

Once the control unit has parsed the instruction, it has to execute a group of microinstructions to perform the operation. First, it needs to check that the system memory pipeline is ready for a new command (remember that memory chips take commands too) so that it can do a read. Next, it needs to send a read command to the pipeline and wait for it to be serviced. The read does not complete within the same tick, so the CPU must wait for a signal (in effect, an internal interrupt) saying that the operation has completed. Once that signal is raised, the CPU continues with the instruction.

The next operation is to move the new value from memory into an internal register. This isn't as simple as it sounds - the registers you would normally recognise (eax, ebx, ecx, edx, ebp, etc.) aren't fixed to a particular physical set of transistors in the chip. In fact, a CPU has a lot more physical internal registers than it exposes, and it uses a technique called register renaming to optimise the translation of incoming, outgoing and processed data. So the actual data from the memory bus has to be moved into a physical register, then that register has to be mapped to an exposed register name. In this case we'd be mapping it to eax.

All of the above is a simplification - the real operation might involve a lot more work, or might be handled by a dedicated internal device. As such, you might be looking at a large sequence of microinstructions that do very little on their own but add up to a single instruction. In some cases special microinstructions are used to trigger asynchronous internal hardware operations that handle a particular operation, designed to improve performance.
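To make the memory-read walk-through above a bit more concrete, here is a toy Python model of it. Everything in it is made up for illustration: the ToyCore class, the micro-op names, the eight physical registers and the three-tick memory latency are all assumptions, and no real CPU is organised this simply. It only shows the shape of the idea, i.e. one small step per tick, with the loaded value landing in a physical register that is then mapped to the name eax.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Toy model: one hypothetical micro-op per clock tick."""
    memory: dict
    # More physical registers than architectural names (assumed count: 8).
    phys_regs: list = field(default_factory=lambda: [0] * 8)
    # Architectural name -> index of the physical register holding its value.
    rename_map: dict = field(default_factory=dict)
    free_list: list = field(default_factory=lambda: list(range(8)))
    ticks: int = 0
    trace: list = field(default_factory=list)

    def tick(self, micro_op):
        """Advance the clock by one tick and record which micro-op ran."""
        self.ticks += 1
        self.trace.append(micro_op)

    def mov_reg_mem(self, reg, addr, latency=3):
        """Model 'mov reg, [addr]' as a sequence of tiny steps."""
        self.tick("check_mem_ready")      # is the memory pipeline free?
        self.tick("issue_read")           # send the read command
        for _ in range(latency):          # wait for the read to be serviced
            self.tick("wait")
        phys = self.free_list.pop(0)      # pick any free physical register
        self.phys_regs[phys] = self.memory[addr]
        self.tick("write_phys")
        old = self.rename_map.get(reg)    # previous mapping, if any
        self.rename_map[reg] = phys       # the name now points at the new register
        if old is not None:
            self.free_list.append(old)    # recycle the old physical register
        self.tick("rename")

    def read(self, reg):
        """Read an architectural register through the rename map."""
        return self.phys_regs[self.rename_map[reg]]


core = ToyCore(memory={0x01234000: 0xDEADBEEF})
core.mov_reg_mem("eax", 0x01234000)
print(core.trace)             # the micro-ops, in the order they ran
print(core.ticks)             # 7 ticks for one "instruction" in this toy
print(hex(core.read("eax")))  # 0xdeadbeef
```

Note that the name eax never refers to a fixed slot: a second load into eax would land in a different physical register and simply update the mapping.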
As you can see, microcode is immensely complicated. Not only does it vary wildly between CPU types, but also between release versions and revisions. This makes it a difficult thing to target - you can't really tell what microcode is programmed into the device. Not only that, but the way the microcode is programmed into the chip is also specific to each processor. On top of that, it's undocumented and checksummed, and may be subject to signature checks too. You'd need some serious hardware to reverse engineer the mechanisms and checks.
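As a rough idea of what "checksummed" can mean here, below is a minimal Python sketch of a 32-bit additive checksum, where the blob is valid only if all of its little-endian dwords sum to zero modulo 2^32. As far as I know this matches the style of check described for Intel's published update format, but treat that as an assumption. The helper names are hypothetical, and the layout is a toy: the balancing dword is simply appended at the end, which is not the real file format.

```python
import struct


def dword_sum(blob: bytes) -> int:
    """Sum all little-endian 32-bit words of blob, modulo 2**32."""
    if len(blob) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    count = len(blob) // 4
    return sum(struct.unpack(f"<{count}I", blob)) & 0xFFFFFFFF


def checksum_ok(blob: bytes) -> bool:
    """The blob is considered intact when its dwords sum to zero."""
    return dword_sum(blob) == 0


def fix_checksum(body: bytes) -> bytes:
    """Append one dword that makes the total sum zero (toy layout only)."""
    balance = (-dword_sum(body)) & 0xFFFFFFFF
    return body + struct.pack("<I", balance)


body = struct.pack("<3I", 1, 2, 0xFFFFFFFF)
blob = fix_checksum(body)
print(checksum_ok(blob))      # True
tampered = bytes([blob[0] ^ 1]) + blob[1:]
print(checksum_ok(tampered))  # False
```

A check like this only catches corruption, and anyone who knows the scheme can recompute it after making changes. It's the signature checks that actually stand in the way of loading modified microcode.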
Let's assume for a moment that you could overwrite microcode in a useful way. How would you make it do anything useful? Keep in mind that each microinstruction simply shifts some values around in the internals of the hardware, rather than performing a real operation. Obfuscating opcodes by juggling microcode around would require a completely custom OS and bootloader, though the BIOS would (likely) still work. Unfortunately, more modern systems use UEFI rather than the old BIOS spec, and UEFI firmware itself involves executing a fair amount of code on the CPU during boot (starting in real mode). This means you'd need an entirely new BIOS and OS, all written from scratch. Hardly a useful obfuscation method. On top of that, you may not even be able to remap instructions, because the seemingly arbitrary byte values aren't so arbitrary - the individual bits map to codes that select different areas of the CPU internals. Changing them might break the CPU's ability to even parse the instruction data.
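One documented example of that structure is the ModRM byte in x86 encodings, which packs three bit fields (mod, reg, rm) into a single byte. The Python sketch below decodes only the two-byte register-to-register form of add (opcode 01 followed by a ModRM byte with mod = 3) in 32-bit operand size. The function names are hypothetical, and it is nowhere near a real decoder. Whether these fields literally select areas of the CPU internals is the claim made above and not something the sketch demonstrates.

```python
# 32-bit register names in their x86 encoding order (0..7).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte: int):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


def decode_add_r32_r32(code: bytes) -> str:
    """Decode only 'add r/m32, r32' (opcode 0x01) with two register operands."""
    if len(code) != 2 or code[0] != 0x01:
        raise ValueError("only the 2-byte form of opcode 0x01 is handled")
    mod, reg, rm = split_modrm(code[1])
    if mod != 0b11:
        raise ValueError("memory operand forms are not handled in this sketch")
    # For opcode 0x01 the rm field is the destination and reg is the source.
    return f"add {REG32[rm]}, {REG32[reg]}"


print(split_modrm(0xD8))                        # (3, 3, 0)
print(decode_add_r32_r32(bytes([0x01, 0xD8])))  # add eax, ebx
```

So the add eax, ebx from earlier comes out as the bytes 01 D8, and flipping individual bits in the second byte changes which registers are used rather than producing an unrelated instruction.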
A more interesting exercise would be to implement a new instruction that transitions you from ring3 to ring0 and another that switches back, all without performing any checks. This would allow you to do some fun stuff with privilege escalation without ever needing OS-specific backdoors.
http://security.stackexchange.com/questions/29730/processor-microcode-manipulation-to-change-opcodes