Horay, finally official multi-byte NOPs are supported by Intel 80×86 processors. So OK, not all processors support it yet, and you have to check the result of a cpuid instruction to know whether you can use it or not, but it’s a good start nevertheless. It is unnecessary to mention that this kind of NOP is a real NOP, and yet I just did. :)
So why do we care you ask?
Well, probably you certainly don’t. But some code generator tools will use it in the future, or hand written code in assembly. It will make the code more understandable and it will fit all sizes you need from 2 bytes up to 9 bytes. However, to be honest, I can’t really grasp the reason why they actually support it. I mean, one could write X single byte NOPs. But then I thought, the processor will spend time for decoding the instructions in the pipeline until it will realize it’s a NOP. So I guess the multi-byte NOP will be faster than a varying number of NOPs in a row. I tried to think what’s the benefit of such thing. Most of the NOPs or pseudo NOPs are used to align code to 8 or 16 bytes boundaries. This is probably for caching reasons and faster memory read operations. But if the NOPs are used for alignment usually they will follow a branch and (sometimes) even conditional branch, so they might not get to be executed most of the times. So why the effort for making such a new instruction? I can’t come up with a good answer. Got any idea? Ah, and if you were to align your code and it won’t run then why not dumping zeroes as a padding? And it’s not like 80×86 architecture has delayed branch… So they had a good reasoning for coming up with this instruction. The only real answer I came up with is for time critical code where you have to measure your code in MS…
Maybe you already saw some pseudo NOPs in the past. These are real instructions that don’t affect registers or flags of the current execution context. You can come up with many variations of such NOPs using the LEA instruction:
LEA REG, [REG+0]
And you can come up with less popular NOPs, MOV REG, REG…
The advantage of the LEA instruction is that it accepts a complex addressing mode operand so you can make the whole instruction bigger easily by using different sizes of the immediate ‘0’. If you don’t understand how the ModR/M is formatted, you will have to assemble the code and disassemble it until it fits the size you require. Not mentioning that you can’t control 100% of the code generated by the assembler that you would prefer to use the DB thingy directly to emit the pseudo NOP. The multi-byte NOP accepts the same source operand as LEA accepts, therefore its size varies.
Well, you might find yourself find of ADD EAX, 0 as a NOP. Probably because in math it doesn’t mean anything special. But while executed it affects the flags. Though, you might say it’s not a big deal and that depends where you dump the pseudo NOPs in your code. However, it seems that even a popular assembler had some bad issues with generating pseudo NOPs. And believe me, when you trust a tool and it misbehaves, you might get crazy until you realize who’s to blame!
Now I have to make diStorm support it ;)
NOP NOP NOP!!!1