What is the purpose of XORing a register with itself?
Assembly Problem Overview
xor eax, eax will always set eax to zero, right? So, why does MSVC++ sometimes put it in my executable's code? Is it more efficient than mov eax, 0?
012B1002 in al,dx
012B1003 push ecx
int i = 5;
012B1004 mov dword ptr [i],5
return 0;
012B100B xor eax,eax
Also, what does it mean to do in al, dx?
Assembly Solutions
Solution 1 - Assembly
Yes, it is more efficient.
The encoding is shorter than that of mov eax, 0 (only 2 bytes instead of 5), and the processor recognizes the special case and treats it like a mov eax, 0 without a false read dependency on eax, so the execution time is the same.
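To make the size difference concrete, here is a small sketch that stores the 32-bit machine-code bytes of both instructions in C arrays and reports their lengths. The byte values are the standard encodings (33 C0 for xor eax, eax as MSVC typically emits it, B8 followed by a 32-bit immediate for mov eax, 0); the array and function names are made up for this illustration.

```c
#include <stddef.h>

/* Machine-code bytes for the two ways of zeroing eax in 32-bit mode. */
const unsigned char XOR_EAX_EAX[] = { 0x33, 0xC0 };                   /* xor eax, eax */
const unsigned char MOV_EAX_0[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 }; /* mov eax, 0   */

/* Hypothetical helpers: how many bytes each encoding occupies. */
size_t xor_encoding_size(void) { return sizeof XOR_EAX_EAX; }
size_t mov_encoding_size(void) { return sizeof MOV_EAX_0; }
```

Calling xor_encoding_size() gives 2 and mov_encoding_size() gives 5. Four of the five mov bytes are just the immediate zero.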
Solution 2 - Assembly
It is also used to avoid zero bytes in the compiled code, as in shellcode for exploiting buffer overflows and the like. Why avoid 0? Because 0 marks the end of a string in C/C++, so the shellcode would be truncated if the exploit goes through a string-processing function or similar.
By the way, I'm referring to the original question, "Any reason to do a "xor eax, eax"?", not to what the MSVC++ compiler does.
Since there's some debate in the comments about how this is pertinent in the real world, see this article and this section on Wikipedia.
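A rough sketch of the truncation problem, assuming the bytes pass through strcpy: the two payloads below hold the machine code for "zero eax, then ret" written both ways. The names PAYLOAD_XOR, PAYLOAD_MOV and copied_bytes are invented for this example, and the 16-byte buffer is only large enough for these two inputs.

```c
#include <string.h>

/* xor eax, eax ; ret   followed by the C string terminator */
const char PAYLOAD_XOR[] = { '\x33', '\xC0', '\xC3', '\0' };
/* mov eax, 0 ; ret     followed by the C string terminator */
const char PAYLOAD_MOV[] = { '\xB8', '\0', '\0', '\0', '\0', '\xC3', '\0' };

/* Copy a payload the way a string function would and report
   how many bytes actually arrived in the destination. */
size_t copied_bytes(const char *payload) {
    char buffer[16] = { 0 };
    strcpy(buffer, payload);   /* stops at the first zero byte */
    return strlen(buffer);
}
```

copied_bytes(PAYLOAD_XOR) returns 3, so all three code bytes survive the copy. copied_bytes(PAYLOAD_MOV) returns 1, because the copy stops at the first zero byte of the immediate.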
Solution 3 - Assembly
xor eax, eax is a faster way of setting eax to zero. This is happening because you're returning zero.
The in instruction deals with I/O ports. Here it reads a byte of data from the port specified in dx and stores it in al. It's not clear why it is happening here. Here's a reference that seems to explain it in detail.
Solution 4 - Assembly
Another reason to use XOR reg, reg or XORPS reg, reg is to break dependency chains. This allows the CPU to execute the surrounding instructions in parallel more efficiently (even if it adds some instruction throughput pressure).
Solution 5 - Assembly
The XOR operation is indeed very fast. If the result is to set a register to zero, the compiler will often do it the fastest way it knows. A bit operation like XOR might take only one CPU cycle, whereas a copy (from one register to another) can take a small handful.
Often compiler writers will even have different behaviors given different target CPU architectures.
Solution 6 - Assembly
From the OP:
> any reason to do "xor eax,eax"
return 0;
012B100B xor eax,eax
ret <-- OP doesn't show this
XOR EAX,EAX simply zeroes out the EAX register. It executes faster than MOV EAX,$0 and doesn't need to fetch an immediate 0 to load into eax.
It's clearly the "return 0" that MSVC is optimizing: EAX is the register used to return a value from a function in MSVC.
Solution 7 - Assembly
xor is often used to encrypt code, for example:
mov eax,[ecx+ValueHere]
xor eax,[ecx+ValueHere]
mov [ebx+ValueHere],esi
xor esi,[esp+ValueHere]
pop edi
mov [ebx+ValueHere],esi
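As a minimal sketch of the idea in C rather than assembly: XORing every byte with a key scrambles the data, and XORing with the same key again restores it. The function name xor_bytes and the single-byte key are assumptions made for this illustration. This is simple obfuscation, not a secure cipher.

```c
#include <stddef.h>

/* XOR every byte of data with a single-byte key, in place.
   Applying the same key a second time undoes the first pass,
   because (b ^ k) ^ k == b for any byte b. */
void xor_bytes(unsigned char *data, size_t len, unsigned char key) {
    for (size_t i = 0; i < len; i++) {
        data[i] ^= key;
    }
}
```

For example, calling xor_bytes(msg, 5, 0x5A) on the bytes of "hello" changes all of them, and a second identical call brings "hello" back.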
The XOR instruction combines two values using a logical exclusive OR (remember that OR uses inclusive OR). To understand XOR better, consider these two binary values:
1001010110
0101001101
If you XOR them, the result is 1100011011. When two bits on top of each other are equal, the resulting bit is 0; otherwise the resulting bit is 1. You can use calc.exe to calculate XOR.
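The same calculation can be checked without calc.exe. The sketch below parses the two bit strings and formats a result back into binary; parse_binary and to_binary are hypothetical helper names, not part of any library.

```c
#include <stdlib.h>

/* Parse a string of '0'/'1' characters as a base-2 number. */
unsigned long parse_binary(const char *bits) {
    return strtoul(bits, NULL, 2);
}

/* Write the lowest `width` bits of value into out as '0'/'1' text.
   out must have room for width + 1 characters. */
void to_binary(unsigned long value, int width, char *out) {
    for (int i = 0; i < width; i++) {
        out[i] = ((value >> (width - 1 - i)) & 1UL) ? '1' : '0';
    }
    out[width] = '\0';
}
```

With a = parse_binary("1001010110") and b = parse_binary("0101001101"), formatting a ^ b to 10 bits gives 1100011011, while the inclusive a | b gives 1101011111. Since equal bits always produce 0, a ^ a is 0 for any a, which is why xor eax, eax clears the register.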