Thursday, July 8, 2010

xor swap trick and add/sub swap trick

This is a pretty common bitwise trick most veteran coders know about, but its probably a good bitwise trick to start this section out with.

If you have 2 integers, x and y, you can swap their values normally by doing:

// Normal Swap Function
_f void swap(int& x, int& y) {
    int z = x;
    x = y;
    y = z;
}


But this requires a temporary variable/register to be used (the "int c").

You can do this same functionality without using a temporary variable at all, using the xor swap trick:

// XOR Swap Function
_f void swap(int& x, int& y) {
    assert(&x != &y);
    x ^= y;
    y ^= x;
    x ^= y;
}


The result ends up being the same as the first function, but notice now it doesn't use any temporary variables.
This trick can be useful if you're doing some low-level asm work, or reg-alloc in a recompiler, or for any reason you don't have an extra register to spare and your instruction set doesn't have an xchg opcode. Admittingly, if you're just sticking to c++ code, might just be better to use the normal swap routine instead.

Also beware! This trick has a big problem if &x == &y, that is, if x and y are the same exact variable. In that case instead of keeping the values the same, you end up zeroing out the value!
So the xor swap trick should only be used when you know you are dealing with 2 distinct 'x' and 'y' variables. That is why we put the assert(&x != &y) line in this function, so on debug builds we will get an error if the references of x and y point to the same exact variable.

Lastly I'll point out that you can do the same trick using add and sub operations, but its a bit messier and not as useful:

// Add/Sub Swap Function
_f void swap(int& x, int& y) {
    assert(&x != &y);
    x += y;
    y  = x - y;
    x -= y;
}


The problem with this code is the second instruction.
For many instruction sets (for example SSE, MMX, or x86-32), you're stuck with "dest, src" syntax for instructions.
This means that the compiled code for this will still end up having to use some type of temporary register or the stack to compute this.

If you're dealing with another architecture however that has 3-operand syntax "dest, src1, src2", then this is more useful since you can do the second line with just 1 instruction.

Notes:

I use "_f" as a macro for "__forceinline" so:
// Forceinline Macro for in-lining functions
#define _f __forceinline


Also note that this trick can be extended to be used with floats, doubles, and etc...
But you have to be sure to do the operations as 32bit integers or 64bit integers respectively.

Here's an example of swapping 2 floating point singles with this trick:

// Add/Sub Swap Function
_f void swap(int& x, int& y) {
    assert(&x != &y);
    x ^= y;
    y ^= x;
    x ^= y;
}

// Float Swap Function
_f void swap(float& x, float& y) {
    swap((int&)x, (int&)y);
}


Technically this isn't very useful when sticking with high-level c++, especially since floating point operations are generally compiled to use the x87 fpu (so the floats will be moved back into gprs, and end up being slower than a regular swap); also this will confuse the compiler more so that's another reason it won't generate as good code; but if you're dealing with SSE directly, the xor trick can be useful.

2 comments:

  1. Double-check the use of your temporary variable, c, in your Normal Swap Function (at the top).

    ReplyDelete
  2. Thanks greg for spotting the typo, I corrected the error.

    ReplyDelete