Saturday, September 4, 2010

Using the volatile keyword to prevent float precision problems

Consider the following code:


int main() {
for(float i = 0; i < 100; i++) {
float f = i / 100.0f;
if (f != i / 100.0f) {
cout << i << ", ";
}
}
cout << endl;
return 0;
}


One should expect the above code to not print out anything, since the 2 operations are obviously equal.
Ironically, if you compile this code in msvc or gcc, you will most-likely get a bunch of numbers spammed to the console.

Why is this? Well it seems that the compilers take liberties with floating point calculations; even though you're dealing with single-precision 32bit floats, the compilers may use the full 80bit precision of the x87 FPU in calculations, or may even use double-floating point arithmetic for one of the operations, and use single-floating point for another one... mix and matching precisions, causing problems...
In the end, the different precision of the operations will cause the results of two seemingly equal operations to be unequal.

One solution is to use the volatile keyword.
The volatile keyword essentially makes the compiler always write-back values to memory once an operation is preformed, and it always reloads the value from memory when it will be used again.
This allows us to force the compiler to truncate the values to 32bit floats when it writes them to memory, and when it loads the value again it will be limited to the precision of a 32bit float.

So now we can do:

int main() {
for(float i = 0; i < 100; i++) {
volatile float f1 = i / 100.0f;
volatile float f2 = i / 100.0f;
if (f1 != f2) {
cout << i << ", ";
}
}
cout << endl;
return 0;
}


This will fix the problems we were having before, and not print out anything.

There are many papers and discussions about ways to compare floating point values, and I won't go into more detail here.
This site lists a few methods, and the problems they could have.
So I suggest reading that if you're interested.

2 comments:

  1. Volatile means the variable has to be reloaded from memory every single time its used, then stored again. For x86 code using the x87 FPU, this will round to 32bit at every step. You can also get this behaviour with gcc -ffloat-store.

    If you're using SSE instead of x87 math, since all x86 hardware has supported SSE for over 10 years, calculations are already done in single-precision in SSE registers. (gcc -mfpmath=sse is the default for 64bit code, because SSE2 is a required part of x86-64, and FP args are passed / returned in SSE registers. It's not the default for 32bit, because the ABI still passes float return values in ST(0).)

    ReplyDelete