# Strange differences in float calculations

• November 25th, 2009, 08:57 AM
evertonland
Strange differences in float calculations
Dear all,

I'm doing some work with images, and I've come across the strangest thing:

I have an array of float values, if I initialize it to zero and sum a series of floats on the array itself I obtain one result.
On the other hand if I use an auxiliary variable as an accumulator and after summing all values I assign it to the corresponding position in my array, the result is slightly different to the former one.

I have no idea why this is and would appreciate if anyone could give me any tips here. I'm using g++ 4.3.3.
I post below the excerpt of the code in question and briefly comment on the variables.

- nCols, nRows: numbers of columns and rows of my images.
- u, meanu, normu: different images. The data elements are of type unsigned char for 'u' and float for the others.
- _(img, x, y): this is an accessor macro. The data is just stored in an array, so the macro computes the offset corresponding to the indexes x and y.

The commented lines are the alternative that yields different results.

Code:

```
for (int x = limit; x < nCols - limit; x++) {
    for (int y = limit; y < nRows - limit; y++) {
        //float accum = 0.0;
        _(normu,x,y) = 0.0;
        for (int i = -halfWindow; i <= halfWindow; i++) {
            for (int j = -halfWindow; j <= halfWindow; j++) {
                //accum += (float)(_(u,x+i,y+j)-_(meanu,x,y))*(_(u,x+i,y+j)-_(meanu,x,y));
                _(normu,x,y) += (float)(_(u,x+i,y+j)-_(meanu,x,y))*(_(u,x+i,y+j)-_(meanu,x,y));
            }
        }
        //_(normu,x,y) = accum;
    }
}
```
You can think of _(normu,x,y) as normu->gray[x + y*nCols], where gray is an array of type float.

If you need any more details let me know and I'll be happy to provide them, I tried to keep the post as short as possible.

Thank you!
• November 25th, 2009, 09:12 AM
Lindley
Re: Strange differences in float calculations
Floating point is simply inaccurate to a small degree. 32-bit floats especially so. Working with floating point, you simply have to accept that and know the rules which let you minimize the error.

As for why it's coming out differently, I suspect the compiler is optimizing the two approaches differently.
• November 25th, 2009, 10:13 AM
monarch_dodra
Re: Strange differences in float calculations
I didn't take the time to read your code, but as Lindley said, floats are just plain inexact (not to be confused with inaccurate).

When you use floats, you have to accept things like this:
Code:

```
a + b + c =/= c + b + a;
a + b - a =/= b;
```
As I said, I didn't read your code, but if you did ANYTHING in your code that changed the order in which you manipulated your data, then that is your explanation. Not that one version is wrong per se; they are just both slightly inexact, but probably close to each other and accurate.

I'm not good at explaining why, but if you take the time to understand what a float is, it should become obvious to you.

floats are good, but please please please make sure you understand the hows and whys of them, or you will be in a world of pain sooner or later.
• November 25th, 2009, 10:43 AM
Paul McKenzie
Re: Strange differences in float calculations
Quote:

Originally Posted by evertonland
Dear all,

I'm doing some work with images, and I've come across the strangest thing:

I have an array of float values, if I initialize it to zero and sum a series of floats on the array itself I obtain one result.
On the other hand if I use an auxiliary variable as an accumulator and after summing all values I assign it to the corresponding position in my array, the result is slightly different to the former one.

The issue with floating point calculations has little to do with the compiler you're using, and everything to do with computer science and numerical analysis.

A binary computing machine, i.e. your computer, cannot represent most decimal floating point values exactly. If you know the math, try to represent 0.3 exactly in binary. You can't, and neither can the computer. So an approximation is made. So right there, you're in trouble.

The only decimal floating point values that can be represented exactly in binary are those that are finite sums of negative powers of 2 (times a power of 2), e.g. 1/2, 1/4, 3/8, 1/16, etc.

If you juggle floating point calculations around, use temp variables, etc. then you're upsetting the round-off apple cart. That is the bottom line.

Regards,

Paul McKenzie
• November 25th, 2009, 10:52 AM
evertonland
Re: Strange differences in float calculations
Thank you both for the quick replies.

@Lindley: I was aware that there are compiler optimizations that affect the results of float calculations, but I hadn't thought of it optimizing standalone floats and float arrays differently. Thanks for the idea, that might be it. Do you think this could still be the case even though I am compiling with the -O0 flag, in order to turn off optimization?

@monarch_dodra: while looking around for the cause of what I'm telling you about I came across the kind of things you mention, but I think mine is stranger. I'll strip it to the bone for you:

Code:

```
float x = a + b;
float xArray[10];
xArray[0] = a + b;
x != xArray[0];
```
Furthermore, I've checked just in case, and adding a + b always yields the same result; the difference is in whether I assign it to a standalone float variable or to an element of an array. Is that freaky or what? :)

• November 25th, 2009, 02:22 PM
monarch_dodra
Re: Strange differences in float calculations
Quote:

Originally Posted by evertonland
Thank you both for the quick replies.

@Lindley: I was aware that there are compiler optimizations that affect the results of float calculations, but I hadn't thought of it optimizing standalone floats and float arrays differently. Thanks for the idea, that might be it. Do you think this could still be the case even though I am compiling with the -O0 flag, in order to turn off optimization?

@monarch_dodra: while looking around for the cause of what I'm telling you about I came across the kind of things you mention, but I think mine is stranger. I'll strip it to the bone for you:

Code:

```
float x = a + b;
float xArray[10];
xArray[0] = a + b;
x != xArray[0];
```
Furthermore, I've checked just in case, and adding a + b always yields the same result; the difference is in whether I assign it to a standalone float variable or to an element of an array. Is that freaky or what? :)

My (quick) guess is that one of the calculations is optimized away during compile time, whereas the other one is done at run time. The way the compiler "calculates" a+b might be different from the way the compiled executable does it.

Or it might be something completely unrelated, but that is my guess.

EDIT: could you provide us with your numeric values and code? I'd like to try it out on my machine.
• November 26th, 2009, 04:33 AM
evertonland
Re: Strange differences in float calculations

@monarch_dodra: If you want to play around with the code, I'd be happy to let you have it. I just have to check with my boss first. That being said, the code is tied to an image processing library that you'd have to compile yourself. Let me know if your curiosity trumps the bother this represents.

Off-topic noob question: Are there private messages on this forum? I looked at the FAQ but it appears to me that they are disabled.

Thanks.
• November 26th, 2009, 05:44 AM
laserlight
Re: Strange differences in float calculations
Quote:

Originally Posted by evertonland
Off-topic noob question: Are there private messages on this forum? I looked at the FAQ but it appears to me that they are disabled.

Off-topic noob answer: you may need to make a certain number of posts before they are enabled.
• November 26th, 2009, 02:41 PM
Zaccheus
Re: Strange differences in float calculations
Quote:

Originally Posted by evertonland
Furthermore, I've checked just in case that adding a + b will always yield the same result, the difference is in whether I assign it to a float standalone variable or to a value within an array. Is that freaky or what? :)

Some CPUs (or FPUs) do the calculations in 80-bit precision on the chip, but the compiler stores doubles as 64 bits (and floats as 32 bits, etc.) in RAM.

It is possible that your 'standalone variable' is in fact held in a register, while the array is obviously in RAM.

Found this out during a project at work; we were going a bit nuts because changing seemingly unrelated code near floating point calculations was changing the results! :ehh:
• November 26th, 2009, 05:18 PM
Amleto
Re: Strange differences in float calculations
Does declaring floats/doubles as volatile help alleviate that unexpected behaviour? It makes it less likely that values are stored in registers, if my understanding is correct.
• November 27th, 2009, 02:37 AM
monarch_dodra
Re: Strange differences in float calculations
Quote:

Originally Posted by Amleto
Does declaring floats/doubles as volatile help alleviate that unexpected behaviour? It makes it less likely that values are stored in registers, if my understanding is correct.

Define what "unexpected behavior" is. Floating point calculations are by definition inexact. Relying on tricks like declaring your floats volatile in the hope of controlling whether they are kept in registers is not only dangerous, but throws you into a world of non-portability and machine/compiler/everything assumptions. The slightest change anywhere on your system could literally make your program explode.

Regardless of what you are doing, when you are using floating point calculations, you have to accept that your results might never be exact, or the same as another calculation that might theoretically provide the same result, but doesn't.

If you do that, then you won't even care that both results aren't the same, because your system was built to work on data that is not 100% accurate.
• November 27th, 2009, 10:36 AM
Zaccheus
Re: Strange differences in float calculations
Also, 'volatile' is the wrong keyword anyway; you'd want 'register', but most compilers ignore that keyword these days.

As monarch_dodra said, just write your code expecting inaccuracy.