# A question regarding float and double

• July 15th, 2013, 06:38 PM
LarryChen
A question regarding float and double
Suppose we want to determine the position of the decimal point in a float or a double. Here is what I do:
Code:

```
float d;
int count = 0;
while(d!=(int)d)
{
    d *= 10;
    count++;
}
```
Here count is the position of the point. Suppose d = 3.1415: at the end of the code, count = 7 and d = 31415002. This is not what I want; I expected d = 31415 and count = 4. It is even worse when d is a double: I encounter an infinite loop. Why? And what is a reliable way to do what I expect? Thanks.
• July 15th, 2013, 07:51 PM
GCDEF
Re: A question regarding float and double
Quote:

Originally Posted by LarryChen
Suppose we want to determine the position of the decimal point in a float or a double. Here is what I do:
Code:

```
float d;
int count = 0;
while(d!=(int)d)
{
    d *= 10;
    count++;
}
```
Here count is the position of the point. Suppose d = 3.1415: at the end of the code, count = 7 and d = 31415002. This is not what I want; I expected d = 31415 and count = 4. It is even worse when d is a double: I encounter an infinite loop. Why? And what is a reliable way to do what I expect? Thanks.

http://lmgtfy.com/?q=floating+point+precision
• July 15th, 2013, 09:39 PM
Paul McKenzie
Re: A question regarding float and double
Quote:

Suppose d = 3.1415,
It isn't.
Quote:

I encounter an infinite loop. Why?
0.1415 cannot be represented exactly as a sum of inverse powers of 2, therefore it cannot be represented exactly as binary, so it is approximated. Then your code multiplies this inexact binary floating point number by 10.
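The approximation can be made visible by printing the stored value with more digits than the literal had. This is only a minimal sketch, not code from the thread; the helper names `stored_value` and `literal_is_inexact` are hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats the value actually stored in a float
// using up to `digits` significant digits.
std::string stored_value(float f, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << f;
    return out.str();
}

// True when the float nearest to 3.1415 and the double nearest to 3.1415
// are different numbers, i.e. the float is only an approximation.
bool literal_is_inexact() {
    return static_cast<double>(3.1415f) != 3.1415;
}
```

Calling `stored_value(3.1415f, 20)` shows a long tail of digits rather than the four that were typed, which is why repeatedly multiplying by 10 never lands on 31415 exactly.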

The bottom line is not to use floating point variables to control how many times a loop will execute. Always use integers. Inside the loop you can do floating point calculations, but never use floating point calculations to determine loop constraints.
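As a rough illustration of that advice (a sketch only; `sample_points` is a hypothetical name), the loop below is driven by an integer counter, and each floating-point value is derived from the counter rather than tested for termination.

```cpp
#include <vector>

// Hypothetical helper: returns n values start, start + step, start + 2*step, ...
// The integer i alone decides how many iterations run; the floating-point
// value is computed inside the loop and never used as the loop condition.
std::vector<double> sample_points(double start, double step, int n) {
    std::vector<double> points;
    for (int i = 0; i < n; ++i) {
        points.push_back(start + i * step);
    }
    return points;
}
```

With this shape, `sample_points(0.0, 0.1, 10)` always runs exactly ten times, whatever rounding happens inside the body.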

http://www.parashift.com/c++-faq/floating-pt-errs.html
http://www.parashift.com/c++-faq/flo...int-arith.html
http://www.parashift.com/c++-faq/flo...nt-arith2.html

Regards,

Paul McKenzie
• July 16th, 2013, 05:42 AM
2kaud
Re: A question regarding float and double
The number of 'digits' to the right of the '.' in a stored floating-point variable is effectively unknown, because the internal representation is binary and not every decimal fraction can be represented exactly in binary (in the same way that 1/3 cannot be represented exactly in decimal). Hence any attempt to determine this number is not guaranteed to produce the expected value. One possible way of doing it would be to convert the floating-point value to a string and find the position of the '.'.
• July 16th, 2013, 06:50 AM
2kaud
Re: A question regarding float and double
You might like to experiment with this

Code:

```
#include <stdio.h>
#include <string.h>

int main()
{
    const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};
    const int noval = sizeof(f) / sizeof(f[0]);
    char s[100];
    char *end, *pt;

    for (int i = 0; i < noval; i++) {
        /* %f always prints six digits after the point */
        sprintf(s, "%f", f[i]);

        /* strip trailing zeros, leaving end on the last remaining character */
        for (end = strrchr(s, 0) - 1; *end == '0' && end != s; *end-- = 0);

        /* digits after the point = distance from '.' to the last character */
        printf("%-12s  %i\n", s, (pt = strchr(s, '.')) ? (int)(end - pt) : 0);
    }
    return 0;
}
```
This produces the following output

Code:

```
3.141596      6
12.34         2
1.11          2
2.234         3
2.            0
45.66         2
45.669998     6
23.4534       4
23.453501     6
```

Note the results for the last four values, which were set to 45.66, 45.67, 23.4534 and 23.4535.
• July 16th, 2013, 01:24 PM
LarryChen
Re: A question regarding float and double
Thanks so much for responding. But I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?
• July 16th, 2013, 02:53 PM
Paul McKenzie
Re: A question regarding float and double
Quote:

Originally Posted by LarryChen
Thanks so much for responding. But I am just wondering if there is any reliable way to

Look at the links I gave you. Now given that, do you think there is any way to do "exact" things with floating point numbers? Even if you claim you know a way, there will be that stray number that thwarts all of your work. So it isn't worth trying to fight with doubles to get them to do things they are not designed for.

Floating point numbers were never designed for what you're trying to do. It is accepted by persons using double that the numbers represented and calculations made will be approximations, given that the approximate values are within a certain tolerance. You see it for yourself -- you claimed that
Code:

`double d = 3.1415;`
is what you wrote, but this value cannot be represented as an exact binary floating point value. So already you're in trouble.

Either:

1) use strings (by converting the double to a string and working with the converted string), or
2) use fixed point numbers, or
3) use a library geared toward financial-based calculations, where approximations are not used and the numbers/calculations are always correct.
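Option 1 is easiest when the number is still available as text (for example, as the user typed it), because then no binary approximation has happened yet. The sketch below assumes exactly that; `decimals_in_text` is a hypothetical helper, and it does no validation of the input.

```cpp
#include <string>

// Hypothetical helper: counts the characters after the '.' in a decimal
// string such as "3.1415". Returns 0 when there is no '.' at all.
// Assumes plain notation (no exponent, no trailing spaces).
int decimals_in_text(const std::string& text) {
    const std::string::size_type dot = text.find('.');
    if (dot == std::string::npos) {
        return 0;
    }
    return static_cast<int>(text.size() - dot - 1);
}
```

Here `decimals_in_text("3.1415")` gives 4, which is the answer the original loop was hoping for, because the text never passed through a float.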

Regards,

Paul McKenzie
• July 16th, 2013, 03:12 PM
2kaud
Re: A question regarding float and double
Quote:

Originally Posted by LarryChen
Thanks so much for responding. But I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?

There is no reliable way even by converting as I stated in my post #4 and Paul has stated in his posts. Why do you want to know? What exactly are you trying to achieve?
• July 17th, 2013, 08:45 AM
OReubens
Re: A question regarding float and double
What you're trying won't work for very big numbers, because a float and a double can hold values that far exceed what an int can store (converting a value of 2^31 or more to a 32-bit int overflows, which can cause an infinite loop).
For a similar reason, it won't work for very small numbers such as 0.000000001, not because of overflow, but because of the errors that accumulate each time you multiply by 10.
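The overflow half of this can be sidestepped by not casting to int at all. A possible sketch (the name `has_fraction` is hypothetical) uses `std::modf` from `<cmath>`, which splits a value into whole and fractional parts while staying in floating point. It does nothing about the accumulated-error half.

```cpp
#include <cmath>

// Hypothetical helper: reports whether x has a non-zero fractional part.
// std::modf keeps the whole part as a double, so there is no conversion
// to int and therefore no out-of-range conversion for huge values.
bool has_fraction(double x) {
    double whole = 0.0;
    const double frac = std::modf(x, &whole);
    return frac != 0.0;
}
```

For example, `has_fraction(1e30)` is simply false, whereas `(int)1e30` is outside the range of int and its result is undefined.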

If you can accept that it only needs to function for a limited and predetermined range of float/doubles, then you may be able to get some code working.
So if, for example, you can accept restrictions such as it only needs to work "for any number between 0 and 999 with at most 4 digits after the decimal point", then you can adjust your code to work within those bounds. But without knowing the acceptable restrictions it's impossible even to state whether or not your question can be answered, regardless of whether you calculate it "the hard way" or convert it to a string and count characters. The post by 2kaud (#5) is subject to these as well. Or put another way, if the detour via a string solves your problem, then it can equally be solved by a purely mathematical method.
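Under restrictions of that kind, one possible shape for the code is sketched below. The function name `count_decimals`, its parameters, and the tolerance used in the examples are assumptions chosen for illustration; the tolerance only makes sense for a small, known range such as the 0 to 999 example.

```cpp
#include <cmath>

// Hypothetical helper: returns the smallest k in [0, max_digits] for which
// x * 10^k lies within `tol` of a whole number, or -1 if there is none.
// The loop is bounded by an integer, and each scaled value is computed
// directly from x instead of by repeated multiplication.
int count_decimals(double x, int max_digits, double tol) {
    for (int k = 0; k <= max_digits; ++k) {
        const double scaled = x * std::pow(10.0, k);
        if (std::fabs(scaled - std::round(scaled)) <= tol) {
            return k;
        }
    }
    return -1;  // more than max_digits decimals, or not a short decimal at all
}
```

With `max_digits = 4` and `tol = 1e-6`, 3.1415 yields 4 and 45.67 yields 2, while 1.0/3.0 yields -1 because it has no short decimal form.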

Regardless of your soft restrictions, your hard restrictions are going to be that you can't exceed the limitations of the double type, nor can you get any reliable answer if the error on any intermediate value would exceed your maximum accrued error. This is why float is unsuitable for many things: it simply doesn't have enough precision for anything that requires even the slightest amount of accuracy.

But you haven't answered the real question... What are you actually trying to do ?
• July 22nd, 2013, 09:43 AM
razzle
Re: A question regarding float and double
Quote:

Originally Posted by LarryChen
But I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?

Let's assume the fractional part of a floating point number is interpreted as an integer with a finite number of decimal digits.

Then the algorithm you suggest doesn't work because you compare for equality. As has been said, due to the nature of floating point that's not a good idea. You need to modify the algorithm to check whether d and int(d) are very close rather than equal, like this for example,

Code:

```
float d = 3.1415f;
int count = 0;
// needs <cmath> and <iostream>; std::abs picks the floating-point overload
while (std::abs(d - int(d)) / d > 0.000001) {
    d *= 10;
    count++;
}
std::cout << d << "/" << count << std::endl;
```
• July 22nd, 2013, 03:07 PM
Re: A question regarding float and double
Quote:

Originally Posted by LarryChen
... I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?

Of course there is: YOU decide how many digits you want to show after the decimal point! :)
Consider above-mentioned 1/3. Would you prefer 0.33 or 0.3333333333?
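In C++ that choice is usually made at output time with `std::fixed` and `std::setprecision`. A small sketch follows; the wrapper name `format_fixed` is hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats x with exactly `digits` digits after the point.
std::string format_fixed(double x, int digits) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(digits) << x;
    return out.str();
}
```

`format_fixed(1.0 / 3.0, 2)` gives "0.33" and `format_fixed(1.0 / 3.0, 10)` gives "0.3333333333"; the stored value is the same in both cases, only the presentation differs.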
• July 22nd, 2013, 05:53 PM
2kaud
Re: A question regarding float and double
Please can some kind guru explain what is happening here as I don't understand.

This code

Code:

```
#include <iostream>
#include <iomanip>
#include <math.h>

int main()
{
    const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};
    const int noval = sizeof(f)/sizeof(f[0]);

    for (int fi = 0; fi < noval; fi++) {
        float d = f[fi];
        int count = 0;
        while ((abs(d - int(d))) / d > 0.000001f) {
            d *= 10.0f;
            count++;
            //std::cout << "";
        }
        std::cout << std::setw(10) << f[fi] << "  " << count << std::endl;
    }
    return 0;
}
```
gives as output
3.1416 6
12.34 2
1.11 2
2.234 6
2 0
45.66 5
45.67 5
23.4534 5
23.4535 4

whereas this code

Code:

```
#include <iostream>
#include <iomanip>
#include <math.h>

int main()
{
    const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};
    const int noval = sizeof(f)/sizeof(f[0]);

    for (int fi = 0; fi < noval; fi++) {
        float d = f[fi];
        int count = 0;
        while ((abs(d - int(d))) / d > 0.000001f) {
            d *= 10.0f;
            count++;
            std::cout << "";
        }
        std::cout << std::setw(10) << f[fi] << "  " << count << std::endl;
    }
    return 0;
}
```
gives

3.1416 6
12.34 2
1.11 2
2.234 3
2 0
45.66 2
45.67 2
23.4534 5
23.4535 4

The only change is the `std::cout << "";` line inside the while loop (commented out in the first version, active in the second), which should have no effect on the calculations.

In the project properties, Floating-point consistency is set to Default Consistency. However, if I change this value to Improve Consistency then both versions of the program give the same result????
• July 23rd, 2013, 01:24 AM
razzle
Re: A question regarding float and double
Quote:

Originally Posted by 2kaud
Please can some kind guru explain what is happening here as I don't understand.

I cannot say but this works with your test cases,

Code:

```
const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};
const int noval = sizeof(f)/sizeof(f[0]);

for (int fi = 0; fi < noval; fi++) {
    float d = f[fi];
    int count = 0;
    while ((std::abs(d - int(d)))/d > 0.000001f) {
        count++;
        d = f[fi]*std::pow(10.0f,count);
    }
    std::cout << int(d) << "/" << count << std::endl;
}
```
As you can see, I've changed the accumulating d calculation to a direct multiplication by powers of 10, because otherwise small errors may accumulate and become big.

There are two major no-nos when doing floating point:

1. don't compare for equality, and
2. don't accumulate constants.

They're now both fixed in the above code. It should also work if modified for doubles, and then the zero range (the 0.000001) can be made even smaller. It should be close to the smallest relative difference the type can represent. If there's a choice between float and double, always go for the latter.
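One common way to express "very close rather than equal" is a relative comparison scaled by the type's machine epsilon, available as `std::numeric_limits<float>::epsilon()`. The sketch below is an assumption about what such a check could look like, not the code used in this post; `nearly_equal` and its default of 4 epsilons are hypothetical choices.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: true when a and b differ by no more than a few
// machine epsilons, relative to the larger of the two magnitudes.
// The default factor of 4 is an arbitrary illustrative choice.
bool nearly_equal(float a, float b, float factor = 4.0f) {
    const float scale = std::max(std::fabs(a), std::fabs(b));
    const float limit = factor * std::numeric_limits<float>::epsilon() * scale;
    return std::fabs(a - b) <= limit;
}
```

A purely relative test like this behaves poorly when both values are near zero, so an additional absolute tolerance is often added for that case.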
• July 23rd, 2013, 02:08 AM
superbonzo
Re: A question regarding float and double
Quote:

Originally Posted by 2kaud
In the project property Floating-point consistency is set to Default Consistency. However, if I change this value to Improve Consistency then both versions of the program give the same result????

Floating point calculations are not guaranteed to be reproducible from one build to the next: compilers may use higher-precision registers to store intermediate results, may rearrange expressions into algebraically equivalent but faster versions (x/y as x*(1/y), for example), etc. So my guess is that the cout expression simply triggered a different scenario and the compiler reacted accordingly. Some good samaritan could peek at the asm to confirm this ... :)
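Short of reading the asm, the standard macro `FLT_EVAL_METHOD` from `<cfloat>` gives a hint about whether intermediates are kept at higher precision. The sketch below only reports what the implementation declares (the function name `eval_method_description` is hypothetical), and whether the value tracks a particular compiler's floating-point options is not something this thread confirms.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describes the evaluation method the implementation
// reports through the standard FLT_EVAL_METHOD macro.
std::string eval_method_description() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "operations are evaluated in the type of the operands";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminate or implementation-defined";
    }
}
```

Any result other than the first case means a float expression can carry extra precision until it is stored, which is one way an unrelated statement in the loop can change the outcome.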
• July 23rd, 2013, 07:40 AM
OReubens
Re: A question regarding float and double
Quote:

Originally Posted by razzle
Lets assume the fractional part of a floating point number is interpreted as an integer with a finite number of decimal digits.

And this is already where it fails, because this is NOT how a floating point number works.

Even with your "close together" test this isn't going to be reliable, and that is especially true for a float, because you're going to exceed the error tolerance at the end. Your sample essentially assumes a float has 7 decimal positions of accuracy MORE than whatever result you're getting, and this isn't the case: the absolute accuracy is about 7 decimal positions in total (not even counting the error accumulated in each intermediate).
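The digit budget can be read from `std::numeric_limits` instead of being assumed. In the sketch below, `float_digits` and `trustworthy_fraction_digits` are hypothetical names; `digits10` is the guaranteed figure, which for the usual IEEE 754 single-precision float is 6, slightly below the "about 7" quoted above.

```cpp
#include <limits>

// Significant decimal digits a float is guaranteed to preserve
// (6 for the common IEEE 754 single-precision format).
constexpr int float_digits = std::numeric_limits<float>::digits10;

// Hypothetical helper: given how many digits sit before the decimal point,
// returns how many digits after it can still be trusted in a float.
constexpr int trustworthy_fraction_digits(int integer_digits) {
    return integer_digits < float_digits ? float_digits - integer_digits : 0;
}
```

For a value like 23.4535 (two digits before the point) that leaves roughly four trustworthy digits after it, so a relative threshold of 0.000001 asks for more accuracy than a float holds.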