
# Thread: A question regarding float and double

1. Senior Member
Join Date
Jul 2005
Posts
1,030

## A question regarding float and double

Suppose we want to find the position of the decimal point in a float or a double. Here is what I do:
Code:
```
float d;  // the value to examine, e.g. 3.1415
int count = 0;

while (d != (int)d)
{
    d *= 10;
    count++;
}
```
Here count is the position of the decimal point. Suppose d = 3.1415: at the end of the code, count = 7 and d = 31415002. This is not what I want; I expected d = 31415 and count = 4. It is even worse when d is a double: I encounter an infinite loop. Why? And what is a reliable way to do what I expect? Thanks.

2. Elite Member Power Poster
Join Date
Nov 2003
Location
Florida
Posts
12,518

## Re: A question regarding float and double

http://lmgtfy.com/?q=floating+point+precision

3. Elite Member Power Poster
Join Date
Apr 1999
Posts
27,449

## Re: A question regarding float and double

"Suppose d = 3.1415,"
It isn't.
"I encounter an infinite loop. Why?"
3.1415 cannot be written as a finite sum of inverse powers of 2 (its fractional part, 0.1415, has no such expansion), so it has no exact binary representation and is stored as an approximation. Your code then multiplies this inexact binary floating point number by 10, and every multiplication can round again.
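
To make the approximation visible, here is a minimal sketch (not from the thread) that prints more digits of the stored value than default formatting shows. The helper name `stored_value_text` is hypothetical, and the sketch assumes an IEEE 754 single-precision `float` and a C library that prints all requested digits.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format a float with 20 digits after the decimal point
// so the approximation that is actually stored becomes visible.
std::string stored_value_text(float value)
{
    char buffer[64];
    // The float is promoted to double when passed to snprintf. That
    // conversion is exact, so the digits describe the float itself.
    std::snprintf(buffer, sizeof buffer, "%.20f", value);
    return buffer;
}
```

Under those assumptions, `stored_value_text(3.1415f)` begins with `3.14149999` rather than `3.14150000`, so the loop is testing a value that was never exactly 3.1415. A value such as 0.5f, which is a power of 2, comes back with nothing but zeros after the 5.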

The bottom line is not to use floating point variables to control how many times a loop will execute. Always use integers. Inside the loop you can do floating point calculations, but never use floating point calculations to determine loop constraints.
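
As a rough illustration of that advice (a sketch, not code from the thread), the hypothetical helper `sample_points` below lets an integer decide how many times the loop runs and uses floating point only for the values computed inside it.

```cpp
#include <vector>

// Hypothetical helper: produce `count` sample points start, start + step, ...
// The loop is controlled by an integer, so it runs exactly `count` times.
std::vector<double> sample_points(double start, double step, int count)
{
    std::vector<double> points;
    points.reserve(count > 0 ? count : 0);
    for (int i = 0; i < count; ++i) {
        // Compute each value from i instead of accumulating, which also
        // avoids piling up rounding error from repeated additions.
        points.push_back(start + i * step);
    }
    return points;
}
```

By contrast, a loop written as `for (double x = 0.0; x != 1.0; x += 0.1)` depends on `x` landing exactly on 1.0, which repeated addition of an inexact 0.1 does not guarantee.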

http://www.parashift.com/c++-faq/floating-pt-errs.html
http://www.parashift.com/c++-faq/flo...int-arith.html
http://www.parashift.com/c++-faq/flo...nt-arith2.html

Regards,

Paul McKenzie

4. ## Re: A question regarding float and double

You might like to experiment with this

Code:
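```
#include <stdio.h>
#include <string.h>

int main()
{
    const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};
    const int noval = sizeof(f) / sizeof(f[0]);

    char s[100];

    for (int i = 0; i < noval; i++) {
        // "%f" prints six digits after the decimal point.
        snprintf(s, sizeof s, "%f", f[i]);

        // Strip trailing zeros, leaving end on the last remaining character.
        char *end = s + strlen(s) - 1;
        while (end != s && *end == '0')
            *end-- = '\0';

        // Count the digits left after the point. The pointer difference is a
        // ptrdiff_t, so cast it to match the %i conversion.
        const char *pt = strchr(s, '.');
        printf("%-12s  %i\n", s, pt ? (int)(end - pt) : 0);
    }

    return 0;
}
```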
This produces the following output

Code:
```
3.141596    6
12.34       2
1.11        2
2.234       3
2.          0
45.66       2
45.669998   6
23.4534     4
23.453501   6
```

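Note the results for the last four values, which were set to 45.66, 45.67, 23.4534 and 23.4535: 45.67f and 23.4535f print as 45.669998 and 23.453501, so their counts come out as 6 rather than 2 and 4.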

5. Senior Member
Join Date
Jul 2005
Posts
1,030

## Re: A question regarding float and double

Thanks so much for responding. But I am just wondering if there is any reliable way to determine the position of the decimal point in a float or a double without converting it to a string?

6. Elite Member Power Poster
Join Date
Apr 1999
Posts
27,449

## Re: A question regarding float and double

Originally Posted by LarryChen
Thanks so much for responding. But I am just wondering if there is any reliable way to
Look at the links I gave you. Now given that, do you think there is any way to do "exact" things with floating point numbers? Even if you claim you know a way, there will be that stray number that thwarts all of your work. So it isn't worth trying to fight with doubles to get them to do things they are not designed for.

Floating point numbers were never designed for what you're trying to do. It is accepted by persons using double that the numbers represented and calculations made will be approximations, given that the approximate values are within a certain tolerance. You see it for yourself -- you claimed that
Code:
`double d = 3.1415;`
is what you wrote, but this value cannot be represented as an exact binary floating point value. So already you're in trouble.

Either:

1) use strings (by converting the double to a string and working with the converted string), or
2) use fixed point numbers, or
3) use a library geared toward financial calculations, where values are stored exactly in decimal rather than approximated in binary.
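
As a rough sketch of options 1 and 2 combined (not code from the thread), the value can be kept as the decimal text it was written in and parsed into a scaled integer, which yields exactly the pair asked for in the first post. The `Decimal` struct and `parse_decimal` function are hypothetical names. The sketch does only minimal validation and ignores overflow of the 64-bit integer.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical fixed-point value: the number equals units / 10^scale.
struct Decimal {
    std::int64_t units;  // all digits with the decimal point removed
    int scale;           // how many of those digits followed the point
};

// Parse decimal text such as "3.1415" without ever going through a float.
Decimal parse_decimal(const std::string& text)
{
    Decimal result{0, 0};
    bool negative = false;
    bool seen_point = false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (i == 0 && (c == '-' || c == '+')) {
            negative = (c == '-');
            continue;
        }
        if (c == '.' && !seen_point) {
            seen_point = true;
            continue;
        }
        if (c < '0' || c > '9')
            throw std::invalid_argument("not a decimal number: " + text);
        result.units = result.units * 10 + (c - '0');
        if (seen_point)
            ++result.scale;  // one more digit after the point
    }
    if (negative)
        result.units = -result.units;
    return result;
}
```

With this, `parse_decimal("3.1415")` gives `units == 31415` and `scale == 4`. Trailing zeros in the text are kept, so `"12.340"` has a scale of 3. Whether that is desirable depends on what the count is for.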

Regards,

Paul McKenzie
Last edited by Paul McKenzie; July 16th, 2013 at 03:14 PM.

7. ## Re: A question regarding float and double

Originally Posted by LarryChen
Thanks so much for responding. But I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?
There is no reliable way even by converting as I stated in my post #4 and Paul has stated in his posts. Why do you want to know? What exactly are you trying to achieve?

8. Member +
Join Date
Jul 2013
Posts
576

## Re: A question regarding float and double

Originally Posted by LarryChen
But I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?
Let's assume the fractional part of a floating point number is interpreted as an integer with a finite number of decimal digits.

Then the algorithm you suggest doesn't work because you compare for equality. As has been said, due to the nature of floating point numbers that's not a good idea. You need to modify the algorithm to check whether d and int(d) are very close rather than equal, like this for example:

Code:
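```
float d = 3.1415f;  // assumed to be positive
int count = 0;
// Loop until d is within a relative tolerance of its truncated value.
// std::fabs (from <cmath>) avoids accidentally calling the integer abs().
while (std::fabs(d - int(d)) / d > 0.000001)
{
    d *= 10;
    count++;
}
std::cout << d << "/" << count << std::endl;
```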
Last edited by razzle; July 22nd, 2013 at 10:02 AM.

9. ## Re: A question regarding float and double

Please can some kind guru explain what is happening here as I don't understand.

This code

Code:
```
#include <iostream>
#include <iomanip>
#include <math.h>

int main()
{
    const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};

    const int noval = sizeof(f)/sizeof(f[0]);

    for (int fi = 0; fi < noval; fi++) {
        float d = f[fi];
        int count = 0;
        while ((abs(d - int(d))) / d > 0.000001f) {
            d *= 10.0f;
            count++;
            //std::cout << "";

        }
        std::cout << std::setw(10) << f[fi] << "  " << count << std::endl;
    }
    return 0;
}
```
gives as output
3.1416 6
12.34 2
1.11 2
2.234 6
2 0
45.66 5
45.67 5
23.4534 5
23.4535 4

whereas this code

Code:
```
#include <iostream>
#include <iomanip>
#include <math.h>

int main()
{
    const float f[] = {3.141596f, 12.34f, 1.11f, 2.234f, 2.0f, 45.66f, 45.67f, 23.4534f, 23.4535f};

    const int noval = sizeof(f)/sizeof(f[0]);

    for (int fi = 0; fi < noval; fi++) {
        float d = f[fi];
        int count = 0;
        while ((abs(d - int(d))) / d > 0.000001f) {
            d *= 10.0f;
            count++;
            std::cout << "";

        }
        std::cout << std::setw(10) << f[fi] << "  " << count << std::endl;
    }
    return 0;
}
```
gives

3.1416 6
12.34 2
1.11 2
2.234 3
2 0
45.66 2
45.67 2
23.4534 5
23.4535 4

The only change is the `std::cout << "";` line inside the loop (commented out in the first version), which should have no effect on the calculations.

In the project properties, Floating-point consistency is set to Default Consistency. However, if I change this value to Improve Consistency, then both versions of the program give the same result. Why?

10. Elite Member Power Poster
Join Date
Apr 2000
Location
Belgium (Europe)
Posts
4,626

## Re: A question regarding float and double

Originally Posted by razzle
Lets assume the fractional part of a floating point number is interpreted as an integer with a finite number of decimal digits.
And this is already where it fails, because this is NOT how a floating point number works.

Even with your "close together" test this isn't going to be reliable, and that is especially true for a float, because you're going to be exceeding the error tolerance at the end. Your sample essentially assumes a float has 7 decimal positions of accuracy MORE than whatever result you're getting, and this isn't the case: a float carries only about 7 significant decimal digits in total (not even counting the error accumulated in each intermediate result).
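
The precision limit can be checked directly. The following is a small sketch (not from the thread), assuming IEEE 754 single precision; `kFloatDigits` and `survives_float` are hypothetical names.

```cpp
#include <limits>

// Decimal digits a float is guaranteed to preserve
// (6 for IEEE 754 single precision).
constexpr int kFloatDigits = std::numeric_limits<float>::digits10;

// Hypothetical helper: true if `value` can be stored in a float
// without any rounding.
bool survives_float(double value)
{
    return static_cast<double>(static_cast<float>(value)) == value;
}
```

Under that assumption, `survives_float(16777216.0)` is true but `survives_float(16777217.0)` is false, because 2^24 + 1 needs 25 significant bits and a float has 24. Once the loop has multiplied `d` up into that range, there are no bits left to hold a fractional part, let alone to compare it against a tolerance.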

11. Elite Member Power Poster
Join Date
Aug 2000
Location
New York, NY, USA
Posts
5,656

## Re: A question regarding float and double

Originally Posted by LarryChen
... I am just wondering if there is any reliable way to decide the position of point in a float or a double without converting it to a string?
Of course there is: YOU decide how many digits you want to show after the decimal point!
Consider the 1/3 example mentioned in this thread. Would you prefer 0.33 or 0.3333333333?
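
A minimal sketch of that idea, using standard stream formatting (the helper name `format_fixed` is hypothetical): the caller states how many digits to show instead of trying to recover them from the stored value.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format `value` with exactly `places` digits
// after the decimal point, rounding as the stream library does.
std::string format_fixed(double value, int places)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(places) << value;
    return out.str();
}
```

Here `format_fixed(1.0 / 3.0, 2)` gives `0.33` and `format_fixed(1.0 / 3.0, 10)` gives `0.3333333333`. Both come from the same stored double.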

12. ## Re: A question regarding float and double

The number of 'digits' to the right of the '.' in a stored floating-point variable is effectively unknown, because the internal representation is binary and not every decimal fraction can be represented exactly in binary (in the same way that 1/3 cannot be represented exactly in decimal). Hence any attempt to determine this number is not guaranteed to produce the value expected. One possible way of doing it would be to convert the floating-point value to a string and find the position of the '.'.
Last edited by 2kaud; July 16th, 2013 at 08:06 AM.

13. Elite Member Power Poster
Join Date
Apr 2000
Location
Belgium (Europe)
Posts
4,626

## Re: A question regarding float and double

What you're trying won't work for very big numbers, because a float or a double can hold values that far exceed what you can store in an int (converting a number of 2^31 or more to a 32-bit int overflows and can cause an infinite loop).
For a similar reason, it won't work for very small numbers such as 0.000000001, not because of overflow, but because of the errors that accumulate each time you multiply by 10.
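
The overflow half of this can be sidestepped by never converting to `int` at all. The sketch below (not from the thread, with the hypothetical name `is_whole`) tests for a whole number entirely in floating point. It does nothing about the accumulated rounding error described above.

```cpp
#include <cmath>

// Hypothetical helper: true if `value` has no fractional part.
// std::trunc stays in floating point, so large values are handled
// without the undefined behavior of an out-of-range cast to int.
bool is_whole(double value)
{
    return std::isfinite(value) && std::trunc(value) == value;
}
```

For example, `is_whole(1e20)` is true even though 1e20 is far outside the range of a 32-bit `int`, while `is_whole(3.1415)` is false.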

If you can accept that it only needs to function for a limited and predetermined range of float/doubles, then you may be able to get some code working.
So if, for example, you can accept restrictions like "it only needs to work for any number between 0 and 999 with at most 4 digits after the decimal point", then you can adjust your code to work within those bounds. But without knowing the acceptable restrictions, it's impossible to even say whether your question can be answered, regardless of whether you calculate it "the hard way" or convert it to a string and count characters. The post by 2Kaud (#5) is subject to these as well. Or put another way: if the detour through a string solves your problem, then it can equally be solved by a purely mathematical method.
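
One way such a bounded version might look is sketched below (not code from the thread). It assumes the example restriction above: a non-negative value under 1000 with at most `max_places` meaningful decimals. The function name `decimal_places` and its default argument are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: count decimal places, assuming the value has at most
// `max_places` meaningful digits after the decimal point.
int decimal_places(double value, int max_places = 4)
{
    assert(max_places >= 0 && max_places <= 9);

    // Build 10^max_places by repeated multiplication; exact for small powers.
    double scale = 1.0;
    for (int i = 0; i < max_places; ++i)
        scale *= 10.0;

    // Round to the nearest integer once, which absorbs the representation
    // error, then do the rest in exact integer arithmetic.
    long long scaled = std::llround(value * scale);

    // Each trailing zero means one fewer significant decimal place.
    int places = max_places;
    while (places > 0 && scaled % 10 == 0) {
        scaled /= 10;
        --places;
    }
    return places;
}
```

Within those bounds, `decimal_places(3.1415)` returns 4 and `decimal_places(12.34)` returns 2. It also returns 2 for `45.67f`, the value that came out as 45.669998 in the string experiment. Outside the stated bounds the answer is simply whatever the rounding produces, which is the point being made about needing the restrictions first.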

Regardless of your soft restrictions, your hard restrictions are going to be that you can't exceed the limitations of the double type, nor can you get any reliable answer if the error tolerance on any intermediate value would exceed your maximum accrued error. This is why float is unsuitable for many things: it simply doesn't have enough precision for anything that requires even the slightest amount of accuracy.

But you haven't answered the real question... What are you actually trying to do?
