  1. #1
    Join Date
    Jun 2001
    Location
    Orlando, FL
    Posts
    232

    Floating point math question

    The minimum (smallest normalized positive) and maximum values for IEEE 754 single precision are 1.1754E-38 and 3.4028E+38, respectively.

    This leads one to believe that as long as I stay within the constraints specified, the program should, for the most part, produce the correct result.

    In the program below, (2.0e20 + 1) - 2.0e20 does not produce the correct result (i.e. 1.00000), but (2.0e6 + 1) - 2.0e6 does. I did some reading and realized that in order to add two floats the exponents must be the same, and that the values go through a normalization process such that, if the difference between the exponents is greater than the number of digits of precision, the smaller number will drop to 0 by the time the exponents are the same. The question then becomes: how do I avoid situations in my program where the difference between the exponents is greater than the number of digits of precision?
    I'm using a 32-bit fixed-point processor that has libraries for doing floating-point math. For benchmarking purposes, I suspect the largest floating-point values I could multiply and still get the correct result are 3.4028E+38 * 3.4028E+38?

    Thanks for the assistance




    Code:
    #include <stdio.h>
    #include <math.h>
    
    
    int main(int argc, char* argv[])
    {
    
    	float a, b, rel_diff;
    
    //	b = 2.0e7 + 1;      // doesn't work
    //	a = b - 2.0e7;
    
    //	b = 2.0e8 + 1.0;    // doesn't work
    //	a = b - 2.0e8;	
    
    	b = 2.0e20 + 1;     // doesn't work
    	a = b - 2.0e20;	
    
    
    //	b = fabs(1.0e20) + 1.0;
    //	a = fabs(b) - fabs(1.0e20);
    
    //	b = 2.0e6 + 1.0e6;   // works
    //	a = b - 2.0e6;	
    
    
    //	rel_diff = (b - a)/ (a + b);  // or MAX(a,b)
    
    	printf("%f \n" , a);
    
    	return 0;
    }

  2. #2
    Join Date
    Jun 2002
    Location
    Germany
    Posts
    1,557

    Check exponent ranges

    mop,

    Here is a simple routine written in plain C for checking whether the exponents of two floats are within range of each other. I wrote it quickly, so you may have to check it for simple errors if you decide to use it. The code sample might give you some ideas on how to address this type of problem.

    Sincerely, Chris.

    Code:
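    #include <stdio.h>
    #include <float.h>    /* FLT_DIG lives here, not in <limits.h> */
    #include <stdlib.h>
    #include <string.h>
    
    #ifndef FLT_DIG
      #define FLT_DIG 6
    #endif
    
    /* Returns nonzero if the decimal exponents of *pu and *pv differ by no
       more than FLT_DIG, i.e. the smaller value should still register when
       the two are added. Assumes both inputs are finite. */
    int check_exp_range(const float* pu, const float* pv)
    {
      char cu[20];
      char cv[20];
      int exp_u, exp_v;
    
      sprintf(cu, "%.1e", *pu);
      sprintf(cv, "%.1e", *pv);
    
      /* Parse the exponent after the 'e' so that a leading minus sign on
         the mantissa does not shift the offset. */
      exp_u = atoi(strchr(cu, 'e') + 1);
      exp_v = atoi(strchr(cv, 'e') + 1);
    
      /* The difference may have either sign, so compare its magnitude. */
      return abs(exp_u - exp_v) <= FLT_DIG;
    }
    
    int main(void)
    {
      float f1, f2;
    
      f1 = 1.0f;
      f2 = 1.0e-5f;
      printf("%d\n", check_exp_range(&f1, &f2));   /* exponents 0 and -5  */
    
      f1 = 1.0f;
      f2 = 1.0e-10f;
      printf("%d\n", check_exp_range(&f1, &f2));   /* exponents 0 and -10 */
    
      return 0;
    }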

  3. #3
    Join Date
    Sep 2002
    Posts
    1,747
    The standard library's std::numeric_limits exposes epsilon (it is a static member function, std::numeric_limits<float>::epsilon()). If you multiply the larger value by epsilon and find that the result is less than the smaller value, you can add and subtract the two values to the precision of the larger value. Otherwise, the addition or subtraction will go largely unnoticed.

  4. #4
    Join Date
    Apr 1999
    Location
    Altrincham, England
    Posts
    4,470
    mop: you should also check out what the IEEE standard says about the number of significant digits. Remember that 32 bits can only represent approximately 4x10^9 distinct values, from which it's obvious that it cannot represent every floating point number within the range specified (that would be impossible anyway, since there are at least aleph-1 real numbers). There are huge gaps in the sequence. For low magnitudes, there is pretty good coverage of the integral numbers, but as the magnitude starts to exceed the number of significant digits, the representable values get sparser and sparser. I think the significant-digits figure for single precision is 7 or so. This means that you can just about distinguish r from r + 1 when r is around 10^6, but at 10^20 there just isn't the resolution to do it. You probably couldn't distinguish between r and r + 10^6 at that sort of magnitude.


  5. #5
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,725
    Maybe this will give you some idea of what is going on:
    Code:
    #include <iostream>
    #include <iomanip>
    
    using namespace std;
    
    void test1()
    {
        float a,b;
    
        b = 2.0e+20;
        a = b - 2.0e+20;
    
        cout << setprecision(25) << b << endl;
        cout << setprecision(25) << a << endl;
        cout << endl;
    }
    
    void test2()
    {
        float a,b;
    
        b = 2.0e+20f;
        a = b - 2.0e+20;
    
        cout << setprecision(25) << b << endl;
        cout << setprecision(25) << a << endl;
        cout << endl;
    }
    
    void test3()
    {
        float a,b;
    
        b = 2.0e+20f;
        a = b - 2.0e+20f;
    
        cout << setprecision(25) << b << endl;
        cout << setprecision(25) << a << endl;
        cout << endl;
    }
    
    void test4()
    {
        float a,b;
    
        b = 2.0e+20;
        a = b - 2.0e+20f;
    
        cout << setprecision(25) << b << endl;
        cout << setprecision(25) << a << endl;
        cout << endl;
    }
    
    
    void test5()
    {
        double a,b;
    
        b = 2.0e+20;
        a = b - 2.0e+20;
    
        cout << setprecision(25) << b << endl;
        cout << setprecision(25) << a << endl;
        cout << endl;
    }
    
    
    int main(int argc, char* argv[])
    {
        test1();
        test2();
        test3();
        test4();
        test5();
    
        return 0;
    }

  6. #6
    Join Date
    Aug 2002
    Posts
    78

    Analog

    A simple analog in decimal is that the floating point number system only maintains the X most significant digits plus an exponent.

    As long as the number being added or subtracted falls within those significant digits, the arithmetic will "take". Otherwise it won't.

    To see why, consider: If the floating representation can handle a "ones" digit plus 2 decimal digits (X = 3), then:


    67,284 turns into 6.72e4.

    Note the "84" gets lost since the internal representation can only handle the "ones" plus 2 decimal digits.

    If you try to add 1 (or even 99) to it, well, 1 in e4 representation is 0.00e4, so you actually add (or subtract) 0, and thus the number doesn't change. The exponents do NOT have to be the same logically (though they may, perhaps, in the internal implementation of the chip). You're just trying to add or subtract something that is too small, like adding a bacterium to a whale on a scale that measures only to the nearest 100 pounds.

    In your example, 1 is too small to be seen by an e20 number (whale + bacterium on an industrial scale), while 1 is big enough to be seen by an e6 number (dust mite + bacterium on a sensitive scale).
    Last edited by Gorgor; March 26th, 2003 at 02:49 PM.
