Hi,
It is reasonable to use floating point number as a key in std::map. Would it fail while trying to lookup for a keymap?
Thanks
SB
Printable View
Hi,
It is reasonable to use floating point number as a key in std::map. Would it fail while trying to lookup for a keymap?
Thanks
SB
Depends how you want to use it. In any case, if you are afraid of rouding errors, you could write a comparison predicate that uses a certain precision to compare two floating point numbers. Then you can pass that predicate to the map.
I never consider that possibility, but doesn't sound like a good idea to me.
I'm a bit confused with the working of map versus hash_map. For the hash_map the "Equal" is used to find the item, whereas the map seems to take "LessThan" as a template argument. Is that sufficient for finding an item?
Thanks
SB
I need to use it for find. In fact, I am debugging someone's code and find this usage, and am a bit skeptical of it. So, I wrote a small test application, which seems to work quite fine. I don't see any issues. But is there anything that I am missing and can be disasterous?
Code:using namespace std;
map<double,int> m;
vector<double> v;
int i;
for( i = 0; i < 10; i++ ) {
double d = (double) rand() / RAND_MAX * M_PI;
d = exp( d );
v.push_back(d);
cout << (v[i]-d) << endl;
m.insert(make_pair(d,i));
}
for( i = 0; i < v.size(); i++ ) {
//cout << m[v[i]] << endl;
map<double,int>::iterator it = m.find(v[i]);
cout << (v[i]-it->first) << endl;
}
for( map<double,int>::iterator it = m.begin(); it != m.end(); ++it ) {
//cout << (it->second-it->second) << endl;
}
I think you would run into problems with that, as your predicate would not be transitive. For example, according to your predicate, 0.3 is equivalent to 0.4 and 0.4 is equivalent to 0.5, but 0.3 would not be equivalent to 0.5. While I don't think that is too much of a problem with a standard map, it might wreak havoc on a multimap.
I think it would be more advisable to use a map with a classic predicate, and implement a "lax_find" and "lax_insert"method that searches for your keys using lower_bound. At least, you know exactly what you are doing, and not subverting the underlying container.
PS. I could not find the exact requirements on map predicates.
I think I don't quite follow this. First, for std::map, I need to define LessThan() not EqualTo(). Correct? (this behavior is opposite to the hash_map, which I normally use) How this translates to finding the item is beyond me. My understanding is that for each key, an "internal key" is generated, which is then used as the key in the hashtable. For hash_map, one can provide a hashing function, which generates such an "internal key" that is as unique as possible. But even is two numbers turn out to have similar "internal keys" there is a way to resolve the issue. So in the end, is this LessThan required to resolve the ambiguity of similar keys? But how this can avoid floating point errors?
Now here for std::map, suppose I have a key "K" which is a float. I add an item, (K,I) in the map. Why it won't be able to find this item later when I search? Is it possible to have
(K - K) not equal to 0?
Thanks
SB
Good point. You'd need to set the epsilon value such that this situation won't occur.
However, I don't think that using the default predicate is possible in general, since IIRC it is not guaranteed that a comparison using floats will yield the same result every time. If the value is moved to a register or memory, it can be truncated/rounded, yielding a different result of a comparison.
Note that the problems comparing floating point values hold in general. Your specific compiler may have an option to avoid at least some of these errors. E.g. in VS you have a compiler option that will avoid the problems with floating point values that linger in the processor. This would make it possible to use a std::map<float, int> as long as you don't rely on the result of a computation being equal to the result of a different computation.
What I mean is that this can be guaranteed using this compiler option:
whereas it cannot be guaranteed in general.Code:float x = 0.1 + 0.1;
assert(x == 0.1 + 0.1); // works in VS with /fp:precise
However, this still isn't guaranteed
Code:float x = 0.1 + 0.1;
assert(x == 0.2); // might fail
But if I don't add or subtract anything, it should work? For example,
1. an array initialized with all possible "floating point key values".
2. map initialized with (key,item) where keys are used from the array in (1).
3. use elements of array in (1) to find different keyitems in the map
Please correct me if I'm wrong?
The bottom line is that floating point arithmetic isn't exact. Worst yet, not only is it not exact, but it goes against common assumptions:
[29.16] Why is floating point so inaccurate? Why doesn't this print 0.43?
[29.17] Why doesn't my floating-point comparison work?
[29.18] Why is cos(x) != cos(y) even though x == y? (Or sine or tangent or log or just about any other floating point computation)
Once you read this, you should understand all the pitfalls with floating point calculations, and why it is so difficult to rely on them.
So to answer your question to the best i can: Maybe it will work, maybe it won't. I'm sorry.
This is why I recommend you try an approach where you can guarantee your searches. For example, a ranged search that can give all the keys between [target-epsilon, target+epsilon]. This way, if your key is almost equal, but not exactly equal, you will still find it. With this approach, at least, you don't have to worry about all that dangerous floating point crap*.
*Almost.
How many compilers did you test this on? How many versions of the compiler you're using did you test this on? What compiler options did you test this with? What values did you use?
The bottom line, as monarch_dodra mentioned, is that your program is basically ill-formed in terms of consistency. You are using a double in a "raw" fashion to do lookups into a map. Since the ordering of the map depends on the ordering of the keys, the values of those keys, if they're computed can be different if you changed compilers, versions, CRT, settings, etc. So your program will have different behaviour using the same code and the same input data.
Even if you run this on all the compilers in the shop, you can't guarantee it works 100% of the time consistently, with the reason being using the double as the key in the map.
This is even a bigger issue than the map.Quote:
Why it won't be able to find this item later when I search? Is it possible to have
(K - K) not equal to 0?
What if the second K in that equation is computed, and the hand calculation shows what you should have is K - K, but the computer calcuation shows that the first K is not exactly equal to the second K, maybe off by 0.00000001? Your lookup fails even though algebraically it should have worked.
Floating point math isn't the same as algebra or school-book math. You can never assume that what you do by hand using decimal point values works with exact precision with binary floating point. Even the most simplest things aren't what they seem:
How many times is this loop executed? The round-off error by adding 0.001 each time could throw the number of times that loop executes off by 1.Code:for (double i = 0.0; i <= 1.0; i += 0.001)
{
}
Regards,
Paul McKenzie
The LessThan is only used to *order* the elements in a map because the std::map is an ordered container. It's not used to *find* an element which might cause problems with floating point values.
C++0x includes new containers called *unordered associative* containers and these don't order their elements.
If you are talking about the "LessThan" comparison with respect to the current standard associative containers, then that is not true: equivalent keys, which is what the find() member function looks for, is defined in terms of the "LessThan" comparison.Quote:
Originally Posted by Marc G
[ merged threads ]
Don't start multiple threads on the same issue.
QUESTION 1: I think I am missing something here. I am not sure how "map" will use "LessThan" for "find". Can you please explain, or point to a link?
QUESTION 2: How to define a "LessThan" with an epsilon?
QUESTION 3: I know the general issues with "floats" but in my case if all keys are stored in a "vector", and then "map" is created from these keys, and finally the "vector" is used to search for items in the "map", why it will fail. The point is that no matter what the compiler or the compiler settings are, once a float e.g. 0.43 is stored in an array (e.g. as 0.430000001), it will remain 0.430000001. So instead of 0.43 as being a key, the key would be "0.430000001". Then the question is why it would be problematic, more problematic using an epsilon tolerance?
Two items are considered equal if a == b is true. This is the test used by std::find().
Two items are considered equivalent if !(a < b) && !(b < a). This is the test used by std::map::find().
I don't think it's possible, for the above-mentioned reasons.Quote:
QUESTION 2: How to define a "LessThan" with an epsilon?
In that case it should be fine.....but it creates a "delicate" situation in which a later change by another coder who doesn't understand this subtlety could bring the whole thing crashing down. That's the sort of situation it's better to avoid if you can.Quote:
QUESTION 3: I know the general issues with "floats" but in my case if all keys are stored in a "vector", and then "map" is created from these keys, and finally the "vector" is used to search for items in the "map", why it will fail. The point is that no matter what the compiler or the compiler settings are, once a float e.g. 0.43 is stored in an array (e.g. as 0.430000001), it will remain 0.430000001. So instead of 0.43 as being a key, the key would be "0.430000001". Then the question is why it would be problematic, more problematic using an epsilon tolerance?
Thanks Lindley for the clarification. In this case, I'm "another code" trying to debug some of the inherited code, without crashing it down.
The situation is that the keys instead of floats is the coordinates of a point in 3D, i.e., the following structure:
Given the delicate balance described above, this should work as a "consistent" LessThan function.Code:class CPoint3D
{
public:
float x, y, z;
public:
static bool operator< (const CPoint3D& a, const CPoint3D& b)
{
if ( a.x < b.x )
return true;
else if ( a.x > b.x )
return false;
else if ( a.y < b.y )
return true;
else if ( a.y > b.y )
return false;
else if ( a.z < b.z )
return true;
else if ( a.z > b.z )
return true;
else
return false;
}
};
I think one of those return statements might be wrong.
How do you access the keys in the vector? What operations do you do to say "get me my value in the vector" and I will place it in the map"? Do not assume that this is even safe to do with floating point values.
To access the float, possibly a floating point CPU instruction is done which could cause a slight modification of the number, and that thing you never expected to happen just happens. Floating point trouble not only happens during calculation -- passing them by value, assignment, etc. could lead to these issues.
Regards,
Paul McKenzie
Thanks Lindley for pointing out a problem.
Paul, you are suggesting a serious flaw in this implementation. I thought that when a value is fetched into an internal register, it is just mirrored (a binary mirror) so that it won't change. Another question is that would this "problem" be more pronounced if it is compiled for 32-bit machines, but run on 64-bit machine. I am not sure, I know how it will handle the internal registers in such case. Or the behavior will be similar on either machines.
Thanks,
SB
What I was thinking was something like this:
output:Code:#include <string>
#include <map>
#include <iostream>
static const double epsilon = 0.05;
std::map<double, std::string>::const_iterator find_approximatly(const std::map<double, std::string>& i_map, double i_key)
{
std::map<double, std::string>::const_iterator it_low = i_map.lower_bound(i_key-epsilon);
if(it_low == i_map.end())
{return i_map.end();}
const double found_value = it_low->first;
if (found_value > i_key+epsilon)
{return i_map.end();}
return it_low;
}
int main()
{
std::map<double, std::string> a_map;
a_map[0.1] = "0.1";
//a_map[0.1] = "0.1";
a_map[0.49] = "0.49";
{
double first_key = 0.1;
std::cout << "searching: " << first_key;
std::map<double, std::string>::const_iterator first_found = find_approximatly(a_map, first_key);
if (first_found != a_map.end())
{std::cout << " found at: " << first_found->second << " actual double value: " << first_found->first << std::endl;}
else
{std::cout << " not found " << std::endl;}
}
{
double second_key = 0.3;
std::cout << "searching: " << second_key;
std::map<double, std::string>::const_iterator second_found = find_approximatly(a_map, second_key);
if (second_found != a_map.end())
{std::cout << " found at: " << second_found->second << " actual double value: " << second_found->first << std::endl;}
else
{std::cout << " not found " << std::endl;}
}
{
double third_key = 0.5;
std::cout << "searching: " << third_key;
std::map<double, std::string>::const_iterator third_found = find_approximatly(a_map, third_key);
if (third_found != a_map.end())
{std::cout << " found at: " << third_found->second << " actual double value: " << third_found->first << std::endl;}
else
{std::cout << " not found " << std::endl;}
}
}
Code:searching: 0.1 found at: 0.1 actual double value: 0.1
searching: 0.3 not found
searching: 0.5 found at: 0.49 actual double value: 0.49
But wouldn't a < with an epsilon tolerance be rather some sort of <=, simply violating the strict weak ordering rule? :ehh:
Just my $.02...
Well, the point would be that the container still uses the strict operator<. However, whenever you as the user are looking for a specific key, you can use approximate_search, to have a broader search. approximate_search still use a strict ordering.
The basic idea is to make a basic wrapper around your container, but you avoid subverting it by giving an odd "less than or just a little more" operator.
My bottom line is that writing such code has inherent faults. If it's not calculation, it will be something else unexpected.
One question -- how are you going to debug such a program if a customer (if you have customers) reports a problem (wrong values being used, for example)? I would take a big gulp and prepare for sleepless nights.
Regards,
Paul McKenzie
Usually, when dealing with coordinates like this, you compare the length of the vector.
http://www.math.montana.edu/frankw/c...tude/refer.htm
Both have their advantages and disadvantages.
The advantage of using coordinates is that it results in a unique ordering. The disadvantage to it is that the order of your nodes can change drastically if one of them is moved even just a little bit.
The advantage of using vector length is that it uses an actual mathematical/physical value, which is the Euclidean Norm (although almost any 3D norm will do). An advantage to doing this is that for starters, it makes sense by looking at a graph which nodes are order before what other nodes, and the ordering is more robust to tiny changes in point positions.
I strongly want support what Paul has said. Floating point numbers are not suitable for being used as keys. The 'same' mathematical value can be represented with different bits and cannot be found anymore in the map, e. g. value 12 can be represented as 12.000000000... or as 11.9999999999... (even in math) depending on statements like x = 12. or x = sqrt(144); .
For the 3D coordinates you probably have equidistant values (if not, how would you ever expect to find a point again?). Then, you easily can substitute the floats by integers by counting the steps, thus avoiding any of the issues discussed here.
For the less function I wouldn't use a length-based approach. If you ever want to iterate thru all points of that mapping you probably would value the fact when the points were neatly ordered by their coordinate values. Moreover, by using the length you could get unique keys for different points what would spoil your map. And a multimap surely make no sense.
Well, you could always make vector length the "primary" key and resolve equality using theta (polar coordinates).
The bottom line to all of this is by whatever means necessary, by hook or by crook, get out of using floating point values for lookups, counters, etc. and use something consistent that will work (if given the same data) all the time, on every compiler, no matter what compiler options are selected. I had a link to a PDF document, where a complex problem was described that was initially coded using floating point values, and it went through the steps of how they eventually solved the problem using integral values. The solution was not obvious, but it was possible. Wish I had it with me, but unfortunately, I don't have the URL right now.
I'm sure that many code reviewers would reject outright any usage of doubles as keys, unless that double can be normalized (and consistently normalized) to a proper integral key or value. How the normalization is done is what we're being paid for (if we are being paid to do it) -- it's our job to figure these things out. It may be easy or difficult, but it has to be done and honestly, should have been part of the design phase from the start.
I mention the debugging, because I can't imagine myself sitting there with an app that uses floating point this way, and then get a phone call or an email from customer support saying that they have someone who is getting strange and/or inconsistent results. Then what? It's either going to be some sort of patch fix (once you find out what the exact data the person is using, and that may take a while figuring that out), or you have to scrap the whole idea and go back to rethinking how to accomplish what you're trying to do.
Regards,
Paul McKenzie
The Bresenham line algorithm is a good example of replacing floating point calculations with integers.
http://en.wikipedia.org/wiki/Bresenh...line_algorithm