1. should I use hash_map, hash_set, hash_... instead of the normal map, set, multimap, etc. ? Are the hash versions supercede versions of the normal ones? Or there are times when I should use e.g. a map instead of a hash map?
2. are there occasions when I must write a hash function? If yes, how do I do that and how do I make a good (i.e. not a stupid) hash function?
3. I've read about these member functions, but I don't understand when or why I should use them: equal_range, lower_bound, upper_bound. well, about lower & upper bounds, I understand that they usually find something if that particular key does not exist - is there more to them?
4. what are the uses of emplace? for the associative containers, for instance, why shouldn't I simply insert a pair?
5.what function should I use to find an element of an associative container? at, find, lower_bound, or what?? is there some difference in speed among them?
1. should I use hash_map, hash_set, hash_... instead of the normal map, set, multimap, etc. ? Are the hash versions supercede versions of the normal ones? Or there are times when I should use e.g. a map instead of a hash map?
If you don't care whether the map/set elements are sorted, you should consider using unordered_set etc. (The hash_set etc names are nonstandard.) Hash tables have faster average insertions and lookups than red/black trees, but slower worst-case times.
2. are there occasions when I must write a hash function? If yes, how do I do that and how do I make a good (i.e. not a stupid) hash function?
The Boost.Hash library provides some good facilities for creating hash functions, eg hash_combine() etc. I don't know how much of that ended up in the standard library. Certainly, the method of defining a hash function appears to be somewhat different between the std unordered containers and the Boost versions.
3. I've read about these member functions, but I don't understand when or why I should use them: equal_range, lower_bound, upper_bound. well, about lower & upper bounds, I understand that they usually find something if that particular key does not exist - is there more to them?
Those are useful for iterating on ranges of values. For instance, if you have a sorted container and you want to iterate over all values in the half-open interval [5,9), you could do
Code:
for (auto iter = container.lower_bound(5); iter != container.upper_bound(9); ++iter)
{
// stuff
}
Note that lower_bound and upper_bound are different only if the argument passed is in the container. So if "9" isn't in the container, lower_bound(9) and upper_bound(9) will behave the same. The difference is that if it is there, then lower_bound(9) will return an iterator to 9, while upper_bound(9) will return an iterator to the next element *past* 9.
equal_range is useful if the upper and lower bounds are the same, eg, you just want all elements with a given key. This can also be used with unordered containers while lb and ub cannot.
4. what are the uses of emplace? for the associative containers, for instance, why shouldn't I simply insert a pair?
emplace is the equivalent of insert for rvalue references. I'm honestly not quite sure why they didn't just overload insert (and push_back rather than adding emplace_back), but that's what they did. The basic idea is you use them if you want to move an element into a container rather than copying it in. (The object must support move semantics for there to be any difference.)
5.what function should I use to find an element of an associative container? at, find, lower_bound, or what?? is there some difference in speed among them?
If you're looking for an exact match, use find(). If you're just trying to get as close as you can, lower_bound may be more appropriate.
sorry, I don't understand the "worst-case times". what is exactly slow about the hash versions?
"Worst-case time" means: "In the worst possible case, the amount of time it takes".
There are methods, structures out there that usually perform fast on most data, but if you give it *just* the right data, you can make it slow down to a crawl. Simplistic binary trees, for example, usually suffer from this.
A (very famous) example is quick_sort. Quick sort is, well, quick, but if you give it just the right data, you can slow it down to a crawl, making a sort that would otherwise take seconds, take centuries. This has been exploited to launch denial of service attacks on various organizations.
Bottom line:
Any operation can take a certain average amount of time, which is somewhere between the best-case time, and the worst-case time.
Is your question related to IO?
Read this C++ FAQ LITE article at parashift by Marshall Cline. In particular points 1-6.
It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.
Bookmarks