The STL containers aren't constrained into doing the default memory management. I don't see why a custom allocator couldn't be written for std::list if "reserved memory" were an issue.
Regards,
Paul McKenzie
Printable View
The STL containers aren't constrained into doing the default memory management. I don't see why a custom allocator couldn't be written for std::list if "reserved memory" were an issue.
Regards,
Paul McKenzie
Refering to autoexp.dat is good but better would be if there was some generic list of things that can be added to it to make STL debugging more easier. If some one can post the entries they made to autoexp.dat that would be much appriciated.
Visual Studio .NET 2003 added the following:
std::map<*>=size=<_Mysize>
std::vector<*>= first=<_First> last=<_Last>
std::list<*> =size=<_Mysize>
std::multimap<*>=size=<_Mysize>
std::set<*> =size=<_Mysize>
std::deque<*> =size=<_Mysize>
std::queue<*> =size=<c._Mysize>
std::priority_queue<*>= first=<c._First> last=<c._Last>
std::stack<*> =size=<c._Mysize>
std::binder1st<*>= op=<op> value=<value>
std::binder2nd<*>= op=<op> value=<value>
std::pair<*> =first=<first> second=<second>
std::less<*>=lessthan
std::greater<*>=greaterthan
I believe most of these will work for VS 6.0, but I have not checked them all.
oktronic,
although this discussion is somewhat running on the spot, I'll try to answer once more. The problem I (and I think others as well) have with your kind of argumentation that you hardly ever get to the point which is actually being discussed. You keep saying that others are wrong, whithout further explaining why.
What about the different levels of abstraction you are deliberately intermixing? Any comment on that? What about the fact that the links you posted are about low level memory management issues at the hardware / OS level, but have nothing to do with algorithms and their complexity?
Until now, you still didn't make your point very clear. You started out by claiming to have a self-implemented vector template class which can be used as a drop-in compatible replacement for std::vector, yet being more readable in code, and performing better than std::vector and CArray. However, what you showed us was the implementation of a linked list (rather to be compared to std::list than std::vector), so your benchmark, which used tail insertion, was completely meaningless. Furthermore, it has been pointed out that your class is not compatible with std::vector for several reasons. A more meaningful benchmark test conducted by Kevin showed that your class performed somewhat near to std::list (which was to be expected), and was about 10 times slower than std::vector with a pre-allocated size (which is also within the normal range of expectation).
However, you completely ignored those facts, only stating that std::vector is also a linked list implementation (which has been proven wrong), and that your class would perform the same as std::vector if only pre-allocated memory was used. Now, pre-allocating memory makes sense for an array, but not for a linked list, and it remained completely unclear how memory should be "pre-allocated" for the class you posted so far.
Instead, you diverted the discussion to how virtual memory is handled at the hardware / OS level, something completely unrelated to the original subject. You posted two links to what you called "documentation", but there were only slides meant to accompany a basic class on OS design. You still didn't comment on that, leaving everyone with the impression that you fail to perceive the fundamental difference between virtual memory management at the hardware level and data structures at the software level.
I apologize if I sound a little harsh now ;), but most of the statements you made so far (and also some technical terms used in a slightly out-of-place manner) have the smell of hot air, and leave the impression that you have not the slightest of what you are talking about.
So your posts to this thread so far leave everyone puzzled: Did you just want to add some noise, or do you really have valid arguments to contribute? If so, you should rather lay them out clearly, instead of just making doubtful claims.
Oh, and yes:
Yes, you were, as your original statement leading to this quote was "can you explain to me why using my own list is more inefficient then iterating through system memory list to the 50th element?". Here's why: "my own list" in this context means your linked list implementation, and "iterating through system memory list" was the way you claimed to understand how array elements were accessed.Quote:
Originally posted by oktronic
I wasn't comparing it to a linked list.
What do you mean by "set that up for him"? You mean you intentionally made a wrong statement as sort of a bait, to have him pick it up? My impression is that Kevin understood very well what he's talking about, while your preceding statement about adding a value to a pointer and dereferencing it showed, once again, a severe misconception about the entire subject.Quote:
Originally posted by oktronic
Exactly, I was hoping Kevin picked that up, that's why I set that up for him since he didn't seem to understand.
No, no, again. The point about arrays is that you can randomly access any element in constant time, not just iterate through it. The cost of iterating an array vs. iterating a list is always linear, and only differs in details. It might be a little faster or a little slower, depending on the actual implementation. Complexity comes in only when you have to access an arbitrary element.Quote:
Originally posted by oktronic
That's one of the differences between array iteration and ptr. Its 1 operation to get array address, one addition to new address, 1 operation to get value from address.
However for a simple de-reference, its 1 to get address, 1 to get value. Which is one less operation, so its faster. Which is the point for pre-allocated memory.
As already said above: pre-allocating memory makes sense for an array, but not for a linked list. Kevin already had to make guesses and anticipate your linked-list code in order to run his benchmark. How would you want him to implement something as a pre-allocation scheme (which is meaningless for a linked list) for your class?Quote:
Originally posted by oktronic
Twice now I have asked you to simply modify the code Kevin provided and pre-allocate memory for it. So please do so and you will see the results for yourself making this argument moot.
Thank you. The second "document" you mention is a slide show which reminds me of the material presented at my undergrad OS design class 15 years ago. Nice slides, but, as said above, completely unrelated to the discussion in this thread. The thread's original subject was somewhat like "I hate STL, I prefer MFC containers", and, as expected, led to heavy debates on that topic. Your contribution first sounded like "I roll my own STL replacement, which is much better", but then you failed to show how and why, and the discussion diverted into complexity considerations for arrays vs. linked lists, as you seemed (and still seem) to completely ignore that important point. Instead, you brought up the links to the OS design slides, which, I say it again, have no relevance to the original topic. That maneuver just showed your concepts about memory, addresses, pointers etc. to be rather diffuse.Quote:
Originally posted by oktronic Actually, Since you seem to have a interesting understanding of the concepts, you should read the second document provided. Its more advanced then the first, but I think you should be able to understand how a page system works. You made several incorrect points about the system, but I think you're able to understand them so I would suggest picking up an advanced operating design book next time you're out.
Wow, impressing. Look, I have been living in Verona and heard Placido Domingo at the Arena, but that doesn't make me a tenor. I've met Marvin Minsky and Joseph Weizenbaum, but that doesn't make me an AI philosopher. Reading their works, however, can give you a slight idea. So if you've really read Knuth, are you sure you also understood what you read?Quote:
Originally posted by oktronic
Read them all, got them in my library. Meet Knuth here at Stanford. I should mention I live in the Bay Area and have done some work there.
So if you've got them in your library, just pick up Vol. 1, "Fundamental Algorithms" and turn to page 244. Read chapter 2.2.2, "Sequential Allocation" and the following chapter 2.2.3, "Linked Allocation". After that, come back and tell us if you still support the statemets you made above about arrays and linked lists.Quote:
Originally posted by oktronic
What kind of bothers me a little, is you quote his book and yet I don't think you know what's in them. So I will suggest the same thing to you, read them and then quote them. But please don't just quote any random books to try to make a point. If you have specific points you feel these books illustrate, then quote the points at least then I might get to see that you understand the point you are making.
Yes, OS design is fun, but again, as said above, is not related to the subject here. As to the pre-allocated memory, see above.Quote:
Originally posted by oktronic
While you are about 60% right in your post, there's a lot of holes and I could go on about this for months, operating system design can be a lot of fun and a lot of irritation, but in the interest of this thread, please just add pre-allocated memory to Kevin's class and run your benchmark. If you have any problems with that let me know and I'll explain.
Note that I'm not "telling everybody that they are wrong", but that several posters here are having a hard time trying to convince you that you're actually missing the point. I think I proved my points more than enough. Just read what I wrote and go into it instead of just ignoring the key arguments.Quote:
Originally posted by oktronic
But please, less telling everybody that they are wrong and more time proving your points.
I think I sufficiently did so... ;)Quote:
Originally posted by oktronic
I have handed you the ability to prove me wrong, use it.
OK, I just noticed that my last post floats more or less on the flame war level, and contributes little to the original STL discussion (which has been carried on in the meantime). My apologies, but I didn't want to leave that branch of the discussion completely uncommented.
gstercken
I'll ask you to please keep your comments on subject. You will find a character debate will not lead you anywhere you want to go.
If you don't understand any of the material, then please ask.
I actually had a long drawn out post to point out every post that you seem to have missed to show your post to be silly and unfounded, but I'm not going to play that game twice.
You feel your right. Good for you.
But is your position so weak that your only recourse to try to make yourself look big, by talking big? Maybe you feel if you confuse me, and the issue, you might actually make yourself right? hmmm...
The current issue was whether or not STL is more efficient then anything else, I believe this is where we were.
Its already been proven that its badly designed, badly written, badly documented, hard to debug (except for modifying the debug files as proposed, which does help) and even for compact code, it could actually be more compact.
So I'm still waiting for you to benchmark a linked-list with reserved memory to see if vector using "contiguous" memory is so much faster, or at least see a argument for this,
I believe that's where we were.
There is no point to starting this debate all over with the debugging issues again.
oktronic, I thought you were going to provide us with the benchmark tests!Quote:
Originally posted by oktronic
So I'm still waiting for you to benchmark a linked-list with reserved memory to see if vector using "contiguous" memory is so much faster, or at least see a argument for this,
I believe that's where we were.
Have you rescinded your offer?Quote:
Originally posted by oktronic
If you want, and trust me to do so fairly, you can give me any tests you would like and I'll give you the results.
I believe most modern versions of the STL are well designed and well written.Quote:
Originally posted by oktronic
Its already been proven that its badly designed, badly written, badly documented, hard to debug (except for modifying the debug files as proposed, which does help) and even for compact code, it could actually be more compact.
Documentation and ease of debugging really depend on the who wrote the documentation and who wrote the debugger.
As far as being "compact" -- well, that's a trade off for any application -- do you optimize for speed or optimize of smaller (more compact) code size?
To say the true, I don't understand the points that it is bad documented and hard to debug. All you need to know about STL is how to use containers, algorithms, functors etc, these are documented too much. Why do you need to deep into STL code with debugger? To find a bug in STL?
Any example where my comments were not on subject?Quote:
Originally posted by oktronic
I'll ask you to please keep your comments on subject.
What material?Quote:
Originally posted by oktronic
If you don't understand any of the material, then please ask.
Confusion... Does this mean you finally understand that you are completely confused about the subject?Quote:
Originally posted by oktronic
But is your position so weak that your only recourse to try to make yourself look big, by talking big? Maybe you feel if you confuse me, and the issue, you might actually make yourself right? hmmm...
Has this really been proven? Where? And by whom? I don't see any evidence of that. What are "debug files"?Quote:
Originally posted by oktronic
The current issue was whether or not STL is more efficient then anything else, I believe this is where we were.
Its already been proven that its badly designed, badly written, badly documented, hard to debug (except for modifying the debug files as proposed, which does help) and even for compact code, it could actually be more compact.
What the heck is "a linked-list with reserved memory"? I think I pointed out more than once than this is completely pointless. You still failed to come up with an explanation about how you would reserve memory for a linked list, like you failed to comment on any other of the key arguments.Quote:
Originally posted by oktronic
So I'm still waiting for you to benchmark a linked-list with reserved memory to see if vector using "contiguous" memory is so much faster, or at least see a argument for this,
I believe that's where we were.
Debugging issues?Quote:
Originally posted by oktronic
There is no point to starting this debate all over with the debugging issues again.
I'm hoping you will do the benchmark yourself so you can see where optimization will come into play, but the stats are, using the same test except that I kicked up the count to 5 Million in the loop to increase resolution, at 1 mil they were both at less then 20ms.
vector = 120ms av.
linked-list = 160ms av.
It was expected that the list would be alittle slower since the
it does assign the pointers, and that's going to take a little time.
But I think it demonstrates how the system optimizations are a negative for the vector, since 40 ms spread isn't that much of a bonus for 5 mil inserts.
This was using my own, but I'll write up another to post that does this and when you show me yours, I'll show you mine. I'm interested in your solution.
:D :D :DQuote:
Originally posted by oktronic
and when you show me yours, I'll show you mine.
gstercken,
I'm not going to get into another flame war, please act like an adult or move on, if you really require a detailed response to your flame, then let me know where you want it, but keep this thread on topic.
Your last post has shown you to be nothing more then clueless. There are 200 odd posts in this thread all about the pros and cons of STL. If you have no idea what I'm talking about or anybody else, then you haven't been reading them, in which case, anything you say is uninformed and a waste of everybody's time.
But since you had so many questions, I'll answer them in brief.
[QUOTE]Any example where my comments were not on subject?[\QUOTE]
read your post again, find the subject.
[QUOTE]What material??[\QUOTE]
Besides what I posted, how about the other 200 odd posts.
[QUOTE]Confusion... Does this mean you finally understand that you are completely confused about the subject??[\QUOTE]
Didn't say that, perhaps you're confused? Sounds it.
[QUOTE]Has this really been proven? Where? And by whom? I don't see any evidence of that. What are "debug files"?
[\QUOTE]
All 4 answers you seek are in all these nice posts that people spent time writing for your pleasure.
[QUOTE]What the heck is "a linked-list with reserved memory"?[\QUOTE]
Just a guess, but I would say that its a linked list that has reserved memory....
[QUOTE]I think I pointed out more than once than this is completely pointless.[\QUOTE]
You saying this makes it true how exactly? And how could you point that out when you just asked what it was?
[QUOTE]Debugging issues?[\QUOTE]
Yes, as in Debugging issues.
Hope that helps you.
I see... Your point of view gets clearer and clearer with every post... And dadaism is often used as a last attempt to get out of an uncomfortable discussion... ;)
However, I hoped all the time that you would finally come up with a serious and valid argument to this discussion. Your last post is the definite proof that you painted yourself into a corner, and that you are unwilling and/or unable to seriously defend your original claim, which crumbled into dust by now.
The only thing you do by continuing to flame and by not reading the posts, or adding anything constructive or accurate to this thread, is undermine your own credibility.
I'll ask you again to either act like an adult or move on.
The next phase to this thread is for actual performance tests. I submitted mine for the purpose of moving this thread forward to something more constuctive then, "I'm right, your wrong, so there". If this is to complex for you, then I'm sure there are other threads you can post in.
:eek: So your solution is actually slower than std::vector? :eek:Quote:
Originally posted by oktronic
I'm hoping you will do the benchmark yourself so you can see where optimization will come into play, but the stats are, using the same test except that I kicked up the count to 5 Million in the loop to increase resolution, at 1 mil they were both at less then 20ms.
vector = 120ms av.
linked-list = 160ms av.
It was expected that the list would be alittle slower since the
it does assign the pointers, and that's going to take a little time.
But I think it demonstrates how the system optimizations are a negative for the vector, since 40 ms spread isn't that much of a bonus for 5 mil inserts.
I already did this many posts ago. I posted the code, the compiler, the fact I used the release configuration, and the results.Quote:
Originally posted by oktronic
This was using my own, but I'll write up another to post that does this and when you show me yours, I'll show you mine. I'm interested in your solution.
Hi Kevin,
Yes, I expected it to be slower since it does have more operations then vector does when both are pre-allocated.
I'm interested though, in you "pre-allocating" memory for your previous linked-list that you posted when you get a chance, so we can compare the benchmarks against optimization techniques.
If you did already posted that, if you could point me in the direction, been sifting though lots of posts as of late.
Ok, sarcasm aside. I believe what oktronic is suggesting is that his class can pre-allocate memory in much the same way std::vector can. In this sense, he will be avoiding the penalties of continually calling new. Fine. This may be a good design in some cases. I have not had the need to override the standard allocator yet, but I believe that a similar approach can be used with std::list by overriding the standard allocator (someone correct me if I am wrong). Then I would expect similar performance from both std::list and oktronic's code. Like oktronic's tests showed and he himself noted, std::vector is still faster if pre-allocated memory is used. This is due (in part at least) to the fact that std::vector does not have to assign pointers used for a linked list.
All that being said, there are two main differences (perhaps more -- can't verify since oktronic cannot release his code)between oktronic's class and std::vector:
(1) oktronic's class uses different error handling mechanisms
(2) oktronic's class is not a true "drop-in" for std::vector, because of the order (constant-time, linear, O(log(N)), ...) of certain operations. It is really more akin to std::list (which using a non-standard allocator can likely have performance on par with oktronic's class).
A few notes about the tests we have performed:
- my tests were performed without the real code behind oktronic's class
- oktronic's tests may (maybe not) be skewed to emphasize the strengths of his class
- As Paul or OReubens pointed out earlier, these tests are likely not real-world tests anyway.
All guesses as to what oktronic's class exactly does, or how it performs is unknown and further discussion will not be productive. We know that:
* oktronic's class is a linked list. And linked lists have some inherent traits (O(N) time for finding the nth element for example)
* oktronic's class is slower than std::vector and likely faster than std::list (with a standard allocator) for the over-simplified tests given. Most people might liken this to "comparing apples and oranges" though.
I believe we have be quite hard on oktronic (some of that probably warranted, some probably not). I do believe that his claims have been overly bold, that he has done everything to hold his position on his prize piece of code. Our poking and prodding has only served to build his defensiveness. oktronic, in the future chill-out a little bit. I can vouch that most if not all of the people who confronted you really want to get to the truth, are interested in helping others, and interested in learning more -- just take a look at their posts in the past. In the past, I have come up with some insight and solutions people have found very valuable. I have also been corrected many times. So be it! That's part of me growing and learning as a programmer.
I think that's enough on this topic.
No, I didn't do that test, b/c until now I don't think I understood what you were trying to do. And I don't feel like spending anymore time on this topic. Sorry! :( After thinking about what you claim you are doing, I don't question your results. I still don't think that your class and std::vector are fair comparisons, because all other operations will vary greatly in their performance. std::vector will be much slower trying to insert elements in the middle of the data; your class (and any linked list for that matter) will be slower trying to randomly access elements (not iterating through elements). This is because of the very nature of linked lists and the very nature of arrays. My only question would be how well std::list would perform using a non-standard allocator. Even then, it would not be fair to compare against your class because of the different error handling -- which is absolutely fine -- it just isn't the way many programmers want to handle errors.Quote:
Originally posted by oktronic
I'm interested though, in you "pre-allocating" memory for your previous linked-list that you posted when you get a chance, so we can compare the benchmarks against optimization techniques.
If you did already posted that, if you could point me in the direction, been sifting though lots of posts as of late.
- Kevin
Ok, Kevin,
This bothers me.
I don't recall making any bold statements. I told you that my class can perform as fast if not faster then vector, but I asked you to create your own and compare. I could have used list, but I did that on purpose. So my class isn't in question here. It goes far beyond STL. As for "drop-in", this is a terminology issue. "drop-in" means will compile in place of. "in-place" code must adhere to the code its replacing. I believe I stated much earlier that my class is different.
Besides all that though, I wasn't asking you to compare your code to my class.
What I made you do was write the class. A little bold, but not in statement.
It was said C++ contiguous memory is faster then dynamic. I told you that the system hit is minimal and explained with example. Nothing bold yet.
You don't believe me, so I challenged you to simply modify your linked-list class and compare.
This is what I don't get. All you have to do is add 2 functions for allocation calls and 2 pointers and 2 counts. took me 5 minutes to modify your code and run the tests, I did the pre-alloc in the constructor since all I want to test in this case is the insert. I actually had the time to add the checks in so the class could grow as a linked list when its limit was full. Nothing complex here.
I wanted you to write this so you can see, for yourself, how your class and the system interact. This way there is no argument. Simple fact. You can hardly argue with your own eyes.
I haven't made any claims that my class is some form of magical class that will solve the world's problems. I specifically encouraged you to do the work yourself so you would learn.
Yet, for some reason since the beginning of time, people who like to argue, rarely like to learn and even in the face of their own findings will refuse to accept theory as fact.
All you would have to do is try the very assumption you are proposing and see the results for yourself, no bold statements, no magic pills, just old fashioned trial and error programming.
So this does bother me. Lots of people quoting, lots of flames, lots of accusations, yet nobody seems to want to actually try the very thing they are talking about. I've been asking for 2 days now, so imagine my disappointment.
You own a computer, you have a compiler, you assumingly know how to program, so I say do the work and learn. If this is a bold statement, then color me bold.
Wow! That was the most concilliatory message you have received in this forum. It was not a flame-war message by any means.Quote:
Originally posted by oktronic
Ok, Kevin,
This bothers me.
Didn't I say that I could understand how you got the results you did? And all I said was I didn't want to be on this topic anymore. Nothing about what you said being complex.Quote:
Originally posted by oktronic
This is what I don't get. All you have to do is add 2 functions for allocation calls and 2 pointers and 2 counts. took me 5 minutes to modify your code and run the tests, I did the pre-alloc in the constructor since all I want to test in this case is the insert. I actually had the time to add the checks in so the class could grow as a linked list when its limit was full. Nothing complex here.
I wanted you to write this so you can see, for yourself, how your class and the system interact.
I didn't understand what you were trying to do until tonight. You refused to give me code when I asked (ok, maybe it considered IP for the company you work for). You didn't even post pseudo-code though. People cannot try an idea until they understand. And, I'm talking with respect to the idea of pre-allocation here, not the linked-list snippet.Quote:
Originally posted by oktronic
yet nobody seems to want to actually try the very thing they are talking about. I've been asking for 2 days now, so imagine my disappointment.
Now this is demeaning!!! :mad: I've seen many people say you are mistaken, but I don't recall (and maybe I am wrong) anyone saying that there was doubt you could program. People stated doubt about your understanding of the topic in question, but not about ability to program. Anyway, a quick review of the messages someone has posted can easily reveal whether or not someone has the ability to program.Quote:
Originally posted by oktronic (emphasis added)
you assumingly know how to program
Ok, I just wanted to check to make sure I have not misspoken. Let's see what has happened:
http://www.codeguru.com/forum/showth...126#post851126
This is a message where you violently attack Paul. It could be argued that Paul drew first blood though (at least in this thread-- I do not know if there was other history here). If I had to make an analogy here: Paul may have cut off your finger, but you cut off his head!
http://www.codeguru.com/forum/showth...957#post852957
Yep, I still believe this is a bold statement -- not that someone can write their own code optimized for a particular situation. I believe the two bold claims were (a) that your class was a drop-in replacement for std::vector and (b) your timing results given that the two classes are fundamentally different (you didn't write an array or vector replacement, you wrote a linked list replacement) and the fact that for the given situation, there was a simple, very simple way to optimize std::vector's behavior (you'll find many posts in CodeGuru where someone is told "use reserve()").Quote:
Originally posted by oktronic
Here is my own 'vector' style class. Has all the fat cut out and is drop in compatible with vector.
vector.push_back(i) = 1.2 seconds.
MyTemplateList.push_back(i) = 0.6 seconds.
CArray.Add(i) = 12 seconds.
So as you can see, and feel free to try, user written is fastest (providing done right), STL is middle and CArray is so poor that last is too good for it....
But to give CArray some credit, its far more defensible and easier to debug then STL.
So you decide which you feel is "better"....Again I vote user written because you control the performance and you know you can be at least as fast as the library, have more functionality and be better designed, which is everything you could want in a program or library.
I guess nobody asked for it, but here is some code that shows timing differences between std::list with and without my custom allocator and std::vector. The custom allocator essentially reserves all memory up-front.
Results in seconds:
Here is the code:Code:std::vector with reserve : 0.0115378
std::vector without reserve : 0.0775238
Custom allocator with std::list : 0.0763225
Standard allocator with std::list: 9.09361
C-style array : 0.0121524
[edit: corrected two problems: delete [] in MemoryPool's destructor and fun_std_vector should only reserve TEST_SIZE memory]Code:#include <limits>
#include <iostream>
#include <list>
#include <conio.h>
#include <vector>
#include "../../../utils/utils.h"
const size_t TEST_SIZE = 10000;
const double TIME_MULTIPLIER = 100.0;
const size_t MAX_SIZE = (TEST_SIZE + 1) * 12;
class MemoryPool
{
unsigned char *m_data;
size_t m_top;
int m_refcount;
public:
MemoryPool()
{
m_data = new unsigned char[MAX_SIZE];
m_top = 0;
m_refcount = 0;
}
~MemoryPool()
{
delete [] m_data;
}
void AddRef()
{
++m_refcount;
}
void *alloc_mem(size_t n)
{
if ((MAX_SIZE - m_top) >= n) {
size_t old_top = m_top;
m_top += n;
return m_data + old_top;
} else {
return 0;
}
}
void Release()
{
if (--m_refcount == 0) {
delete this;
}
}
};
template <class T>
class MyAlloc {
public:
// type definitions
typedef T value_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
// rebind allocator to type U
template <class U>
struct rebind {
typedef MyAlloc<U> other;
};
// return address of values
pointer address (reference value) const {
return &value;
}
const_pointer address (const_reference value) const {
return &value;
}
/* constructors and destructor
* - nothing to do because the allocator has no state
*/
MyAlloc() throw() {
m_pPool = new MemoryPool;
m_pPool->AddRef();
}
MyAlloc(const MyAlloc& src) throw() {
m_pPool = src.m_pPool;
m_pPool->AddRef();
}
template <class U>
MyAlloc (const MyAlloc<U> &src) throw() {
m_pPool = src.m_pPool;
m_pPool->AddRef();
}
~MyAlloc() throw() {
m_pPool->Release();
}
// return maximum number of elements that can be allocated
size_type max_size () const throw() {
return MAX_SIZE / sizeof(T);
}
// allocate but don't initialize num elements of type T
pointer allocate (size_type num, const void* = 0) {
// print message and allocate memory with global new
pointer ret = (pointer) m_pPool->alloc_mem(num * sizeof(T));
return ret;
}
// initialize elements of allocated storage p with value value
void construct (pointer p, const T& value) {
// initialize memory with placement new
new((void*)p)T(value);
}
// destroy elements of initialized storage p
void destroy (pointer p) {
// destroy objects by calling their destructor
p->~T();
}
// deallocate storage p of deleted elements
void deallocate (pointer p, size_type num) {
// do not deallocate memory
}
MemoryPool *m_pPool;
};
// return that all specializations of this allocator are interchangeable
template <class T1, class T2>
bool operator== (const MyAlloc<T1>&, const MyAlloc<T2>&) throw() {
return true;
}
template <class T1, class T2>
bool operator!= (const MyAlloc<T1>&, const MyAlloc<T2>&) throw() {
return false;
}
void fun_my_alloc()
{
std::list<int, MyAlloc<int> > l;
for (int i = 0; i < TEST_SIZE; ++i) {
l.push_back(10);
}
}
void fun_std_list()
{
std::list<int> l;
for (int i = 0; i < TEST_SIZE; ++i) {
l.push_back(10);
}
}
void fun_std_vector()
{
std::vector<int> l;
l.reserve(TEST_SIZE);
for (int i = 0; i < TEST_SIZE; ++i) {
l.push_back(10);
}
}
void fun_std_vector_no_reserve()
{
std::vector<int> l;
for (int i = 0; i < TEST_SIZE; ++i) {
l.push_back(10);
}
}
void fun_c_array()
{
int *pl = new int[TEST_SIZE];
for (int i = 0; i < TEST_SIZE; ++i) {
pl[i] = 10;
}
delete [] pl;
}
using namespace std;
int main()
{
CPrecisionTimer ct;
double tim_my_alloc, tim_std_list, tim_std_vector, tim_std_vector_no_reserve, tim_c_array;
cout << "Timing c-style array" << endl;
ct.Start();
fun_c_array();
tim_c_array = ct.Stop();
cout << "Timing std::vector with reserve" << endl;
ct.Start();
fun_std_vector();
tim_std_vector = ct.Stop();
cout << "Timing std::vector without using reserve" << endl;
ct.Start();
fun_std_vector_no_reserve();
tim_std_vector_no_reserve = ct.Stop();
cout << "Timing Custom allocator with std::list" << endl;
ct.Start();
fun_my_alloc();
tim_my_alloc = ct.Stop();
cout << "Timing Standard allocator with std::list" << endl;
ct.Start();
fun_std_list();
tim_std_list = ct.Stop();
cout << "std::vector with reserve : " << tim_std_vector * TIME_MULTIPLIER << endl;
cout << "std::vector without reserve : " << tim_std_vector_no_reserve * TIME_MULTIPLIER << endl;
cout << "Custom allocator with std::list : " << tim_my_alloc * TIME_MULTIPLIER << endl;
cout << "Standard allocator with std::list: " << tim_std_list * TIME_MULTIPLIER << endl;
cout << "C-style array : " << tim_c_array * TIME_MULTIPLIER << endl;
_getch();
return 0;
}
Here is the code for utils.h, which contains just a Windows-specifc high-resolution timer:
Code:#include <windows.h>
class CPrecisionTimer
{
public:
double Stop();
long double ToDouble(LARGE_INTEGER& val);
void Start();
CPrecisionTimer();
virtual ~CPrecisionTimer();
private:
LARGE_INTEGER m_start;
LARGE_INTEGER m_freq;
};
CPrecisionTimer::CPrecisionTimer()
{
QueryPerformanceFrequency(&m_freq);
}
CPrecisionTimer::~CPrecisionTimer()
{
}
void CPrecisionTimer::Start()
{
QueryPerformanceCounter(&m_start);
}
long double CPrecisionTimer::ToDouble(LARGE_INTEGER &val)
{
long double f = val.u.HighPart;
f = f * (DWORD)-1;//4294967296;
f += val.u.LowPart;
return f;
}
double CPrecisionTimer::Stop()
{
LARGE_INTEGER curval;
QueryPerformanceCounter(&curval);
long double f = ToDouble(m_start);
long double f1 = ToDouble(curval);
long double f3 = ToDouble(m_freq);
return (f1 - f)/f3;
}
First a word of warning about the code above. You will need to have a compiler which understands templates better than VC6. I compiled this on VC.NET 2003 in Release mode to get the results.
Now, for the interpretation of the results:
Code:std::vector with reserve : 0.0115378
std::vector without reserve : 0.0775238
Custom allocator with std::list : 0.0763225
Standard allocator with std::list: 9.09361
C-style array : 0.0121524
- The difference between std::vector and the c-style array is minimal and probably due to random system influences. Moving the timing order around will in fact result in these two swapping the top spot. This means that in practice, push_back and array assignment are pretty much the same speed, if the vector has enough memory.
- std::vector without reserve is about seven times slower. Of course, this all goes into reallocation and copying of elements. The one conclusion is obvious, namely that if you can, use reserve.
- std::list with the default allocator is abysmally slow. This time is *mostly* spent in new and delete, as the version with the custom allocator is much faster.
- std::list, whichever allocator you use, uses 3 times as much memory (at least) as a vector or a c-style array. This is because each node has a forward and a backward pointer as well (and my machine is 32 bits). This is inherent in linked lists, so please don't say it's STL's fault
Anyways, what I'm trying to say is that for many problems you have to fiddle around until you find a good solution. STL's containers are a good solution for many cases and if you do find performance problems, it is very likely that you'll be able to sort these out by either switching to a different container or by tweaking the allocator.
I for one think that it is quite impressive what speed gains you can achieve by using a different allocator (here it's 120 times faster). This is one of the nice things about STL that MFC doesn't have. How would you provide a custom allocator for a CArray ? For a CList or a CMap ? The STL is extended much more easily than MFC, if it's possible at all with MFC.
std::map<*>=size=<_Mysize>
std::vector<*>= first=<_First> last=<_Last>
std::list<*> =size=<_Mysize>
std::multimap<*>=size=<_Mysize>
std::set<*> =size=<_Mysize>
std::deque<*> =size=<_Mysize>
std::queue<*> =size=<c._Mysize>
std::priority_queue<*>= first=<c._First> last=<c._Last>
std::stack<*> =size=<c._Mysize>
std::binder1st<*>= op=<op> value=<value>
std::binder2nd<*>= op=<op> value=<value>
std::pair<*> =first=<first> second=<second>
std::less<*>=lessthan
std::greater<*>=greaterthan
adding this to autoexp.dat does not expand it values ..
BULLETMAP::iterator it = m_Map.find(Key(iRowl, iCol));
ASSERT(it != m_Map.end()); // the entry should exist
if(it != m_Map.end())
{
int nRow = it->nRow; // codepoint A
......
......
}
I can view the value of nRow
but I cannot directly view the values of it->nRow or it->nCol or any of its members at codepoint A. This is using VS.Net 2003. It is kinda important for me to know this as there are thousands of lines of STL code which other programmers had written and programmers are not perfect. They can leave bugs here and there and to be able to debug into STL data is a requirement from my prespective. With superfast machines (hardware wise) it is important to be able to debug and track or integrate new features into existing architecture quickly and easily rather than speed.
The time has come where a programmer time is more critical than machine computing time.
I know, but it's better than what VS 6.0 did before.Quote:
Originally posted by voidspace
adding this to autoexp.dat does not expand it values ..
In VS .NET 2003, there is a hook to build your own display string for a given type. Granted this is still not perfect as there is no way to do custom expansions, but it's still better than VS 6.0.
It seems MS is making progress, and I'd still like to see them make things even easier. It's not impossible to find things using the debugger, it's just a mess a right now (finding a specific value in an std::map can require following many, many pointers). But again this is not a design problem of the STL, it is something that the maker of the debugger (MS is this case) could fix if they put enough effort into it.
Actually, this is not true. You could create an add-in dll and tell VS to use it in the autoexp.dat file. You can then write the code in the dll to do a custom expansion.Quote:
Originally posted by KevinHall
In VS .NET 2003, there is a hook to build your own display string for a given type. Granted this is still not perfect as there is no way to do custom expansions, but it's still better than VS 6.0.
The only documentation I can find in MSDN concerning using the add-in dlls is a sample DLL project which formats times. I did not see any way to do expansions using that API. If there is a way, great! I would love to see someone (possibly even me) make a DLL that would perform some more meaningful expansions for the STL. Can you point me somewhere where there is more meaningful documentation on the feature? (Really, I did search MSDN and the web. I just could not find anything beside that one sample dll).
Well, maybe I didn't exactly ask...Quote:
Originally posted by Yves M
I guess nobody asked for it, but here is some code that shows timing differences between std::list with and without my custom allocator and std::vector. The custom allocator essentially reserves all memory up-front.
;)Quote:
Originally posted by Paul McKenzie
The STL containers aren't constrained into doing the default memory management. I don't see why a custom allocator couldn't be written for std::list if "reserved memory" were an issue.
Yves, thanks for the informative post. You should post it in the C++ FAQ.
Regards,
Paul McKenzie
I agree with Paul -- Yves, that was a great post. How to use non-standard allocators would be a good FAQ too.
- Kevin
Try putting this in the watch window:Quote:
Originally posted by voidspace
I can view the value of nRow
but I cannot directly view the values of it->nRow or it->nCol or any of its members at codepoint A. This is using VS.Net 2003. It is kinda important for me to know this as there are thousands of lines of STL code which other programmers had written and programmers are not perfect. They can leave bugs here and there and to be able to debug into STL data is a requirement from my prespective. With superfast machines (hardware wise) it is important to be able to debug and track or integrate new features into existing architecture quickly and easily rather than speed.
iter->second
or if you were using VS6 you would put:
iter._Ptr->_Value
or
iter._Ptr->_Value.second to see the actual value.
Yves,
I don't see the
in the MemoryPool class. Let me know if I'm wrong on something.Code:delete [] m_data;
Regards,
Paul McKenzie
Yves,
I did specifically ask for it and thank you!
There are a few things I noticed that will make this a problem in VC6 so, if its ok with you, I'll modify the code a little to make it compatible. Then slap Kevin's class in here to for the "user-defined" class, and then offer a few optimizations, if that’s ok.
Needs a few things, like _Charalloc, from my quick perusal.
And as Paul noticed, its missing the destructor for deleting m_data.
Nothing serious though.
A top rate response! I'll get back to you hopefully this afternoon.
Thanks again, much appreciated.
Sorry, you are of course right, the destructor of MemoryPool should free the memory. First I was using a std::vector, that's why I didn't think about it :/Quote:
Originally posted by Paul McKenzie
Yves,
I don't see the
in the MemoryPool class. Let me know if I'm wrong on something.Code:delete [] m_data;
Regards,
Paul McKenzie
I have corrected the original code.
Of course, VC6 doesn't handle templates correctly and hence the whole rebind mechanism cannot work in the standard way. If you want to implement a rebind for VC6, you will have to write code that is implementation-specific for the STL you are using. If you use STLPort, then the way to do it is to inject your specialisations into the _STLP namespace. There is a thread somewhere on the Non-VC forum where that is detailed.Quote:
There are a few things I noticed that will make this a problem in VC6 so, if its ok with you, I'll modify the code a little to make it compatible.
Oh, by the way, I didn't give any credits, but as you can probably guess, the MyAlloc allocator is not actually mine (hence the non-sensical comments), but it's taken from Jossutis' book on STL.
What is _Charalloc ?Quote:
... like _Charalloc...
Hi Yves,
I think you might find this interesting, I did a quick drop in and run of the code, as expected, due to VC6 there were a few errors, so I commented out the custom allocated list for the moment and ran the test with the others just for fun and I found this to be interesting:
std::vector with reserve : 0.032881
std::vector without reserve : 0.030954
Standard allocator with std::list: 1.207932
C-style array : 0.014080
What I think is interesting is the vector with and without the reserve. I think this can be chalked up to the VC6 compiler pumping out some inefficient code. I'll run it with .NET later to compare. But also std:list is off from your result to mine, but .NET has exception handing in the STL code now, so that might account for it. I have a release .NET on ext HD that hopefully I'll get around to using to test this, but got a lot to do before Thanksgiving.
Which reminds me, if I don't get around to it today, I hope you all have a great Thanksgiving!
PS: _Charalloc is used by list for allocation. its a member of the allocater class, its in "XMemory":
it gets called in list in _BuyNode, in list:Code:char _FARQ *_Charalloc(size_type _N)
{return (_Allocate((difference_type)_N,
(char _FARQ *)0)); }
Its just one of the VC6 things I believe, nothing that can't be easily fixed.Code:_Nodeptr _Buynode(_Nodeptr _Narg = 0, _Nodeptr _Parg = 0)
{_Nodeptr _S = (_Nodeptr)allocator._Charalloc(
1 * sizeof (_Node));
_Acc::_Next(_S) = _Narg != 0 ? _Narg : _S;
_Acc::_Prev(_S) = _Parg != 0 ? _Parg : _S;
return (_S); }
This can't be right, unless the Dinkumware STL for VC6 is outright bad. Try moving the calls to the different functions around in main(). The problem I have with all these timing examples is that:Quote:
[i]Originally posted by oktronic
std::vector with reserve : 0.032881
std::vector without reserve : 0.030954
Standard allocator with std::list: 1.207932
C-style array : 0.014080
a) application startup time may have an influence (so the first function may have a longer time than expected)
b) memory fragmentation issues. Try putting any other function behind the std::list without custom allocators and you'll see its performance drop.
Oh, and by the way, there was another error in the code. The reserve should only reserve space for TEST_SIZE integers, the 1 million is a leftover from when I had that parameter fixed. So that may account for the discrepancy you get.
Yes that may be a slowdown factor. But then again vector.push_back could also throw an exception, no ?Quote:
But also std:list is off from your result to mine, but .NET has exception handing in the STL code now, so that might account for it.
Ok, that's what I was referring to when I said that custom allocators for VC6 have to be hard-coded with respect to your current STL. Very ugly :/Quote:
PS: _Charalloc is used by list for allocation. its a member of the allocater class, its in "XMemory":
Hi Yves,
Yeah, VC6 was horrible at doing templates, which is one of the reason alot of people avoided it. It has a very bad habit of getting all confused when you are using multiple templates, especially when they are in the same function as they are here. I'm pretty sure that if I move them around that I'll drop performance on one and gain on on another, as you suggested.
I made the change to vector and the BM for it is now 0.023746. bounced around a lot at this res, but it stayed -.01 from its non-reserved buddy. Sorry I didn't notice that, I didn't look too hard at the functions since its was a quick compile and run. I just dropped it into a project I'm working on just added a messagebox at the end to display the results. I'll need to remember to get it all out before my proj gets out the door, lol.
Anyways, thanks again. I'll get back to you with comparisons as soon as I get the chance.
The timing for std::list under both VC++ 6 (from oktronix)
and VC++ NET (from Yves) are very disrtubing. Here are
my results using VC++ version 5 (on a 450MHz laptop) :
Timing c-style array
Timing std::vector with reserve
Timing std::vector without using reserve
Timing std::deque
Timing Standard allocator with std::list
std::vector with reserve : 0.144907
std::vector without reserve : 0.334736
std::deque : 0.205166
Standard allocator with std::list: 1.01896
C-style array : 0.0716573
(I also added std::deque).
I also ran the code using g++ 3.2.2 under RedHat Linux.
I left the results at work, but they were similar to above.
std::list with the custom allocator code being about 3 times
faster than with the standard allocator. Under g++, I don't
have a high performance timer, so I just used clock() and
jumped the number of elements up to 1 million.
I'm baffled... I just tried the code (without the custom allocator) in VC6 and get these results:
Why is std::list so slow in VC.NET ? I have tried disabling exceptions, but that makes no difference at all. Is it implemented in a completely different way or something ?Code:// VC6 with STLPort
std::vector with reserve : 0.00840023
std::vector without reserve : 0.0217835
Standard allocator with std::list: 0.106455
C-style array : 0.00862987
// VC.NET 2003 with Dinkumware
std::vector with reserve : 0.00846281
std::vector without reserve : 0.0431672
Standard allocator with std::list: 21.2144
C-style array : 0.00862149
Hum, maybe a partial answer is the fact that inserting an element in VS.NET's STL updates an initial size count, while STLPort doesn't do that, but that doesn't quite account for a 200-fold increase...
Another answer may be that memory management was changed between the runtime for VC6 and the one for VC7, so that small objects occur a higher penalty. I'll try this and let you know.
Mystère et boule de gomme ;)
[edit:
I officially don't get it. The actual thing that is strange is that std::list is as fast as it is in VC6 in the first place. I tried the following code:
I set TEST_SIZE to 100 thousand and the multiplier to 10, so that the results are somewhat the same as previously. I had to run the tests for std::list and the fun_new_test separately, since either one of them fragments memory much too heavily, so results would be skewed in favour of the first one that gets called.Code:void fun_new_test()
{
int *a[TEST_SIZE];
for (int i = 0; i < TEST_SIZE; ++i) {
a[i] = new int;
}
for (i = 0; i < TEST_SIZE; --i) {
delete a[i];
}
}
Results:
Maybe the solution to this riddle would lie in the fact that STLPort already uses a better allocator than just plain new ? It would certainly explain these results, but I'm not quite sure what the explanation would be for VC6/5+Dinkumware to be faster than VC.NET+Dinkumware...Code:// VC6
Standard allocator with std::list: 0.127351
Standard allocator (new_test) : 13.2146
// VCNET
Standard allocator with std::list: 17.2856
Standard allocator (new_test) : 15.2608
Yves, try VC 7.0 using STLPort.
Regards,
Paul McKenzie
I'd love to, but that'll have to wait until I have a bit more time to set up STLPort under VC.NET. Actually, it's VC7.1 (VC.NET 2003), I don't know if that makes any difference.Quote:
Originally posted by Paul McKenzie
Yves, try VC 7.0 using STLPort.
Regards,
Paul McKenzie
Thank you very much ..Quote:
Originally posted by Adam Gritt
Try putting this in the watch window:
iter->second
or if you were using VS6 you would put:
iter._Ptr->_Value
or
iter._Ptr->_Value.second to see the actual value.
it._Ptr->_Myval.nRow did the trick.
Ok, here are the results for VC.NET 2003 with STLPort 4.6 (the current version)
On the same machine, also with VC.NET 2003, the Dinkumware STL gives me:Code:std::vector with reserve : 0.016376
std::vector without reserve : 0.0394023
Custom allocator with std::list : 0.0794902
Standard allocator with std::list: 0.152983
C-style array : 0.0124926
Code:std::vector with reserve : 0.0172944
std::vector without reserve : 0.0637003
Custom allocator with std::list : 0.0883701
Standard allocator with std::list: 52.548
C-style array : 0.0123338
Wow. Maybe you should drop P.J. Plauger at Dinkumware an e-mail with your findings.Quote:
Originally posted by Yves M
Ok, here are the results for VC.NET 2003 with STLPort 4.6 (the current version)
On the same machine, also with VC.NET 2003, the Dinkumware STL gives me:Code:std::vector with reserve : 0.016376
std::vector without reserve : 0.0394023
Custom allocator with std::list : 0.0794902
Standard allocator with std::list: 0.152983
C-style array : 0.0124926
Code:std::vector with reserve : 0.0172944
std::vector without reserve : 0.0637003
Custom allocator with std::list : 0.0883701
Standard allocator with std::list: 52.548
C-style array : 0.0123338
Regards,
Paul McKenzie
I just did, let's see what he replies. I guess it's a matter of allocators. C.f. the notes on allocators on SGI's pages:
- alloc
The default allocator. It is thread-safe, and usually has the best performance characteristics.- pthread_alloc
A thread-safe allocator that uses a different memory pool for each thread; you can only use pthread_alloc if your operating system provides pthreads. Pthread_alloc is usually faster than alloc, especially on multiprocessor systems. It can, however, cause resource fragmentation: memory deallocated in one thread is not available for use by other threads.- single_client_alloc
A fast but thread-unsafe allocator. In programs that only have one thread, this allocator might be faster than alloc.- malloc_alloc
An allocator that simply uses the standard library function malloc. It is thread-safe but slow; the main reason why you might sometimes want to use it is to get more useful information from bounds-checking or leak-detection tools while you are debugging.
And yes, when I change STLPort's default allocator to use malloc then I notice things slowing down as well:
For this program:Code:Default STLPort alloc malloc_alloc Dinkumware
0.0027 0.1522 0.1410
0.0054 0.1884 0.2193
0.0085 0.5601 0.5145
0.0123 1.1903 1.1231
0.0165 1.9820 1.9543
0.0208 3.0880 3.1050
0.0254 4.4912 4.6766
0.0331 8.4089 8.7176
0.0368 16.9519 16.5752
Code:std::vector<int> g_data;
void set_test(size_t max)
{
std::set<int> s;
for (size_t i = 0; i < max; ++i) {
s.insert(g_data[i]);
}
}
int main()
{
for (int i = 0; i < 100*1000; ++i) {
g_data.push_back(i);
}
srand(171);
random_shuffle(g_data.begin(), g_data.end());
CPrecisionTimer ct;
for (i = 1; i < 10; ++i) {
ct.Start();
for (int o = 0; o < 10; ++o) {
set_test(i * 1000);
}
cout << ct.Stop() << endl;
}
return 0;
}
I think it would be very enlightening for students and professionals-just-starting out and needing to make decisions such as whether or not to use the standard containers in their code to read this thread all the way through. I've done it at points where I've decided to post to ensure I'm keeping up with both the current and past spirit of this thread, and I just did it again and was struck by some observations that I feel the targets I mentioned should also be aware of. Its been pointed out several times by several posters, but...
...particular points of those who advocate not using the standard containers... keep shifting.
There is this shell game being played. Somehow, the topic of particular bugs gets dropped, and yet somehow the debugabillity is still called into question. Design issues are mentioned at several points, yet particulars get clouded over by a fog of "ugly" and "difficult". When particulars do come up, they are responded to quickly, and despite the fact that so far the most of the issues have been ones of understanding how to use the standard containers (not deriving but extending orthogonally), what their guarantees are (algorithmic complexity issues), and the language features they use (templates and namespaces), well...
...there is never an acknowledgement of those issues being resolved.
I don't like flames (much), but souldog's earlier comment about obsolescence keeps bouncing around in my head. I'm not sure how to word this without sounding like I'm attacking someone, but let me try. I have seen similar opposition to using the standard library containers in person. And I have come to believe that much of the opposition is a psychology, a tiredness, a "I don't want to have to learn yet another thing." When I got out of college I was so exhausted with learning that I didn't pick up a technical book for close to 3 months. I know that feeling innately, it is a constant struggle for obsessive people like myself, and I see the exact same symptoms in responses in this thread.
Specifics cases against the standard containers are avoided or incorrect (algorithmic assumption as a case in point) because they are not known, yet.
And the whole point is to keep the task of learning away.
And sometimes a defensiveness creeps in, and there is this explosion of sound and fury, which is supposed to be one of those societal signals to back off and let the issue die out.
And I don't want it to die like that.
I believe that a lot of those who come to codeguru to read the forums want solid answers to the questions they have in their professional life.
I believe that there are actual, scientific metrics on which programming decisions should be made. We don't suggest to a person requesting information on how to make a driver to use Visual Basic. We don't suggest to a project which carries a strong requirement for stability and correctness (life support systems, for example), that the team use a language or language features they are not already intimate in. Sure, some of the metrics can have poor precision, and possibly there might be a problem choosing between several courses of actions do to different suggestions by different metrics. But even that type of case needs to be established, not passed over as an initial assumption.
Some very good metrics I have seen in this thread included component construction and time. Those are great metrics because they are the basis for the industrial revolution. When John Hall built the first rifle factory to use interchangeable parts, the entire point was a realisation of the dreams of Thomas Jefferson and Eli Whitney to get some degree of automation possible in construction to increase production capabilities by decreasing the time for construction of the various phases.
Software design has been undergoing its own industrial revolution. And just as in the physical construction sector, software domains have many concepts of components and various dynamics for connecting and interchanging them.
Look at the evolution of MFC's containers. Originally, they were built in the zeitgeist of object-oriented design. So there were some basic object containers for holding pointers to any class derived from CObject. This allowed for a useful type dynamic where objects of various different concrete types could coexist inside a container and only the algorithmic specification of the container type was exposed. This was the polymorphic method of generic programming. However, it had to be stated outside of the code that such containers were most useful when the actual classes were all derived from a base furthur down the hierarchy, since one often uses such containers in code to loop through a more refined interface, and code could easily litter with casts as one tries to expose the interfaces. Some of that gets cleaned up if you derive from the containers, and MFC even provided some utility derivations for things like their string utility class, but even at this stage you begin to see the coupling involved in acquiring a solution, and the inability for the framework to provide a solution for those who do not desire the CObject interface with its emphasis on application core features like serialisation.
Then we get language support for templates and eventually the MFC engineers start building in support for newer containers that take advantage of the template mechanism. However, to maintain backward support, their core framework is stuck with object relationships built in the object-oriented days. They desire these new containers to have support for the framework built in, so they are placed into the framework hierarchy. And they do not demonstrate some of the newer componentisation techniques made available by templates (I suspect because the engineers were unfamiliar with the techniques), and they also fail to even expose older componentisation techniques like iterators. So they restrict their use cases.
But then you have David Musser and Alex Stepanov working over in Ada on this idea for a generics library. With the help of people like Meng Lee, there was all of this intellectual discussion about library design and abstraction in c++ going on over at HP. With heavy use of the iterator abstraction, factoring out allocation policies, and generic type access, the STL was built concentrating solely on the benefits of abstracting out the ideas of object relationships and algorithms applied to them. These are the fields that complexity theory and combinatorics had already well established a mathematical framework for, and there were widely known efficient algorithms that were useful all over the place. Using templates also had the benefit of allowing the compiler to use the static instantiation to make type-specific optimisations and much of the code necessarily inlined itself due to template visibility issues. The components were clear and customisable. One could add algorithms for entire classes of iterator abstractions. One could root a hierarchy on a container of polymorphic pointers however one desired.
Even though MFC's newer template containers shared some of the benefits that the STL provided, they were still monoliths rooted in a framework hierarchy without an iterator abstraction for generic algorithmics. STL's containers were free objects and algorithms, customisable and transportable. And they're faster, about as fast as a container can get since the modern optimisers use all of the type information available to make specific access calls almost exactly the same as if there were no objects and only naked calls to a free structure (for instance, vector read access is very often found in assembly as a simple pointer add offset and dereference load). In fact, the only suggestions I have ever seen for improving the efficiency of the standard containers and algorithms (which I reiterate are built on mathematical notions well studied for efficiency) has been the proposal for new language semantics for move constructors to keep out the unnecessary copy or two that goes on in the language construction mechanism.
On these metrics, STL happens to have the edge by quite a bit. It is much more useful as transposable components, it is faster both in milliseconds and actual labor-hours for immediate algorithm reuse. And now, these containers are standardised and an integrated part of the legal language we program in. These are all of the metrics that point to the STL being the current leader in this particular niche of the industrial revolution. Other competitors in this field of components, like the Booch c++ Components, also fail the challenge for similar reasons.
I'm not saying that something better can't come along. There are certainly abstractions on object relationships inside a container and some other mechanisms of adaptability which we have discussed in this thread could lead to even better abstraction and reuse potential in the standard containers. Hand rolling such a library is not worth it when you are an actual part of a work force with job task lists, iteration time schedules, and the need to get things done now so your company can grow. It is the type of task that one takes on when that is the goal, as a library designer or secret dabbler in the black arts. Its exactly like elsewhere in today's modern, industrialised society. There are widget manufacturers and factories which consume widgets to build whatchamacallits. If you are on the assembly line of the whatchamacallit factory, pointing out that you can make a better widget will not get you noticed by the foreman unless you can actually produce designs and plans which you can show bring benefit to The Vendor. Consider me, now, a foreman (as well as a worker on the assembly line). I'm not saying the better widget is not possible. I'm only saying that you better have designs and plans available to point out where and how the widget can be improved before you run around telling all the workers on the assembly line to stop what they are doing and waste time building their own better widgets first. If everyone did that, we'd be back to pre-industrialised times with one-pass assembly.
I'm also not saying that building your own containers is not a great learning experience. Obviously it is, and most data structures classes emphasise hands-on learning (take a look at some of the questions asked on these boards). Skills need to improve with time. But there also needs to be a point where some of that skill learning gets focussed on learning the useful existing tools for our times...
Now, I just wanted to finish up by pointing to a problem I see in something said earlier. Just to make it perfectly clear and explicitly stated in this thread, the discussion about system level memory being done in terms of list linking was pure smokescreen (another of the symptoms I mentioned earlier). As someone who has actually programmed a few drivers, I understand quite clearly how the system makes the transformation between virtual and physical addresses. It is all done through the descriptor tables the processor already is being pointed to internally, they are page lists (so we are looking at access only in increments of the page size), they are quickly indexable, and the transformation occurs for all virtual addresses mentioned in a machine instruction. What all this means is that the process of transformation is processor level process that influences not a single bit the order of complexity calculations for an algorithm. Pointer arithmetic is still a simple addition to the processor, it only occurs prior to transformation if an access is required. There is nothing at all about the process of address transformation that suddenly makes the vector guarantees the same as a linked list, or anything freaky like that. I know some of that was already answered, but I wanted to emphasise it for any of the future readers of this thread.
First, thanks so much for your words, they are useful to me. I have learnt a lot from them. True...:)
The underlined sentences in the quote above are what I like most. That's the way it is...I guess.Quote:
Originally posted by galathaea
.......
Now, I just wanted to finish up by pointing to a problem I see in something said earlier. Just to make it perfectly clear and explicitly stated in this thread, the discussion about system level memory being done in terms of list linking was pure smokescreen (another of the symptoms I mentioned earlier). As someone who has actually programmed a few drivers, I understand quite clearly how the system makes the transformation between virtual and physical addresses. It is all done through the descriptor tables the processor already is being pointed to internally, they are page lists (so we are looking at access only in increments of the page size), they are quickly indexable, and the transformation occurs for all virtual addresses mentioned in a machine instruction. What all this means is that the process of transformation is processor level process that influences not a single bit the order of complexity calculations for an algorithm. Pointer arithmetic is still a simple addition to the processor, it only occurs prior to transformation if an access is required. There is nothing at all about the process of address transformation that suddenly makes the vector guarantees the same as a linked list, or anything freaky like that. I know some of that was already answered, but I wanted to emphasise it for any of the future readers of this thread.
This is just a question stated as an answer.Quote:
Originally posted by mclark
Yes, I know about the reserve and swap trick but this doesn't help. (I can provide an example if you really want) but the main point is it is impossible (as far as I can tell) in STL to use std::vector<T> is such a way that memory use is linear to the number of elements (capacity is never greater than high watermark size).
int some_stuff[] = {1,2,3,4,5}
std::vector<int> vect_of_stuff(some_stiff, some_stuff + sizeof(some_stuff)/sizeof(some_stuff[0]));
the capacity of a vector initialized in this way is equal to its size.
Is there a reason why this does not refute the above claim?
Hi galathaea,
Just wanted to comment on a few points.
I have to disagree with this. Many of the people who have held opposition to STL have shown they understand it and have chosen to not use it. I have seen many propose that Darwin (who started this thread) simply doesn't understand STL and that's why he doesn't like it, even though he has shown several times that he does understand it and has rejected it, with several valid points. There is nothing complex about using STL. If you understand templates you can understand it. I feel that Darwin understands templates, and him not liking STL does not mean he just "doesn't get it" as suggested.Quote:
And I have come to believe that much of the opposition is a psychology, a tiredness, a "I don't want to have to learn yet another thing."
"ugly" and "difficult" are two major negative design flaws in any program. These are the "particulars". So I'm not too sure what you mean by "clouded over by a fog" since these are two of the major points in that reference. And again, I believe the main players in this thread understand STL, its purpose and its implementation.Quote:
Design issues are mentioned at several points, yet particulars get clouded over by a fog of "ugly" and "difficult". When particulars do come up, they are responded to quickly, and despite the fact that so far the most of the issues have been ones of understanding how to use the standard containers (not deriving but extending orthogonally), what their guarantees are (algorithmic complexity issues), and the language features they use (templates and namespaces), well...
One of the points proposed in this thread has been that simply, anything that STL can do, you can write yourself. Meaning that STL is just code. It is not built into the construct of the compiler, so there is nothing that it can do, that anything else can't do. So STL doesn't have a edge over anything else, since with a little effort, anything else can do the exact same thing. And since STL runs exception handling (in .NET, not VC6), which has a overhead, if you reproduce the STL code yourself, and omit the exception handling, you already have a performance boost over it. You can then systematically remove unnecessary code or optimize the code, you then have another performance boost over it. I know the first thing said will be that STL exception handling is the best way to handle errors, as its been already proposed. Unfortunately, its an overhead and doesn't provide efficient enough handling of errors. Even MS's GetLastError() is more efficient by simply returning a code to deal with the error with out the overhead. A simple "Only do work when there's a problem" approach that MS API has keeps code running fast when there isn't a problem. I personally think that if there is no error, then I don't want anything to take a hit.Quote:
On these metrics, STL happens to have the edge by quite a bit. It is much more useful as transposable components, it is faster both in milliseconds and actual labor-hours for immediate algorithm reuse
Besides that however, the design flaws (which I'll agree are much better in .NET now that they have dropped the R&D code), such as hard to read and difficult to integrate into a complex environment, make STLs "edge" minimal.
As for the amount of "man-hours" saved, they are only saved if there isn't a problem and only lost if you don't already have your own libraries to use. If you have to debug STL, which in a production environment is done by "debuggers", which usually consist of fresh college grads who start there to gain experience, then the "man hours" are lost to inexperience in the cycle. Which is a negative to STL in production.
Since you are referring to what I said, I think you misunderstood. But since you feel that I don't understand, lets analyze what you said. Please understand that I wish to only discuss what you said, not "flame" you. I know I come off a little abrupt to some, but understand that its not my intent.Quote:
the discussion about system level memory being done in terms of list linking was pure smokescreen (another of the symptoms I mentioned earlier)
Quote:
the discussion about system level memory being done in terms of list linking was pure smokescreen
It does bother me a little that you call the claim that the system is a linked list configuration a smoke screen, then use the term page lists to describe it. So lets take a look at the structure that the system uses. Forgive any missing variables, but I'm running off my own memory since I don't have any documentation with me and its been a while since I've had to deal with it. MSDN has some articles on VM and how the system handles it.Quote:
It is all done through the descriptor tables the processor already is being pointed to internally, they are page lists (so we are looking at access only in increments of the page size), they are quickly indexable, and the transformation occurs for all virtual addresses mentioned in a machine instruction.
I will provide this link which I hope will help illustrate, just remembered this is out here, I'm sure there are better:
http://www.windowsitlibrary.com/Content/356/04/1.html
This talks about how memory works in NT systems, the theory has evolved since NT in 2000 and XP, however, the principle is more or less the same.
Here is a very basic structure.
struct memoryPage{
void* data;
UINT size;
memoryPage* next;
};
Since the "next" points to the next page, this is a linked-list obviously. This is why windows has a linked-list implemented MMU or memory management unit. This is what I was talking about. I understand I confused several people with what I said, but I rush a little too much when I write and that's why I provided some links to help illustrate the point to clarify. But as you mentioned, people start getting defensive, and as a result information gets twisted.
What this boiled down to as a point is, that the way the system handles your memory does not mean that "contiguous" is any faster then a user-managed linked list. It depends on the page size and the efficiency of the code to manage it, which again is demonstrated in my original articles. This is why I ask you to try both. Create a pool of memory and manage it yourself and then try the system an benchmark the results. Granted I still need to do the work I promised and post it, but Thanksgiving is in the way, but when I've got access to my comp, I will do as promised.
Again, I think we are talking about different things. The system attempts to store data in physical memory in such a way that the memory can be easily swapped between disk and physical, yes. Data is stored in several complex tables which do several things, two of which would be to protect system and application integrity, and provide efficient retrieval. However, you can't have super fast retrieval and safety. Safety will have overhead as there are system memory tables and application memory tables to ensure that you can't interfere or crash anything but your own program. The system must traverse these tables anytime you wish to reference your memory. So if you do the work of the system in your own program, then you could remove the hit the system would normally generate. Since you have programmed drivers, you can see how this is very possible.Quote:
it only occurs prior to transformation if an access is required. There is nothing at all about the process of address transformation that suddenly makes the vector guarantees the same as a linked list, or anything freaky like that.
So to sum up, a question of system performance came in to play and an explanation of how the system worked was necessary to quell the belief that you are guaranteed that any memory you get is in any way "contiguous". I hope this clarifies that point.
Quote:
Originally posted by souldog
This is just a question stated as an answer.
int some_stuff[] = {1,2,3,4,5}
std::vector<int> vect_of_stuff(some_stiff, some_stuff + sizeof(some_stuff)/sizeof(some_stuff[0]));
the capacity of a vector initialized in this way is equal to its size.
Is there a reason why this does not refute the above claim?
I do not think this refutes my claim.
The standard guarantees that the invariant of your construct (calling a vector's range constructor) is that the size is equal to the number of elements within the range (provided the range is valid).
so size = 5 is guaranteed but capacity = 5 is not.
This holds for the key-value constructor, copy constructor, resize member function, erase member functions and clear member function. None have a guarantee for capacity.
std::vector<int> Example(0);
Example.empty() == true; Example.size() == 0; Example.capacity() == ?
The only vector member function that I think might have a hard guarantee on reducing capacity (reserve obviously makes one on increasing it) is swap. But swap cannot be performed by iterator - and hence not with a normal array (where you control the size).
I thankfully concede that this is a 'standards' question. I know of no STL implementation so brain-dead that the constructors discussed can't be used to control size (but I bet they exist out there somewhere right now - and they're probably still 'standard' conforming)!