Hi,
I want to know if using STL vectors slows the execution of program. I have used a lot of vectors in my code and its taking around 400 seconds which the same code written in Lisp took 10 seconds. Are there any general aspects to be taken care of which can improve the speed.
Thanks.
Srinivas.
Alexey B
July 22nd, 2002, 05:55 PM
Using iterators instead of subscripts can increase the speed a great deal.
stober
July 22nd, 2002, 07:08 PM
I don't know diddly squat about Lisp so I can't compare it. How does Lisp do it?
A great deal depends on what you are doing with all those vectors. Can you substitute simple arrays of char** instead? such as:
char* vector1[255];
char* vector2[255];
The above may be a good solution to only the simplest cases.
Another possible alternative is to use some kind of linked list class that does not have all the extra baggage of the vector template.
jfaust
July 22nd, 2002, 07:40 PM
Srinivas,
By all common sense and by everything I've heard, C++ should be faster than Lisp. There's probably something you are doing incorrectly. If you post an example of how you are using it, or all the code if it's not too long, I'm sure somebody will be willing to give a few pointers, pun not intended.
But actually, if this is slowing the program down, you may need to go to C arrays to get the speed you need. By doing this, you sacrifice maintainability for the purpose of speed. This is an expensive move even if it gets the results you seek, and should only be done after careful consideration.
There are also profilers available that can tell you where the application is spending its time.
What's the application? Are you accessing the data for use in numeric calculations?
Jeff
Gabriel Fleseriu
July 23rd, 2002, 07:43 AM
Jeff: If Srinivas' measurements are correct, we're talking about a 40 times faster execution of Lisp on C++. Switching to C is not an option, IMO - he has to check the algorithm he uses in C++ first, and to profile the C++ code to see where the execution time is going.
IMO, switching to C would be a mistake, like that made by many programmers who want to speed up a poorly designed algorithm by switching to assembler.
Aside (or maybe not):
I remember running into a speed problem with std::vector. I had to parse a memory block who's representation was astd::vector<std::vector<int> >The problem was that you couldn't tell the size of each inner vector until you had parsed the block (The size could be as less as 3 or as much as 16'000). So I first used push_back() w/o reserving any spce for the vectors, and let them do the job with allocating the memory. That lead into a chaos of memory blocks being copied and re-copied, making the function painfully slow. As I reserved a (over the thumb) value of 10 integers for the inner vector (seemed to be a good mean value), and ran the same test harness, I got a speed-up of more than 100x!! std::vector managing the memory for you is in most cases *a good thing*, but that does not mean that you should turn off your brain, as I did :)
This article (http://www.gotw.ca/gotw/074.htm) made a few things clear to me. I think it's worth reading.
jfaust
July 23rd, 2002, 08:54 AM
Gabriel,
Yes and yes--I was saying basically the same thing and tried to stress the move to C arrays should be a last resort. However, even a 100x increase will still cause this to take 4 seconds. Depending on where this is happening, it's still a long time. As you move away from the generality of STL, you do gain some more options for optimization. The trade-off for this is maintainability. Maintainability is almost always a more important consideration than performance, and this trade-off should not be made lightly.
More information on the application would be helpful in determining a good solution.
Jeff
AnthonyMai
July 23rd, 2002, 10:19 AM
When you are talking about whether the result comes out on the screen in 0.01 seconds or 0.1 seconds, performance is not an issue and maintainability is more important.
But when you are talking about something that takes 400 seconds to get the result, maintainability is not an issue. And even performance is not an issue. Because you do not have something that has any usability value at all to maintain or to improve performance. Your first job is create something that can be used at all.
It is possible that by improving your algorithm, you should be able to gain significantly in speed, even if you still use STL. But the problem with STL is there are so much going on underneath within STL, that a seemingly simple code line will easily kill you in performance and you never know why.
Among other things, objects are frequent back and over again and again within STL, most of times it is totally un-necesary object copying. This along could easily kill you in performance if unfortunately the objects you deal with is not trivial to copy.
I encourage people to look into the source code of STL and judge by themselves whether they want to use piece of code like that, or they are allowed to write code in that kind of style, after they have seen the STL source code for themselves. Lack of knowledge of STL internals is the most frequent cause of mis-use of STL by programmers.
jfaust
July 23rd, 2002, 10:44 AM
I think we're in agreement when it comes to performance and maintainability, but I'd like to clarify. Maintainability should be the primary focus until there is demonsratable and specific performance problem. Then that problem should be examined and fixed appropriately, and with eyes wide open about what you are giving up in order to improve performance.
Now as for usability, it depends where the time is being used. Some operations are necessarily lengthy. It all depends on the expectations of the user. Simply navigating around an interface should be quick. But if you are setting up a simulation that attempts to reproduce physical phenomena, it could take minutes, hours, or days.
Jeff
AnthonyMai
July 23rd, 2002, 03:42 PM
Now as for usability, it depends where the time is being used. Some operations are necessarily lengthy. It all depends on the expectations of the user. Simply navigating around an interface should be quick. But if you are setting up a simulation that attempts to reproduce physical phenomena, it could take minutes, hours, or days.
I believe usability stops and performance issues starts at where human user starts to wait for the machine to produce result. The reason behind it is human time is much much more precious than machine time.
The cut off line is whether the user has to wait or not. This cut off line is true for things as simple as GUI or as complicated as massive scientific calculations.
In the first case if the time it takes the machine to produce a visual output is much longer than the time it takes the user to percept what he saw on the screen and response, you have a usability issue.
In the second case, if the time it takes for a computer to produce result of scientific calculation is longer (one year) than the time for the researcher to analysis the result and write a research paper and send it out for publishing (one month), then there is a big usability issue.
I don't care what you are trying to calculate, if the profession has to spend more time waiting for the machine to produce result, than the time it takes the profession to think about the research and write research papers, then the professor's time would be much better spent on applying for more fund to buy faster super computer, or on trying to develop a better algorithm that reduces calculation time, rather than just sit and wait for the machine.
jfaust
July 23rd, 2002, 03:47 PM
The point is that when "talking about something that takes 400 seconds", you need to take into account what that something is.
Jeff
skulkarni
July 23rd, 2002, 05:04 PM
thanks to all,
well my question has given rise to nice arguments. I am using the same algorithm and the same logic that has been used in LISP. Theroetically c++ is faster than LISP. I have even tried making changes as suggested by Gabriel, i.e allocating approximate amount of memory in the beginning and thus avoiding push_back() (which i have used at many places) but still the improvement is not significant.now its taking around 350 seconds. well i am attaching one of the major files.
skulkarni
July 23rd, 2002, 05:20 PM
well here is the file
skulkarni
July 23rd, 2002, 05:21 PM
and another major file
Paul McKenzie
July 23rd, 2002, 05:28 PM
As someone else pointed out, what you need to do is to use a code profiler and profile your code. If you don't profile your code, you could be on a wild goose chase, thinking that "this is the part of the code that's slow", and waste time trying to optimize something that doesn't need optimization. A profiler will quicky tell you which functions, lines, etc. that are taking up the most time.
Also, you should attach the entire program source, not just one file, along with an int main() stub that demonstrates the problem.
Regards,
Paul McKenzie
Paul McKenzie
July 23rd, 2002, 05:40 PM
I took a quick glance at one of the files. I didn't get a chance to compile it, but I think that your problem is that you don't ever use references when you are passing your vectors around. You are passing vectors by value instead of by reference, making the compiler create unneccessary copies of the vector.
You are passing vector<list_values> by value, meaning the compiler makes an entire copy of the vector. This may be what is taking up a lot of the time. Unless this is your intention, to pass a vector by value, then try this instead:
Also post the .h files. The goal is to have it so that any one else here can compile and run your program.
Regards,
Paul McKenzie
Gabriel Fleseriu
July 24th, 2002, 04:23 AM
Originally posted by AnthonyMai
<snip>
I encourage people to look into the source code of STL and judge by themselves whether they want to use piece of code like that, or they are allowed to write code in that kind of style, after they have seen the STL source code for themselves. Lack of knowledge of STL internals is the most frequent cause of mis-use of STL by programmers.
Yes, it certainly would be a great gain for every C++ programmer to fully understand the way STL is implemented and works. Looking into the implementation of STL can teach you many things, but is a complicated task. I took a few hours to roam thru the source code of STLPort and the main achievement of that was realizing the fact that I'd need to spend much more time in order to get a grip on the implementation.
If reading and understanding the STL source code would be a demand prior beginning to use STL, many programmers, especially the beginners, would never come to actually use STL, being frightened away by the complexity of its implementation.
Also, we should think about the concept of separating the interface of a class (library) from its implementation. I'm not referring to the code organization here, but to the concept itself. I think you can use STL at a decent level by understanding only its interface and documentation (widely speaking: books, examples, articles). I think that the lack of understanding the interface and the concepts of STL is "the most frequent cause of mis-use of STL by programmers".
stober
July 24th, 2002, 07:08 AM
I have been writing C/C++ for over 30 years and I don't like STL at all because of its complexity -- I just don't understand it! I have used vector and string on rare occasions, but other than that I stay clear of all STL. I know STL is an attempt to make type-safe classes -- but why should I put myself through all that pain when non-STL classes and structures will do the job just as well -- and with a lot less effort on my part? I don't write code or libraries for other people to use, so the type-safe concept is usless to me.
Graham
July 24th, 2002, 07:24 AM
In contrast to that, I've been using STL in earnest for a couple of years now, and on those occasions when I can't use it (leagcy code that would break), I find myself yearning for its simplicity.
Paul McKenzie
July 24th, 2002, 09:00 AM
Originally posted by Graham
In contrast to that, I've been using STL in earnest for a couple of years now, and on those occasions when I can't use it (leagcy code that would break), I find myself yearning for its simplicity. Just the algorithms alone are worth it. And as far as it being too difficult, all it takes is to read example programs. If you can use vector or string, it's not much more in using a list or a deque.
If you want complexity, try "Modern C++ Design" by Andrei Alexandrescu.
Regards,
Paul McKenzie
Graham
July 24th, 2002, 09:24 AM
Originally posted by Paul McKenzie
If you want complexity, try "Modern C++ Design" by Andrei Alexandrescu.
Regards,
Paul McKenzie
Now that did blow my mind! (Good fun, though.)
AnthonyMai
July 24th, 2002, 10:36 AM
If all one wants is to produce some code that works correctly at all, STL would be a good and simple solution, and one doesn't need to look at the terrifying internal of STL. All you need is STL documentation and STL books. I guess that's what Paul et al has been promoting all the time. It is easier to produce correct code using STL as long as you comply witht he standard. I have no dispute of that.
In real life, code correct is far from good enough. A programmer who is satified with merely writting code that produce correct result is an unqualified programmer in today's IT job market. And, surprise, programmers ARE allowed to produce incorrect code now and then, as long as you know how to detect and fix bugs.
A piece of sorting code that produce the corret result maybe totally useless, if it takes days or weeks to sort something that could have been sorted in seconds.
To produce code that is more than merely correct, it is inevitable that one needs to know the internal of STL. Because, unfortunately the most performance rellevant piece of code are those inner loops and implementation detals. STL bites you really bad in performance if you do not know how it works underneath.
dude_1967
July 24th, 2002, 10:55 AM
skulkarni,
Implementations of LISP and FORTRAN77 use call-by-reference in order to pass input parameters to subroutines. A direct translation into C++ must be more carefully undertaken since your routines have large object-type input parameters.
This topic was discussed by Paul McKenzie. You must familiarize yourself with the calling-convention of C/C++, which used call-by-value unless otherwise specified by using an object reference or memory pointer.
Do not reject the STL on the basis of speed. In addition, if you are serious about writing speed-optimized programs, then try to familiarize yourself with the STL-implementation at hand. Reading the source-headers can be quite cumbersome. Sometimes an empirical analysis (code-profiling) can be of great insight if you find yourself repeatedly using various parts of the STL.
Regards, Chris.
:)
Paul McKenzie
July 24th, 2002, 11:21 AM
skulkarni is passing an object by value. Before STL came along, competent C++ programmers know that passing objects by value will cause temporary objects to be created. Joe Blow's brand-x class can have penalties if passed by value. This has absolutely nothing to do with the speed of STL, but the implementation of proper C++ coding when it comes to passing parameters.
Regards,
Paul McKenzie
AnthonyMai
July 24th, 2002, 01:12 PM
Before STL came along, competent C++ programmers know that passing objects by value will cause temporary objects to be created
That is absolutely right. However the fact is for some reason the author of STL decided to pass objects by value (especially it is UNKNOWN objects, since we are talking about templates) around every where within the STL, causing temporary objects to be created every where.
The point is the user of STL really don't have a choice when passing parameter to STL functions. Take std::sort() for example, the third parameter is a function object passed in by value.
You can't change that. You can't pass it in by reference when the template function tells you to pass object in by value. The only thing you can change is try to make your object as simple as possible. And when you fail to do that, Just like Paul's original "index sorting using std::sort()" sample code. The performance becomes so poor that a few seconds of sorting now takes days or weeks.
Paul McKenzie
July 24th, 2002, 01:26 PM
Originally posted by AnthonyMai
That is absolutely right. However the fact is for some reason the author of STL decided to pass objects by value (especially it is UNKNOWN objects, since we are talking about templates) around every where within the STL, causing temporary objects to be created every where...Umm...What does this have to do with the OP problem?
Regards,
Paul McKenzie
dude_1967
July 24th, 2002, 02:04 PM
The original post showed a clear mystery about run-time characteristics. Several thread-elements later the sample codes containing the translated algorithms were provided. It is clear from the posted codes that the subroutines force the compiler to pass large objects by value. Here one is talking about objects in memory which are bigger than the size of a few CPU-registers.
Paul has described what the compiler must do in order to achieve this: The compiler will create new objects (the vectors) for the subroutine call, dilligently copy the original passed vector into the newly created vector and feed the new one into the subroutine.
skulkarni:
The run-time characteristics of the translated algorithm are probably governed by the creation and copying of temporary objects passed to and possibly returned from subroutines.
There is no STL-issue here.
Redesign the C++ algorithms and make proper use of C++ references.
Sincerely,
Chris
:)
Bengi
July 24th, 2002, 05:13 PM
Optimize ur C linker,
and use inline ASM ;) ( _asm{} / asm{} )
:D
Philip Nicoletti
July 25th, 2002, 06:19 AM
What compiler and optimization level are you using ?
For example, with the version of g++ that I am using, the
default optimization level is "-O0". In some cases
compiling at "-O2" increased the speed by a factor of 10.
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.