-
March 17th, 2004, 05:33 PM
#1
std::vector vs. C-style arrays
I hope that the stl gurus can help me with this...
Consider the following: In my current project, a team member implemented code such as the following:
Code:
int* m_timeArray;
string* m_nameArray;
double* m_valueArray;
// Dynamically allocate space for the arrays here...
for(int i=0; i < numElements; i++)
{
m_timeArray[i] = someTime;
m_nameArray[i] = someName;
m_valueArray[i] = someValue;
}
As these C-style arrays are actually representing an array of objects (each consisting of a time, a name and a value, in this simplified example), I suggested to - create a class which has time, name and value as members, and
- use a std::vector instead of a C array.
Something along the lines of:
Code:
class Element
{
public:
Element(int time, const char* pName, double value);
virtual ~Element();
private:
int m_time;
string m_name;
double m_value;
};
vector<Element> elements;
Now this is a very performance critical part of the code, and the other team member (which is a hardcore C-programmer and performance fetishist anyway, who mistrusts stl and C++ in general) argued that it wouldn't be possible with this design to add elements to the array with the same efficiency. More specifically, code like
Code:
elements.push_back(Element(someTime, someName, someValue));
will of course create a temporary Element object and invoke Element's default copy constructor, which is obviously more costly than the C code. However, I made two bold statements saying that- when switching from a C array to std::vector, very little changes will be necessary to the existing code (which is not true, as I can't add elements to the array with something like elements[i] = ..., but I will have to use push_back - right?)
- That it is possible to use a std::vector of elements with the same efficiency as using multiple C arrays (like in the original code).
Here I'm stuck - it seems that I can't add elements to the vector without creating a temporary object, and therefore effectively copying the data twice. Any ideas?
-
March 17th, 2004, 05:43 PM
#2
Well...I am a little bit in a hurry here...so I will not go into all of the details now (pretty sure until I am back there have been others already providing more information).
In general, using any of the STL classes does not mean necessarely that the resulting code will perform slower than the old one. And even with 'vector' it is most of the time the other way round. One common mistake is to actually compare debug versions instead of release ones.
Furthermore, if e.g. a 'vector' performs much slower in a release build, then it is most-likely due to the fact, that the implementation (how the 'vector' is being used not the 'vector' implementation itself) is 'bad'. For example in your case, you should already reserve enough memory (-> 'numElements') for the 'vector' up-front, so that no additional memory allocation is necessary while doing a 'push_back()'.
For additional information you might want to take a look at the following introduction to the 'vector' class...
-
March 17th, 2004, 05:44 PM
#3
-
March 17th, 2004, 05:59 PM
#4
Re: std::vector vs. C-style arrays
Originally posted by etaoin
Now this is a very performance critical part of the code, and the other team member (which is a hardcore C-programmer and performance fetishist anyway, who mistrusts stl and C++ in general) argued that it wouldn't be possible with this design to add elements to the array with the same efficiency. More specifically, code like
Code:
elements.push_back(Element(someTime, someName, someValue));
will of course create a temporary Element object and invoke Element's default copy constructor, which is obviously more costly than the C code.
Your friend will be surprised as to which is faster.
However, I made two bold statements saying that[list=1][*]when switching from a C array to std::vector, very little changes will be necessary to the existing code (which is not true, as I can't add elements to the array with something like elements[i] = ..., but I will have to use push_back - right?)[*]
You can size the vector before adding any elements, and then use operator [] to fill the vector, or you can use push_back.
That it is possible to use a std::vector of elements with the same efficiency as using multiple C arrays (like in the original code).
Yes, if not more efficient. If this isn't the case, your implementation is bad or your compiler does not aggressively do optimizations.
One trick you can show your friend -- the reserve() function preallocates memory up front, so that vector doesn't have to do allocations each time you call push_back(). So if your friend is showing you code that calls push_back in a loop 10,000 times and says "ha ha, you lose", just add a call to reserve() before you call the loop to the push_backs, and watch them scramble to explain why all of a sudden, the vector has caught up, if not beaten their code.
As far as copying the data, isn't that what is being done in your friend's code when he invokes operator =?
Code:
for(int i=0; i < numElements; i++)
{
m_timeArray[i] = someTime;
//...
}
You are making a copy of someTime and storing it in m_timeArray. If not, then your friend is storing pointers, and is not comparing apples with apples. For a fair comparison, the vector should also store pointers.
Regards,
Paul McKenzie
-
March 17th, 2004, 06:02 PM
#5
Originally posted by Andreas Masur
In general, using any of the STL classes does not mean necessarely that the resulting code will perform slower than the old one. And even with 'vector' it is most of the time the other way round. One common mistake is to actually compare debug versions instead of release ones.
Furthermore, if e.g. a 'vector' performs much slower in a release build, then it is most-likely due to the fact, that the implementation (how the 'vector' is being used not the 'vector' implementation itself) is 'bad'. For example in your case, you should already reserve enough memory (-> 'numElements') for the 'vector' up-front, so that no additional memory allocation is necessary while doing a 'push_back()'.
Andreas, thanks for your reply.
I agree with what you say, and I already knew that. Maybe I didn't make my point clear enough, but my problem is not about the performance of a dynamically growing vector - it is obvious that adding elements beyond the current size of the vector will lead to a reallocation (which would be necessary with a C-array anyway).
Even when we reduce the problem to a vector of a pre-determined size - my question is, how can I initialize the elements without invoking the copy constructor and creating a temporary element? With a C array, we can do the following (assuming, for simplicity, that the members of Element were public):
Code:
Element* pElements = new Element[NUM_ELEMENTS];
for(int i = 0; i < NUM_ELEMENTS; i++)
{
pElements[i].m_time = someTime;
pElements[i].m_name = someName;
pElements[i].m_value = someValue;
}
This will simply assign the values once to each element in the array, leading to no overhead as compared to the original code. But how can I do the same thing properly with a std::vector?
Last edited by etaoin; March 17th, 2004 at 06:04 PM.
-
March 17th, 2004, 06:07 PM
#6
Re: Re: std::vector vs. C-style arrays
Originally posted by Paul McKenzie
One trick you can show your friend -- the reserve() function preallocates memory up front, so that vector doesn't have to do allocations each time you call push_back().
Paul, thanks for your ideas too. However, as said in my reply to Andreas' post, the problem is not the dynamically growing array - I know about reserve() and how to use it. It is actually about how to initialize the vector with elements without creating temporary objects.
-
March 17th, 2004, 08:27 PM
#7
I'm not positive, but I think that you are implying that using
"Element* pElements = new Element[NUM_ELEMENTS];"
doesn't invoke the constructor. I think that it does. See
the following example :
Code:
#include <iostream>
#include <vector>
struct Element
{
Element() { std::cout << "in Element() constructor\n"; }
Element(const Element & ele) { std::cout << "in Element() copy-constructor\n"; };
// other stuff
};
int main()
{
std::cout << "create using C-style ...\n\n";
Element * arrElement = new Element[5];
std::cout << "\n\n\ncreate using vector ... \n\n";
std::vector<Element> vElement(5);
//
delete [] arrElement;
return 0;
}
You might try timing. Here is sample code (timer from YvesM)
Comment out the appropriate section in main ...
Code:
#include <windows.h>
#include <iostream>
#include <vector>
class CPrecisionTimer
{
public:
double Stop() {
LARGE_INTEGER curval;
QueryPerformanceCounter(&curval);
long double f = ToDouble(m_start);
long double f1 = ToDouble(curval);
long double f3 = ToDouble(m_freq);
return (f1 - f)/f3;
}
long double ToDouble(LARGE_INTEGER& val) {
long double f = val.u.HighPart;
f = f * (DWORD)-1;//4294967296;
f += val.u.LowPart;
return f;
}
void Start() {
QueryPerformanceCounter(&m_start);
}
CPrecisionTimer() {
QueryPerformanceFrequency(&m_freq);
}
private:
LARGE_INTEGER m_start;
LARGE_INTEGER m_freq;
};
using namespace std;
struct Element
{
Element() { i = 3; }
Element(const Element & ele) { i =3; };
// other stuff
int i;
};
const int N = 10000000;
int main()
{
CPrecisionTimer ct;
/*
std::cout << "create using C-style ...\n\n";
ct.Start();
Element * arrElement = new Element[N];
cout << "time creating C-style array = " << ct.Stop() << endl;
// just to make sure the compiler doesn't optimize everything away
for (int i=0; i<N; ++i)
if ( arrElement[i].i == 222) cout << "something\n";
delete [] arrElement;
*/
/*
std::cout << "create using vector ...\n\n";
ct.Start();
std::vector<Element> vElement(N);
cout << "time creating vector = " << ct.Stop() << endl;
// just to make sure the compiler doesn't optimize everything away
for (int i=0; i<N; ++i)
if ( vElement[i].i == 222) cout << "something\n";
*/
return 0;
}
-
March 17th, 2004, 08:28 PM
#8
From my understanding, as long as you vector::reserve(), the overhead for vector::push_back() is the same, if not faster, as assigning value into an array. When we vector::push_back(), the copy ctor is invoked whereas the operator=() is used for assignment in array. Since you are storing value in vector and array, I don't see any reason why temporary object is created. You can prove this by providing your copy ctor and operator=() in your class and track the number of time copy ctor and operator=() is invoked for each case. In fact, for the same no. of vector::push() and assignment in array, the no. of call to copy ctor is exactly the same as that of operator().
-
March 17th, 2004, 11:19 PM
#9
Originally posted by etaoin
With a C array, we can do the following (assuming, for simplicity, that the members of Element were public):
Code:
Element* pElements = new Element[NUM_ELEMENTS];
This calls the default constructor for NUM_ELEMENTS Element objects.
Code:
for(int i = 0; i < NUM_ELEMENTS; i++)
{
pElements[i].m_time = someTime;
pElements[i].m_name = someName;
pElements[i].m_value = someValue;
}
This will simply assign the values once to each element in the array, leading to no overhead as compared to the original code.
The operator = is invoked for the m_time, m_name, and m_value types. It still is making a copy (actually an assignment, but in effect, a copy). As Philip shows, the operator = is not so benign in C++ as it is in C.
Here is an updated version that shows that assignment is done:
Code:
#include <iostream>
#include <vector>
struct Element
{
Element() { std::cout << "in Element() constructor\n"; }
Element(const Element & ele) { std::cout << "in Element() copy-constructor\n"; };
Element& operator =(const Element& ele)
{
std::cout << "in assignment" << std::endl;
return *this;
}
// other stuff
};
int main()
{
std::cout << "create using C-style ...\n\n";
Element * arrElement = new Element[5];
Element el;
int i;
for ( i = 0; i < 5; ++i )
arrElement[i] = el;
std::cout << "\n\n\ncreate using vector ... \n\n";
std::vector<Element> vElement(5);
for ( i = 0; i < 5; ++i )
vElement[i] = el;
//
delete [] arrElement;
return 0;
}
Output:
create using C-style ...
in Element() constructor
in Element() constructor
in Element() constructor
in Element() constructor
in Element() constructor
in Element() constructor
in assignment
in assignment
in assignment
in assignment
in assignment
create using vector ...
in Element() constructor
in Element() copy-constructor
in Element() copy-constructor
in Element() copy-constructor
in Element() copy-constructor
in Element() copy-constructor
in assignment
in assignment
in assignment
in assignment
in assignment
No difference in the number of times constructors and assignements are called.
Regards,
Paul McKenzie
-
March 17th, 2004, 11:27 PM
#10
With Yves code, if you create the vector first, it beats the array.
It would be better if there were two seperate programs, one for vector and one for the array, so that the heap state doesn't factor into the timings.
Regards,
Paul McKenzie
-
March 18th, 2004, 11:29 PM
#11
Originally posted by etaoin
Andreas, thanks for your reply.
Even when we reduce the problem to a vector of a pre-determined size - my question is, how can I initialize the elements without invoking the copy constructor and creating a temporary element? With a C array, we can do the following (assuming, for simplicity, that the members of Element were public):
Code:
Element* pElements = new Element[NUM_ELEMENTS];
for(int i = 0; i < NUM_ELEMENTS; i++)
{
pElements[i].m_time = someTime;
pElements[i].m_name = someName;
pElements[i].m_value = someValue;
}
This will simply assign the values once to each element in the array, leading to no overhead as compared to the original code. But how can I do the same thing properly with a std::vector?
You can do this with a vector object just as easy.
Example:
PHP Code:
std::vector<Element> pElements(NUM_ELEMENTS);
for(int i = 0; i < NUM_ELEMENTS; ++i)
{
pElements[i].m_time = someTime;
pElements[i].m_name = someName;
pElements[i].m_value = someValue;
}
Using the same method, you can access the same members and assign the values just as you would with the C-Style array.
Moreover, if you use iterators, you can make this same code faster using the vector method.
PHP Code:
std::vector<Element> pElements(NUM_ELEMENTS);
for(std::vector<Element>::iterator i = pElements.begin(), e = pElements.end();
i!=e;++i)
{
i->m_time = someTime;
i->m_name = someName;
i->m_value = someValue;
}
The above version using iterators will have a much better performance then you're original code using a C-Style array.
Vector iterators have a faster access time then does index access to a C-Style array.
Almost anything you can do with a C-Style array, you can do with a vector.
-
March 19th, 2004, 09:14 AM
#12
Since etaoin hasn't responded, I think we've given enough information (more likely, ammunition to tell the co-workers that they don't know what they're talking about).
Regards,
Paul McKenzie
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|