std::vector vs. C-style arrays
CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 12 of 12

Thread: std::vector vs. C-style arrays

  1. #1
    Join Date
    Dec 2003
    Posts
    23

    Question std::vector vs. C-style arrays

    I hope that the stl gurus can help me with this...

    Consider the following: In my current project, a team member implemented code such as the following:
    Code:
    int*    m_timeArray;
    string* m_nameArray;
    double* m_valueArray;
    
    // Dynamically allocate space for the arrays here...
    
    for(int i=0; i < numElements; i++)
    {
      m_timeArray[i] = someTime;
      m_nameArray[i] = someName;
      m_valueArray[i] = someValue;
    }
    As these C-style arrays are actually representing an array of objects (each consisting of a time, a name and a value, in this simplified example), I suggested to
    • create a class which has time, name and value as members, and
    • use a std::vector instead of a C array.

    Something along the lines of:
    Code:
    class Element
    {
    public:
      Element(int time, const char* pName, double value);
      virtual ~Element();
    
    private:
      int    m_time;
      string m_name;
      double m_value;
    };
    
    vector<Element> elements;
    Now this is a very performance critical part of the code, and the other team member (which is a hardcore C-programmer and performance fetishist anyway, who mistrusts stl and C++ in general) argued that it wouldn't be possible with this design to add elements to the array with the same efficiency. More specifically, code like
    Code:
    elements.push_back(Element(someTime, someName, someValue));
    will of course create a temporary Element object and invoke Element's default copy constructor, which is obviously more costly than the C code. However, I made two bold statements saying that
    1. when switching from a C array to std::vector, very little changes will be necessary to the existing code (which is not true, as I can't add elements to the array with something like elements[i] = ..., but I will have to use push_back - right?)
    2. That it is possible to use a std::vector of elements with the same efficiency as using multiple C arrays (like in the original code).

    Here I'm stuck - it seems that I can't add elements to the vector without creating a temporary object, and therefore effectively copying the data twice. Any ideas?

  2. #2
    Join Date
    May 2000
    Location
    KY, USA
    Posts
    18,652
    Well...I am a little bit in a hurry here...so I will not go into all of the details now (pretty sure until I am back there have been others already providing more information).

    In general, using any of the STL classes does not mean necessarely that the resulting code will perform slower than the old one. And even with 'vector' it is most of the time the other way round. One common mistake is to actually compare debug versions instead of release ones.

    Furthermore, if e.g. a 'vector' performs much slower in a release build, then it is most-likely due to the fact, that the implementation (how the 'vector' is being used not the 'vector' implementation itself) is 'bad'. For example in your case, you should already reserve enough memory (-> 'numElements') for the 'vector' up-front, so that no additional memory allocation is necessary while doing a 'push_back()'.

    For additional information you might want to take a look at the following introduction to the 'vector' class...
    Ciao, Andreas

    "Software is like sex, it's better when it's free." - Linus Torvalds


    Article(s): Allocators (STL) Function Objects (STL)

  3. #3
    Join Date
    May 2000
    Location
    KY, USA
    Posts
    18,652
    [Moved thread]
    Ciao, Andreas

    "Software is like sex, it's better when it's free." - Linus Torvalds


    Article(s): Allocators (STL) Function Objects (STL)

  4. #4
    Join Date
    Apr 1999
    Posts
    27,434

    Re: std::vector vs. C-style arrays

    Originally posted by etaoin
    Now this is a very performance critical part of the code, and the other team member (which is a hardcore C-programmer and performance fetishist anyway, who mistrusts stl and C++ in general) argued that it wouldn't be possible with this design to add elements to the array with the same efficiency. More specifically, code like
    Code:
    elements.push_back(Element(someTime, someName, someValue));
    will of course create a temporary Element object and invoke Element's default copy constructor, which is obviously more costly than the C code.
    Your friend will be surprised as to which is faster.
    However, I made two bold statements saying that[list=1][*]when switching from a C array to std::vector, very little changes will be necessary to the existing code (which is not true, as I can't add elements to the array with something like elements[i] = ..., but I will have to use push_back - right?)[*]
    You can size the vector before adding any elements, and then use operator [] to fill the vector, or you can use push_back.
    That it is possible to use a std::vector of elements with the same efficiency as using multiple C arrays (like in the original code).
    Yes, if not more efficient. If this isn't the case, your implementation is bad or your compiler does not aggressively do optimizations.

    One trick you can show your friend -- the reserve() function preallocates memory up front, so that vector doesn't have to do allocations each time you call push_back(). So if your friend is showing you code that calls push_back in a loop 10,000 times and says "ha ha, you lose", just add a call to reserve() before you call the loop to the push_backs, and watch them scramble to explain why all of a sudden, the vector has caught up, if not beaten their code.

    As far as copying the data, isn't that what is being done in your friend's code when he invokes operator =?
    Code:
    for(int i=0; i < numElements; i++)
    {
      m_timeArray[i] = someTime;
      //...
    }
    You are making a copy of someTime and storing it in m_timeArray. If not, then your friend is storing pointers, and is not comparing apples with apples. For a fair comparison, the vector should also store pointers.

    Regards,

    Paul McKenzie

  5. #5
    Join Date
    Dec 2003
    Posts
    23
    Originally posted by Andreas Masur
    In general, using any of the STL classes does not mean necessarely that the resulting code will perform slower than the old one. And even with 'vector' it is most of the time the other way round. One common mistake is to actually compare debug versions instead of release ones.

    Furthermore, if e.g. a 'vector' performs much slower in a release build, then it is most-likely due to the fact, that the implementation (how the 'vector' is being used not the 'vector' implementation itself) is 'bad'. For example in your case, you should already reserve enough memory (-> 'numElements') for the 'vector' up-front, so that no additional memory allocation is necessary while doing a 'push_back()'.
    Andreas, thanks for your reply.

    I agree with what you say, and I already knew that. Maybe I didn't make my point clear enough, but my problem is not about the performance of a dynamically growing vector - it is obvious that adding elements beyond the current size of the vector will lead to a reallocation (which would be necessary with a C-array anyway).

    Even when we reduce the problem to a vector of a pre-determined size - my question is, how can I initialize the elements without invoking the copy constructor and creating a temporary element? With a C array, we can do the following (assuming, for simplicity, that the members of Element were public):
    Code:
    Element* pElements = new Element[NUM_ELEMENTS];
    
    for(int i = 0; i < NUM_ELEMENTS; i++)
    {
      pElements[i].m_time = someTime;
      pElements[i].m_name = someName;
      pElements[i].m_value = someValue;
    }
    This will simply assign the values once to each element in the array, leading to no overhead as compared to the original code. But how can I do the same thing properly with a std::vector?
    Last edited by etaoin; March 17th, 2004 at 05:04 PM.

  6. #6
    Join Date
    Dec 2003
    Posts
    23

    Re: Re: std::vector vs. C-style arrays

    Originally posted by Paul McKenzie
    One trick you can show your friend -- the reserve() function preallocates memory up front, so that vector doesn't have to do allocations each time you call push_back().
    Paul, thanks for your ideas too. However, as said in my reply to Andreas' post, the problem is not the dynamically growing array - I know about reserve() and how to use it. It is actually about how to initialize the vector with elements without creating temporary objects.

  7. #7
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,622
    I'm not positive, but I think that you are implying that using
    "Element* pElements = new Element[NUM_ELEMENTS];"
    doesn't invoke the constructor. I think that it does. See
    the following example :

    Code:
    #include <iostream>
    #include <vector>
    
    
    struct Element
    {
    	Element() { std::cout << "in Element() constructor\n"; }
    	Element(const Element & ele) { std::cout << "in Element() copy-constructor\n"; };
    
    	// other stuff
    };
    
    
    int main()
    {
    	std::cout << "create using C-style ...\n\n";
    	Element * arrElement = new Element[5];
    
    	std::cout << "\n\n\ncreate using vector ... \n\n";
    	std::vector<Element> vElement(5);
    
    	//
    
    	delete [] arrElement;
    
    	return 0;
    }
    You might try timing. Here is sample code (timer from YvesM)
    Comment out the appropriate section in main ...

    Code:
    #include <windows.h>
    
    #include <iostream>
    #include <vector>
    
    
    class CPrecisionTimer  
    {
    public:
        double Stop() {
            LARGE_INTEGER curval;
            QueryPerformanceCounter(&curval);
            long double f = ToDouble(m_start);
            long double f1 = ToDouble(curval);
            long double f3 = ToDouble(m_freq);
            return (f1 - f)/f3;
        }
        long double ToDouble(LARGE_INTEGER& val) {
            long double f = val.u.HighPart;
            f = f * (DWORD)-1;//4294967296;
            f += val.u.LowPart;
            return f;
        }
        void Start() {
            QueryPerformanceCounter(&m_start);
        }
        CPrecisionTimer() {
            QueryPerformanceFrequency(&m_freq);
        }
    private:
        LARGE_INTEGER m_start;
        LARGE_INTEGER m_freq;
    };
    
    
    using namespace std;
    
    
    struct Element
    {
        Element() { i = 3; }
        Element(const Element & ele) { i =3; };
    
        // other stuff
    
        int i;
    };
    
    const int N = 10000000;
    
    int main()
    {
        CPrecisionTimer ct;
    
        /*
    
        std::cout << "create using C-style ...\n\n";
    
        ct.Start();
        Element * arrElement = new Element[N];
    
        cout << "time creating C-style array = " << ct.Stop() << endl;
    
        // just to make sure the compiler doesn't optimize everything away
        for (int i=0; i<N; ++i)
            if ( arrElement[i].i == 222) cout << "something\n";
    
        delete [] arrElement;
    
        */
    
        /*
    
        std::cout << "create using vector ...\n\n";
    
        ct.Start();
    
        std::vector<Element> vElement(N);
    
        cout << "time creating vector = " << ct.Stop() << endl;
    
        // just to make sure the compiler doesn't optimize everything away
        for (int i=0; i<N; ++i)
            if ( vElement[i].i == 222) cout << "something\n";
    
       */
    
        return 0;
    }

  8. #8
    Join Date
    Oct 2002
    Location
    Singapore
    Posts
    3,128
    From my understanding, as long as you vector::reserve(), the overhead for vector::push_back() is the same, if not faster, as assigning value into an array. When we vector::push_back(), the copy ctor is invoked whereas the operator=() is used for assignment in array. Since you are storing value in vector and array, I don't see any reason why temporary object is created. You can prove this by providing your copy ctor and operator=() in your class and track the number of time copy ctor and operator=() is invoked for each case. In fact, for the same no. of vector::push() and assignment in array, the no. of call to copy ctor is exactly the same as that of operator().

  9. #9
    Join Date
    Apr 1999
    Posts
    27,434
    Originally posted by etaoin
    With a C array, we can do the following (assuming, for simplicity, that the members of Element were public):
    Code:
    Element* pElements = new Element[NUM_ELEMENTS];
    This calls the default constructor for NUM_ELEMENTS Element objects.
    Code:
    for(int i = 0; i < NUM_ELEMENTS; i++)
    {
      pElements[i].m_time = someTime;
      pElements[i].m_name = someName;
      pElements[i].m_value = someValue;
    }
    This will simply assign the values once to each element in the array, leading to no overhead as compared to the original code.
    The operator = is invoked for the m_time, m_name, and m_value types. It still is making a copy (actually an assignment, but in effect, a copy). As Philip shows, the operator = is not so benign in C++ as it is in C.

    Here is an updated version that shows that assignment is done:
    Code:
    #include <iostream>
    #include <vector>
    
    
    struct Element
    {
    	Element() { std::cout << "in Element() constructor\n"; }
    	Element(const Element & ele) { std::cout << "in Element() copy-constructor\n"; };
        Element& operator =(const Element& ele) 
        {
             std::cout << "in assignment" << std::endl; 
             return *this;
        }
    
    	// other stuff
    };
    
    
    int main()
    {
    	std::cout << "create using C-style ...\n\n";
    	Element * arrElement = new Element[5];
            
        Element el;
        int i;
        for (  i = 0; i < 5; ++i )
           arrElement[i] = el;
    
    	std::cout << "\n\n\ncreate using vector ... \n\n";
    	std::vector<Element> vElement(5);
        for (  i = 0; i < 5; ++i )
           vElement[i] = el;
    
    	//
    
    	delete [] arrElement;
    	return 0;
    }
    
    Output:
    
    create using C-style ...
    
    in Element() constructor
    in Element() constructor
    in Element() constructor
    in Element() constructor
    in Element() constructor
    in Element() constructor
    in assignment
    in assignment
    in assignment
    in assignment
    in assignment
    
    
    
    create using vector ...
    
    in Element() constructor
    in Element() copy-constructor
    in Element() copy-constructor
    in Element() copy-constructor
    in Element() copy-constructor
    in Element() copy-constructor
    in assignment
    in assignment
    in assignment
    in assignment
    in assignment
    No difference in the number of times constructors and assignements are called.

    Regards,

    Paul McKenzie

  10. #10
    Join Date
    Apr 1999
    Posts
    27,434
    With Yves code, if you create the vector first, it beats the array.

    It would be better if there were two seperate programs, one for vector and one for the array, so that the heap state doesn't factor into the timings.

    Regards,

    Paul McKenzie

  11. #11
    Join Date
    Aug 2000
    Location
    New Jersey
    Posts
    968
    Originally posted by etaoin
    Andreas, thanks for your reply.

    Even when we reduce the problem to a vector of a pre-determined size - my question is, how can I initialize the elements without invoking the copy constructor and creating a temporary element? With a C array, we can do the following (assuming, for simplicity, that the members of Element were public):
    Code:
    Element* pElements = new Element[NUM_ELEMENTS];
    
    for(int i = 0; i < NUM_ELEMENTS; i++)
    {
      pElements[i].m_time = someTime;
      pElements[i].m_name = someName;
      pElements[i].m_value = someValue;
    }
    This will simply assign the values once to each element in the array, leading to no overhead as compared to the original code. But how can I do the same thing properly with a std::vector?
    You can do this with a vector object just as easy.
    Example:
    PHP Code:
    std::vector<ElementpElements(NUM_ELEMENTS);

    for(
    int i 0NUM_ELEMENTS; ++i)
    {
      
    pElements[i].m_time someTime;
      
    pElements[i].m_name someName;
      
    pElements[i].m_value someValue;

    Using the same method, you can access the same members and assign the values just as you would with the C-Style array.

    Moreover, if you use iterators, you can make this same code faster using the vector method.

    PHP Code:
    std::vector<ElementpElements(NUM_ELEMENTS);

    for(
    std::vector<Element>::iterator i pElements.begin(), pElements.end();
            
    i!=e;++i)
    {
      
    i->m_time someTime;
      
    i->m_name someName;
      
    i->m_value someValue;

    The above version using iterators will have a much better performance then you're original code using a C-Style array.

    Vector iterators have a faster access time then does index access to a C-Style array.

    Almost anything you can do with a C-Style array, you can do with a vector.
    David Maisonave
    Author of Policy Based Synchronized Smart Pointer
    http://axter.com/smartptr


    Top ten member of C++ Expert Exchange.
    C++ Topic Area

  12. #12
    Join Date
    Apr 1999
    Posts
    27,434
    Since etaoin hasn't responded, I think we've given enough information (more likely, ammunition to tell the co-workers that they don't know what they're talking about).

    Regards,

    Paul McKenzie

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Windows Mobile Development Center


Click Here to Expand Forum to Full Width

This is a CodeGuru survey question.


Featured


HTML5 Development Center