dcsimg
CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 9 of 9

Thread: Benchmark my string class?

  1. #1
    Join Date
    Jan 2009
    Posts
    1,689

    Red face Benchmark my string class?

    A while ago someone mentioned that sometimes it's good to have a C point of view in a C++ forum and it got me wondering what would happen if I created a string class using some C string optimizations use that sSTL string don't use. So I wrote a minimal one to throw into one of my parsing engines and was surprised by the amount of speed I got out of it with very little optimization. It's not for every day string use, more for parsing and things that use lots of substrs, because this class does substr in O(1) instead of O(n). It's also not thread safe. I was just wondering if I couldn't get some of you to do a quick benchmark for me. Perhaps if the performance is good across a variety of platforms I'll implement the rest of the interface and put it on sourceforge.

    Just do
    Code:
    g++ main.cpp -O3 -DNDEBUG
    on the directory, then an ./a.out. Then just post the bottom half of the output, this part:

    Code:
    Benchmark           STD String  String      String Wins
    = operator char*    30000       1           Yes (3000000% faster)
    = operator string   120000      1           Yes (12000000% faster)
    Copy ctor           10000       1           Yes (1000000% faster)
    Copy ctor + append  200000      210000      No  (104% slower)
    Copy ctor with num  110000      1           Yes (11000000% faster)
    Ctor const char*    140000      170000      No  (121% slower)
    Ctor const char* n  90000       110000      No  (122% slower)
    Ctor empty          1           1           Equal
    Ctor n, c           90000       110000      No  (122% slower)
    Substr              210000      1           Yes (21000000% faster)
    Substr via ctor     220000      190000      Yes (115% faster)
    append char         820000      480000      Yes (170% faster)
    append char*        1680000     560000      Yes (300% faster)
    append string       1590000     1080000     Yes (147% faster)
    swap                1           1           Equal
    Obviously those huge percentages aren't accurate, that's a fault of the precision of the internal timer. My compiler seems to have a precision of millseconds for clock(), the 1s are just too fast to get a reading. I get more precise results on my mac.
    Attached Files Attached Files
    Last edited by ninja9578; February 15th, 2011 at 07:46 AM.

  2. #2
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: Benchmark my string class?

    It isn't surprising that a non-copying substr() operation would be much faster. The trouble is how you make that design robust. Copy-on-write has been explored; it's got thread safety concerns which are difficult to solve. Another approach would be to make a special type which represents a partial string, that can be converted to a normal string via a copy operation. This is the sort of thing I'd expect to work best.
    Last edited by Lindley; February 14th, 2011 at 04:21 PM.

  3. #3
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: Benchmark my string class?

    That would be me.

    So... Is there a main.cpp attached somewhere?
    Is your question related to IO?
    Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
    It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.

  4. #4
    Join Date
    Jan 2009
    Posts
    1,689

    Re: Benchmark my string class?

    Opps, silly me. I always forget to attach :P

  5. #5
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: Benchmark my string class?

    My machine is running a big batch, so you'll have to wait for actual numbers.

    My 2 cents:
    -Include <ctime> or <time.h>. My compiler had no idea what clock_t was.
    -Remove 100% from your percentages, as your reported "150%" faster is actually only 50% faster. Same for slower:

    Code:
    		  cout << "Yes (" << (int)res - 100 << "% faster)";
    		  cout << "No  (" << (int)res - 100 << "% slower)";
    That aside, you code looks very good, and so does the performance. I'm not sure I see the "C developer way" of doing this, but I sure see robust low level development.

    I've checked out my GCC's implementation of string more than once. There are a lot of mentions of reference counts, but try as I might, I don't see where they use it (none of their implementations are o(1)). It's as if they decided to keep the worst of both worlds :/
    I'm not sure what to make of it.

    I'd be interested to see what happens in a non-synthetic test case. It is actually very rare to create a copy (or substring), without wanting to modify it afterwards. The only case I can think of is for return by value, and R-value references surpass reference counting on that one.

    BTW, std::string does support shallow substr through iterators. I personally find using iterators more natural than the more C-like offset+size functions.

    Speaking of which: I'd be tempted to ask you to benchmark your iterators. Iterators are the foundation of C++'s STL, and not testing them is like skipping half the tests. How fast does sorting the elements in your string run? That said, your iterators aren't correctly implemented as they assume the underlying string is not shared:

    Code:
    string a="hello";
    string b=a; //b is a shallow copy of a
    string::iterator it = b.begin(); //b is made into a deep copy, so in theory, we can use it...
    string c = b; //Wait, b is now shared
    *it = 'y'; //Oh-oh...
    Is your question related to IO?
    Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
    It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.

  6. #6
    Join Date
    Nov 2003
    Location
    Belgium
    Posts
    8,153

    Re: Benchmark my string class?

    Running with VC++2010:
    Code:
    Benchmark           STD String  String      String Wins
    = operator char*    17          13          Yes (130% faster)
    = operator string   33          5           Yes (660% faster)
    Copy ctor           172         5           Yes (3440% faster)
    Copy ctor + append  388         331         Yes (117% faster)
    Copy ctor with num  153         5           Yes (3060% faster)
    Ctor const char*    177         267         No  (150% slower)
    Ctor const char* n  137         232         No  (169% slower)
    Ctor empty          1           3           No  (300% slower)
    Ctor n, c           135         227         No  (168% slower)
    Substr              55          13          Yes (423% faster)
    Substr via ctor     1018        276         Yes (368% faster)
    append char         589         1292        No  (219% slower)
    append char*        1122        1458        No  (129% slower)
    append string       1989        1669        Yes (119% faster)
    swap                3           2           Yes (150% faster)
    Marc Gregoire - NuonSoft (http://www.nuonsoft.com)
    My Blog
    Wallpaper Cycler 3.5.0.97

    Author of Professional C++, 4th Edition by Wiley/Wrox (includes C++17 features)
    ISBN: 978-1-119-42130-6
    [ http://www.facebook.com/professionalcpp ]

  7. #7
    Join Date
    Jan 2009
    Posts
    1,689

    Re: Benchmark my string class?

    Quote Originally Posted by monarch_dodra View Post
    Code:
    string a="hello";
    string b=a; //b is a shallow copy of a
    string::iterator it = b.begin(); //b is made into a deep copy, so in theory, we can use it...
    string c = b; //Wait, b is now shared
    *it = 'y'; //Oh-oh...
    Hmm... that will be a tricky one to solve :/ I wonder how gcc does it.

  8. #8
    Join Date
    Jan 2009
    Posts
    1,689

    Re: Benchmark my string class?

    Quote Originally Posted by monarch_dodra View Post
    I've checked out my GCC's implementation of string more than once. There are a lot of mentions of reference counts, but try as I might, I don't see where they use it (none of their implementations are o(1)). It's as if they decided to keep the worst of both worlds :/
    I'm not sure what to make of it.
    Depends on the platform, strings on OSX use reference counting, or something like that. If you make a string, copy it, then print out the values of both c_strs. They are equal, so they are obviously sharing some memory.

    BTW, std::string does support shallow substr through iterators. I personally find using iterators more natural than the more C-like offset+size functions.
    Then shouldn't the Substr via ctor be significantly faster than substr in my benchmark?

  9. #9
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: Benchmark my string class?

    Quote Originally Posted by ninja9578 View Post
    If you make a string, copy it, then print out the values of both c_strs. They are equal, so they are obviously sharing some memory.
    I guess so. Looks like I have been mislead.

    Then shouldn't the Substr via ctor be significantly faster than substr in my benchmark?[/QUOTE]

    No, I just mean that if you want a shallow sub-string, you can just get it by having two iterators pointing at the end of each string. You aren't building an actual shallow substring though.
    Is your question related to IO?
    Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
    It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Windows Mobile Development Center


Click Here to Expand Forum to Full Width




On-Demand Webinars (sponsored)