CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 6 of 6

Thread: Sorting issue

  1. #1
    Join Date
    May 2011
    Posts
    3

    Sorting issue

    Greetings to all. I am trying to make a quicksort that is faster than the qsort function from the library . So far, I've come to discover that if I choose the pivot at the middle of the sorting vector, my qsort beats the library qsort, but this depends on the size of the vector. At 100k elements, it's ok, but at 1kk elements, from 100 test cases, the library beast my version 100-0.
    So the problem is at larger vectors, and in choosing the proper pivot. Can somebody please give me some hints? I would very much appreciate it.
    I can provide the code I have so far if it's needed.

  2. #2
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: Sorting issue

    I have answered this question elsewhere.
    C + C++ Compiler: MinGW port of GCC
    Build + Version Control System: SCons + Bazaar

    Look up a C/C++ Reference and learn How To Ask Questions The Smart Way
    Kindly rate my posts if you found them useful

  3. #3
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Sorting issue

    Quote Originally Posted by Ssalty View Post
    Greetings to all. I am trying to make a quicksort that is faster than the qsort function from the library.
    To add to laserlights answer to you at the other link:

    Color me skeptical, but if you're that confident you could code something better than a library implementor (probably PJ Plauger at DinkumWare), then I would think you wouldn't be asking questions here.

    A lot of work goes into making the standard library routines as efficient as possible. These aren't high-school coders writing these libraries -- some of these authors are the cream of the crop C and C++ programmers. Unless you know all the ins and outs of sorting, algorithms, etc. then it is doubtful what you have will beat the library implementation for most use cases.

    Regards,

    Paul McKenzie

  4. #4
    Join Date
    Nov 2002
    Location
    California
    Posts
    4,556

    Re: Sorting issue

    The performance of any quicksort implementation is highly dependent on the content of the data being sorted. You have discovered only one tip of this iceberg, as you have found that the selection of the pivot point from the middle makes a difference.

    Try it with different data, and you will probably find that your implementation is exceedingly slow for some data. Some of your test cases should include:
    1. Completely random data. This is probably the test case you are using now.
    2. Already-sorted data, both ascending and descending.
    3. Almost sorted data: as above except that a few items of data are aout-of-order (think about a user adding an arbitrary item to the end of a sorted list, and then requesting a re-sort)
    4. Highly redundant data. Think about a M/F sort based on gender. Or a "by state" sort of addresses. There will only be a few different categories (there are only 50 states), and most quicksort implementations perform quite poorly on such data.

    The performance of almost any quicksort algorithm can be improved simply by appending some random data to the sort item before the sort and removing it afterwards. For example, for a "by state" sort, append a random string of 5 character to the end of the 2-character state code, sort, and then delete the appended 5 characters. You would be amazed.

    Mike
    Last edited by MikeAThon; May 31st, 2011 at 02:23 PM. Reason: Changed "pre-pend" to "append". "Pre-pend" is wrong

  5. #5
    Join Date
    May 2011
    Posts
    3

    Re: Sorting issue

    Thanks for the response.
    I have tried to do a median of three version, but didn't got quite what I was looking for.
    What I have in mind is that for the size of the sorting vector, the pivot should be choose in a specific way, for example 100k elements a pivot that is in the middle is ok, but for the rest at this moment I am still investigating.
    As for the different data, I am generating them with random, and I test the same values for myqsort and the library version.

  6. #6
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: Sorting issue

    It is possible that std::sort() may be faster than qsort() due to better inlining. In general, however, it is extremely hard to do better than either of these.

    Radix sort may be faster on very large data sets.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured