CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Page 1 of 2 12 LastLast
Results 1 to 15 of 19
  1. #1
    Join Date
    Nov 2006
    Posts
    4

    A faster std::set?

    Hello,

    I need to considerably speed up the performance of std::set, maybe by reimplementing it as an array. Has anyone thought about this, or better yet, actually done it? In my application the memory allocations are too slow, so I thought that using an array would speed things up.

    Any advice in the area would be greatly appreciated!

    Thanks,
    m1cha3l

  2. #2
    Join Date
    Feb 2002
    Posts
    4,640

    Re: A faster std::set?

    Run your code through a profiler. That way, if the set is not the reason for the slowness, you won't waste time trying to re-implement it.

    FYI, std::set should be pretty fast.

    Viggy

  3. #3
    Join Date
    Apr 1999
    Posts
    27,449

    Re: A faster std::set?

    Quote Originally Posted by m1cha3l
    Hello,

    I need to considerably speed up the performance of std::set,
    The std::set is probably as fast as it will be.

    1) What values are you storing in the set?

    2) Did you determine that the problen is std::set? Did you profile the code to know exactly what is slow? Maybe the problem is how you use it, the type of data you're storing in it, etc.

    Please post sample code of what you're doing, and what you claim is slow, so we can also comment on the speed (or non-speed) of the set class. We can't just take your word for it, especially since the writers of your compiler's implementation of std::set are not incompetent.

    Unless the writers of the compiler's standard library are not very good, the chance of you writing a faster std::set than what is implemented for your compiler are remote at best.

    Regards,

    Paul McKenzie

  4. #4
    Join Date
    Nov 2002
    Location
    Foggy California
    Posts
    1,245

    Re: A faster std::set?

    As Paul said, std::set is about as fast as it can be. Profile first as Mr. Viggy said. Determine what the real bottleneck is. Is it:

    * Memory allocations
    * Element insertion and removal
    * Element lookup

    If memory allocations are really the bottleneck, look into providing a custom allocator (which std::set allows for by the way).

    If Element lookup is the bottleneck, you may look into SGI's hash_map class or std::tr1::unordered_set (if your compiler supports TR1). Boost has something too I believe. Mind you though that your elements will no longer be sorted within the container.

    If elelement insertion and removal is the bottleneck, then I would re-evalute my problem and ask around for advice (while mentioning the specifics of my profiling and application requirements). I'm not sure that std::set or any of the hashed containers are ideal when many insertions and removals will be performed.

    - Kevin
    Kevin Hall

  5. #5
    Join Date
    Feb 2005
    Location
    Normandy in France
    Posts
    4,590

    Re: A faster std::set?

    If you don't use all the features (such as element ordering) of std::set, you can use std::tr1::unordered_set or std::hash_set if your STL support it (the latter one is available in STLPort and a few other STL implementations).
    IIRC, I got a performance factor gain > 10 when I moved from std::map<std::string, ...> to STLPort std::hash_map<MyStringThatDoesntCopyTheData, ...>.
    "inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
    Club of lovers of the C++ typecasts cute syntax: Only recorded member.

    Out of memory happens! Handle it properly!
    Say no to g_new()!

  6. #6
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588

    Re: A faster std::set?

    Quote Originally Posted by m1cha3l
    In my application the memory allocations are too slow, so I thought that using an array would speed things up.
    In addition to what has been said already, if you are using Visual Studio 2003 or newer, do not run your code from inside the IDE. The reason is that VS still debugs memory allocations even in Release mode when you run through the IDE and this makes the code appear slower.

    So:
    - Run your code from outside the IDE and make sure the problem persists
    - Run your code through a profiler
    - Then, depending on the results use Kevin Hall's advice about custom allocators and alternatives to std::set.
    Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
    Supports C++ and VB out of the box, but can be configured for other languages.

  7. #7
    Join Date
    Nov 2006
    Posts
    4

    Re: A faster std::set?

    1) What values are you storing in the set?

    I am storing range sets, and the set needs to be ordered. The code runs on Suse Linux 2.6.13-15.8.

    Here is what my class looks like (minus iterators, etc):

    #include <set>
    #include <boost/pool/pool_alloc.hpp>

    class range_set {
    public:
    struct value_type {
    typedef UInt64 scalar_type;
    scalar_type start, end;
    value_type() : start(0), end(0) {}
    value_type(scalar_type s, scalar_type e) : start(s), end(e) {}
    bool operator== (const value_type& other) {
    return start == other.start && end == other.end;
    }
    bool operator< (const value_type& other) const {
    return (start < other.start ? true : (start > other.start) ? false : end < other.end);
    }
    };

    private:
    typedef boost::fast_pool_allocator<value_type> PoolType;
    typedef std::set<value_type, std::less<value_type>, PoolType> range_set_type;

    range_set_type r_set;
    };

    This is an assignment I have been given ("make it faster!"). The requirement is to store large numbers of these range sets in memory. I need to do some profiling, too, but my manager seems to think std::set can be rewritten to make it more efficient.

    Thanks again for your advice.

  8. #8
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588

    Re: A faster std::set?

    So you actually have a std::set<range_set> ? Are the ranges overlapping or not? If they are not overlapping, you can get better speed (and some more functionality) by using std::map<scalar_type,scalar_type> like in my article.

    Now from there on, it depends on how these range_sets are used. A common case is that they are constructed once at the start of the program and then only used for lookup. In this case, you can maybe completely replace them with static lookup tables (for example ICU uses this a lot). And this will be a lot faster (i.e. no time spent on construction, probably faster retrieval too).

    If they are very dynamic (i.e during execution ranges and range_sets are added/deleted) then this is obviously not a solution.

    This is an assignment I have been given ("make it faster!"). The requirement is to store large numbers of these range sets in memory. I need to do some profiling, too, but my manager seems to think std::set can be rewritten to make it more efficient.
    Most STL implementations use red-black trees to store sets. This is pretty efficient already for the general case. If your data distribution is a bit strange, changing to a different tree model may help, but it's not very probable that it changes a lot. Profiling is really a must here, since it's quite unlikely that you'd be able to improve on the speed of std::set in the general case.
    Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
    Supports C++ and VB out of the box, but can be configured for other languages.

  9. #9
    Join Date
    Nov 2002
    Location
    Foggy California
    Posts
    1,245

    Re: A faster std::set?

    Quote Originally Posted by m1cha3l
    This is an assignment I have been given ("make it faster!") ... but my manager seems to think std::set can be rewritten to make it more efficient.
    I'm sorry.
    Kevin Hall

  10. #10
    Join Date
    Nov 2006
    Posts
    4

    Re: A faster std::set?

    I did look at your range_set class, it's really very cool, but my ranges can overlap in several cases: the starting points can be the same, the ending points can be the same, or a range can be contained in another, where that start of one range can be less than another, and the end of the range either less than or greater than the other. I have a function that normalized the ranges so they don't overlap, but it is only called when writing the range set out to disk. While in memory the above conditions can be present.

    It looks like if I replace my range_set with yours, the overlaps will not occur as your code will detect them and fix the overlaps at insertion time. (Right?)

    The other thing is that I need not only range_set_iterator but also const_iterator, reverse_iterator, and const_reverse_iterator. I tried defining them for your class but soon got into trouble. Can you give me an example of how to define the additional iterators?

    Thanks again.
    Last edited by m1cha3l; November 14th, 2006 at 04:16 PM.

  11. #11
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588

    Re: A faster std::set?

    Quote Originally Posted by m1cha3l
    It looks like if I replace my range_set with yours, the overlaps will not occur as your code will detect them and fix the overlaps at insertion time. (Right?)
    Yes, that's exactly what happens. The ranges are automatically normalised when calling insert or insert_range (or erase).

    The other thing is that I need not only range_set_iterator but also const_iterator, reverse_iterator, and const_reverse_iterator. I tried defining them for your class but soon got into trouble. Can you give me an example of how to define the additional iterators?
    Yeah I should have written those too. It's not really complicated but does take some time. For the const_iterator, you basically just have to make the * and -> operators return const objects and then ensure that you have a begin() and a begin() const inside the range_set which return the correct iterators (same with end() as well). The reverse_iterators are probably pretty much the same as the forward iterators, just that they use rbegin and rend internally and -- instead of ++ when iterating inside a range.
    Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
    Supports C++ and VB out of the box, but can be configured for other languages.

  12. #12
    Join Date
    Nov 2006
    Posts
    4

    Re: A faster std::set?

    Would it be best to define new template iterator classes like this, with maybe a base class to avoid duplication, or can all the iterators be defined in one class:

    template <class Key, class Compare = std::less<Key>, class Alloc = std::allocator<pair<Key, Key> > >
    class range_set_iterator
    {
    };

    template <class Key, class Compare = std::less<Key>, class Alloc = std::allocator<pair<Key, Key> > >
    class const_range_set_iterator
    {
    ...
    typedef typename REP_TYPE::const_iterator CONST_REP_TYPE_IT;
    ...
    };

    template <class Key, class Compare = std::less<Key>, class Alloc = std::allocator<pair<Key, Key> > >
    class reverse_range_set_iterator
    {
    };

    template <class Key, class Compare = std::less<Key>, class Alloc = std::allocator<pair<Key, Key> > >
    class const_reverse_range_set_iterator
    {
    };

    Then, in class range_set:

    public:
    typedef range_set_iterator<Key, Compare, Alloc> iterator;
    typedef const_range_set_iterator<Key, Compare, Alloc> const_iterator;
    typedef reverse_range_set_iterator<Key, Compare, Alloc> reverse_iterator;
    typedef const_reverse_range_set_iterator<Key, Compare, Alloc> const_reverse_iterator;

    Good practice in iterator definition!

  13. #13
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588

    Re: A faster std::set?

    I would definitely start with separate classes, this makes it easier to find bugs and correct them.
    Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
    Supports C++ and VB out of the box, but can be configured for other languages.

  14. #14
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588

    Re: A faster std::set?

    Actually I just realised that iterator and const_iterator could be the same class, really. This is because you cannot modify values in the range_set with its iterators anyways. This means that operator *() should be const in the original iterator class.
    Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
    Supports C++ and VB out of the box, but can be configured for other languages.

  15. #15
    Join Date
    Jan 2012
    Posts
    1

    Re: A faster std::set?

    One of the slowest parts of std::set is... allocating and deallocating of memory.
    When you use standard allocator it calls new and delete everytime you add/remove element, and that makes it really slow.

    I've written pool allocator, similar to boost::fast_pool_allocator but it frees memory when set/map object is cleared or destroyed.

    Here is the code:

    Code:
    #include <vector>
    
    #define PTR(x) *((T**)(&(x)))
    
    template <typename T>
    class bestAlloc
    {
    public:
        typedef T value_type;
    
        typedef value_type * pointer;
        typedef const value_type * const_pointer;
        typedef value_type & reference;
        typedef const value_type & const_reference;
    	typedef std::size_t size_type;
    	typedef std::ptrdiff_t difference_type;
    
        template <typename U>
        struct rebind
        {
    		typedef bestAlloc<U> other;
        };
    
    	class Block
    	{
    	public:
    		T* ptr;
    		size_type size;
    
    		Block(){};
    		Block(const Block &b)
    		{
    			ptr = b.ptr;
    			size = b.size;
    		}
    
    		Block& operator=(const Block &b)
    		{
    			ptr = b.ptr;
    			size = b.size;
    
    			return *this;
    		}
    
    		void Initialize()
    		{
    			for (size_type i=0; i<size-1; i++)
    			{
    				PTR(ptr[i]) = &ptr[i+1];
    			}
    
    			PTR(ptr[size-1]) = NULL;
    		};
    	};
    
    	std::vector<Block> m_vecBlocks;
    
    	T* firstFree;
    
    	size_type dSize;
    
    	size_type numAllocated;
    
    	void AddBlock()
    	{
    		Block b;
    		b.size = dSize;
    
    		if (dSize < 128*1024)dSize *= 4;
    
    		ASSERT(sizeof(T) >= sizeof(T*));
    
    		b.ptr = (T*)(new char[b.size * sizeof(T)]);
    
    		if (b.ptr == NULL)
    			throw std::bad_alloc();
    
    		b.Initialize(); // initialize list of pointers
    
    		PTR(b.ptr[b.size-1]) = firstFree;
    		firstFree = b.ptr;
    
    		m_vecBlocks.push_back(b);
    	}
    
    	T* malloc_(size_type n)
    	{
    		ASSERT(n == 1); // only single element malloc supported
    
    		if (firstFree == NULL)
    		{
    			AddBlock();
    		}
    
    		T* ret = firstFree;
    		firstFree = PTR(*firstFree);
    		
    		numAllocated++;
    
    		return ret;
    	}
    
    	void free_(void * const ptr, const size_type n)
    	{
    		ASSERT(n == 1);
    
    		if (ptr == NULL)return;
    
    		T* p = (T*)ptr;
    
    		PTR(*p) = firstFree;
    		firstFree = p;
    
    		numAllocated--;
    
    		if (numAllocated == 0)releaseMemory();
    	}
    
    	void releaseMemory()
    	{
    		for (size_type i=0; i<m_vecBlocks.size(); i++)
    		{
    			delete [] ((char*)m_vecBlocks[i].ptr);
    		}
    
    		m_vecBlocks.clear();
    
    		firstFree = NULL;
    		numAllocated = 0;
    		dSize = 8;
    	}
    
      public:
        bestAlloc()
        {
    		dSize = 8;
    		numAllocated = 0;
    		firstFree = NULL;
    		m_vecBlocks.clear();
        }
        
    	bestAlloc(const bestAlloc<T> &a)
    	{
    		dSize = 8;
    		numAllocated = 0;
    		firstFree = NULL;
    		m_vecBlocks.clear();
    	}
    private:
    	bestAlloc& operator=(const bestAlloc<T> &a)
    	{
    		dSize = 8;
    		numAllocated = 0;
    		firstFree = NULL;
    		m_vecBlocks.clear();
    
    		return *this;
    	}
    public:
        // not explicit, mimicking std::allocator [20.4.1]
        template <typename U>
        bestAlloc(const bestAlloc<U> &a)
        {
    		dSize = 8;
    		numAllocated = 0;
    		firstFree = NULL;
    		m_vecBlocks.clear();
        }
    
    	~bestAlloc()
    	{
    		releaseMemory();
    	}
    
        static pointer address(reference r)
        { return &r; }
        static const_pointer address(const_reference s)
        { return &s; }
        static size_type max_size()
        { return (std::numeric_limits<size_type>::max)(); }
        void construct(const pointer ptr, const value_type & t)
        { new (ptr) T(t); }
        void destroy(const pointer ptr)
        {
          ptr->~T();
          (void) ptr; // avoid unused variable warning
        }
    
    	// always different
        bool operator==(const bestAlloc &) const
        { return false; }
        bool operator!=(const bestAlloc &) const
        { return true; }
    
        pointer allocate(const size_type n)
        {
    		const pointer ret = malloc_(n);
    
    		if (ret == 0)
    			throw std::bad_alloc();
    
    		return ret;
        }
    
        pointer allocate(const size_type n, const void * const)
        { return allocate(n); }
    
        pointer allocate()
        {
    		const pointer ret = malloc_(1);
    
    		if (ret == 0)
    			throw std::bad_alloc();
    
    		return ret;
        }
    
        void deallocate(const pointer ptr, const size_type n)
        {
    		free_(ptr, n);
        }
    
        void deallocate(const pointer ptr)
        {
    		free_(ptr, 1);
        }
    };
    
    #undef PTR
    Using is very simple:

    std::set<int, std::less<int>, bestAlloc<int> > s;

    I hope you will find it useful

Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured