Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
As far as comparing... Do you see a LISTBOX that is implemented using std::vector that is capable of that?
That is your job -- tell us where to find a listbox that is implemented using a vector so that we can make a comparison. No one is going to go around searching source code for such an animal -- you provide it to us and we'll comment on it.
Quote:
Alright, case closed. I am not one to show source code out of my closed source projects
Do you have a vector class or not? If you do, then why not show us a simple program showing your vector implementation with std::vector? Why do you need to show us a list box program to compare timings between two components?
Regards,
Paul McKenzie
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
First of all... Each thread get's it's OWN STACK, therefore you cannot pass sometime that has anything in that thread's stack to another. Passing by value is different (a 32bit integer passed by value reinterpret_cast to the void pointer, that's different - Stop being a smart arse).
Not being a "smart args".
1) Passing be Value an item on one threads stack to another thread (fo storage on it's stack) is perfrectly reasonable. There is no need for using the heap.
2) Even passing be reference, can make sense. Consider the following psuedo code:
Code:
void Func()
{
SomeObject x[100]; // local on the heap
foreach (x)
StartThread(&x);
WaitForAllThreads();
}
Quote:
Originally Posted by JamesSchumacher
As far as comparing... Do you see a LISTBOX that is implemented using std::vector that is capable of that?
Nor should thre be any for two reasons:
1) a ListBox is meant to be VIEWABLE. The average UI Guidelines indicate that a ListBOx should never have more than 25-100 items (most on the lower end). Even wasing a few hundred bytes per item should not have an impact on modern (non-mobile/non-embedded) systems.
2) Any Linear access mechanism with 10M items is likely to be a poor design. Can you provide a real world sample where this would be the implementation of choice?
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by TheCPUWizard
Not being a "smart args".
1) Passing be Value an item on one threads stack to another thread (fo storage on it's stack) is perfrectly reasonable. There is no need for using the heap.
2) Even passing be reference, can make sense. Consider the following psuedo code:
Code:
void Func()
{
SomeObject x[100]; // local on the heap
foreach (x)
StartThread(&x);
WaitForAllThreads();
}
Nor should thre be any for two reasons:
1) a ListBox is meant to be VIEWABLE. The average UI Guidelines indicate that a ListBOx should never have more than 25-100 items (most on the lower end). Even wasing a few hundred bytes per item should not have an impact on modern (non-mobile/non-embedded) systems.
2) Any Linear access mechanism with 10M items is likely to be a poor design. Can you provide a real world sample where this would be the implementation of choice?
In response to #2, Linear access is actually the best design (this applies to #1 as well), simply for the fact that based on scroll position, one can determine the first item to be drawn simply by the division of the current scroll position (max defined as item count * item height - window height or zero in case of item count * item height < window height) by the item height.
Item height = 15
10000 items with window height of 200.
150000 - 200 is max scroll position.
149800 / 15 = 9986
I wonder how I determine where to begin? This is an example if scroll position is the maximum. Drawing is independent of how many items are in the listbox. If it was the 64-bit world, and it was using 64-bit scroll data, as far as the window is concerned, it wouldn't matter if it had 1,000,000,000 items in it. (The actual draw code that is)
However, for selected items, a linear buffer is NOT the best choice. In fact, I am optimizing the code to use a CListItemNode which takes a pointer to a CListItem in the constructor, in this manner, I do not need to know how many selected items there are in the case of multi-select.
And actually, a few hundred bytes per item does make a difference. 20 bytes (which I just dropped off, removing color data storage from the list items themselves) at 10,000,000 items saved 200,000,000 bytes. What is 200,000,000 / 1,048,576?
Anyways... If you want to test out my control in a test app (no source or development access, just the same test program I used. Check it out)
That can be found here in the self extractor along with my core *.DLL it uses.
And note that when you click the button, it adds 250,000 random 5 character strings into the list box. (That takes a bit due to generating the 250,000 strings)
So what point does this have? It goes back to the default allocator for std::vector as well. :cool:
If you use a specialized vector, that's not std::vector is it? That's your own rolled specialization. So why use vector at all?
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
In response to #2, Linear access is actually the best design (this applies to #1 as well), simply for the fact that based on scroll position, one can determine the first item to be drawn simply by the division of the current scroll position (max defined as item count * item height - window height or zero in case of item count * item height < window height) by the item height.
That wasn't the point. The point is that:
- No one wants to deal with a list that has substantially more than 25-100 items.
- If they do, it had BETTER have search capabilities
And here we are still talking about list boxes. What about the container it uses, that you claim is comparable but superior to vector (i.e. a simple, contiguous, random access container)? Or have you built memory management right into your list box?
Quote:
Originally Posted by JamesSchumacher
So what point does this have? It goes back to the default allocator for std::vector as well. :cool:
If you use a specialized vector, that's not std::vector is it? That's your own rolled specialization. So why use vector at all?
The default allocator in STL is essentially new([])/delete([]), and vectors are almost always mentioned in the context of replacing new[]/delete[]. What could possibly be your point, then?
Secondly, there is a huge difference between implementing a custom allocator and re-implementing vector. Not to mention that a custom allocator, if made properly, could be used by any type of container.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by Hermit
That wasn't the point. The point is that:
- No one wants to deal with a list that has substantially more than 25-100 items.
- If they do, it had BETTER have search capabilities
And here we are still talking about list boxes. What about the container it uses, that you claim is comparable but superior to vector (i.e. a simple, contiguous, random access container)? Or have you built memory management right into your list box?
The default allocator in STL is essentially new([])/delete([]), and vectors are almost always mentioned in the context of replacing new[]/delete[]. What could possibly be your point, then?
Secondly, there is a huge difference between implementing a custom allocator and re-implementing vector. Not to mention that a custom allocator, if made properly, could be used by any type of container.
Search capabilities can be added to my CListBox with ease. You can access the container from the listbox. How do you think I am adding 250,000 strings when the button is clicked? Which means the search could be implemented in an interface member of the CListBox, or as an outside function working on the collection itself. This is not something that would be hard to accomplish. In fact, it's ironic that I was saying this myself just last night.
With a search function in a list box (you are right, I said that last night myself) it is even viable to search a list of over 10,000,000 names for example... And it would be single selection mode then, in which the data of that person would be displayed in other controls.
And it's not like you (whoever client is) are limited to a CListItem... It stores pointers for a reason... And the container itself has SetCapacity() AND SetBlockSize() members for dealing with reallocations. I have thought this out very well.
The CListBox::AddString just creates a CListItem to add to it, and when it does calls CListBox::AddItem.
And now maybe you will realize you are talking to an IT professional. (Even though I have had no paying experience.)
Or maybe you just want to hear more about how important these things are and that you want to hear more technical details. Let's just put it this way, as far as technical issues goes...
This is my REALITY TEST, to wake up Microsoft and others out there, that .NET is a mistake... And bloated code just stacked on top of other bloated code (a lot of open source projects, and Apple's stuff as well - not open source) is not a viable solution either.
This is what I meant by 'real world' Paul. Not knowing the innards of what makes STL successful in what it is good at (speed in general, because of iterators), handicaps you (people in general) Paul.
Now understand why I threw my 2 cents in.
There, the ISSUE has been closed.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
Search capabilities can be added to my CListBox with ease. You can access the container from the listbox. How do you think I am adding 250,000 strings when the button is clicked? Which means the search could be implemented in an interface member of the CListBox, or as an outside function working on the collection itself. This is not something that would be hard to accomplish. In fact, it's ironic that I was saying this myself just last night.
That would be a linear (read: slow) search.
Quote:
Originally Posted by JamesSchumacher
And the container itself has a SetCapacity() AND SetBlockSize() members for dealing with reallocations. I have thought this out very well.
The fact that you refer to a method called SetBlockSize makes me suspect you are not talking about a strictly contigous container. Why, then, are you comparing to std::vector and not std::deque?
STL was also well thought out, you know.
Quote:
Originally Posted by JamesSchumacher
And now maybe you will realize you are talking to an IT professional.
A particularly proud one, too!
Quote:
Originally Posted by JamesSchumacher
There, the ISSUE has been closed.
Fine, I'm tired of asking questions that don't get answered.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by Hermit
That would be a linear (read: slow) search.
The fact that you refer to a method called SetBlockSize makes me suspect you are not talking about a strictly contigous container. Why, then, are you comparing to std::vector and not std::deque?
STL was also well thought out, you know.
A particularly proud one, too!
Fine, I'm tired of asking questions that don't get answered.
Yes, linear searches are slow. However, when dealing with 10,000,000 + items are you going to be able to handle that in some other type of container? The linear storage model is hard pressed at the time of 10,000,000 items. This is where you make a decision on tradeoff. And linear searches can be optimized to make sure you do not do things like this.
Code:
for (unsigned long i = 0; i < m_dwItemCount; ++i)
{
// Loading index from member variable that is not constant
// loading base pointer that is not constant itself
if (m_pDataArray[ i ].FindSubString("StringToMatch") != -1)
{
// ... Do whatever! Above is doing load from member variables
// plus implied multiplication by index and sizeof. This is SLOW!
}
}
// BETTER alternative. See below.
CListItem ** ppItems = m_pDataArray;
CListItem ** ppEnd = m_pDataArray + m_dwItemCount;
while (ppItems != ppEnd)
{
// Take this concept and apply it to the search function itself
// These concepts STACK
++ppItems;
}
This is the example of the idea of iterator approach. (Taken from what STL uses, iterators - which are in most cases pointers or pointer wrappers that use inlines)
You would be surprised by how much this makes a difference. Let me go get the link to my posted Gradient code and you can read what I commented in there.
This Thread, my last post in it Check out the main.cpp file.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
Yes, linear searches are slow. However, when dealing with 10,000,000 + items are you going to be able to handle that in some other type of container? The linear storage model is hard pressed at the time of 10,000,000 items. This is where you make a decision on tradeoff. And linear searches can be optimized to make sure you do not do things like this.
Code:
for (unsigned long i = 0; i < m_dwItemCount; ++i)
{
// Loading index from member variable that is not constant
// loading base pointer that is not constant itself
if (m_pDataArray[ i ].FindSubString("StringToMatch") != -1)
{
// ... Do whatever! Above is doing load from member variables
// plus implied multiplication by index and sizeof. This is SLOW!
}
}
// BETTER alternative. See below.
CListItem ** ppItems = m_pDataArray;
CListItem ** ppEnd = m_pDataArray + m_dwItemCount;
while (ppItems != ppEnd)
{
// Take this concept and apply it to the search function itself
// These concepts STACK
++ppItems;
}
This is the example of the idea of iterator approach. (Taken from what STL uses, iterators - which are in most cases pointers or pointer wrappers that use inlines)
You would be surprised by how much this makes a difference. Let me go get the link to my posted Gradient code and you can read what I commented in there.
This Thread, my last post in it Check out the main.cpp file.
A Hash table is not viable... There is no guarantee that with that many items that they will not collide. (And the size of the hash table! Imagine that >.< why not just use an array?) To make that work, we would have to store a collection of containers, in which you were looking for values that hash to the same value, and then you search those for an exact match. THAT is not a half bad idea, IF only used in CACHING previous results.
If you have that many items, that any set of them can be displayed at any point (random access), are you going to use a sorted tree? I can see how this would work, if it has references to load data from file and not keep everything in memory. However, why use a tree structure for a LIST?
A hash table COULD be used, based on previous searches. Although, any other data structure could be used as well. However, that is in ADDITION to the array style linear data. In any case, this is a case of CACHING the search matches. (I could write a search engine that would be as fast if not faster than google's. LMAO - First search requires looking for the file directly and see if it contains the data. This is done by having a 'crawl' to have a cache to search. This is where keywords come into play. Any search string hashes to point at a container of possible matches, and then returns them in the priority order.)
The point of the matter is this... If I can make it work in what seems the most extreme way to do it (keep it all in memory), and optimized so it works very well... I can take the same approach techniques and apply it to the other design models and make it even BETTER in that approach than someone else doing it that way. Why? I am aware of what CAN BE OPTIMIZED and how it can be, and why.
A listbox can be filled with any amount of data. Random not cached data even. (Hence the random 5 character strings to make a point)
To be able to have the extreme searching results of search engines (especially google) people do not realize that their machines do not 'SEARCH THE WEB' every time for all the possible matches. No... They check their cache, which is NOT a valid representation of the entire web, let alone of a site at any particular time.
The point here is that, to make a decision based off of all the variables, and decide what priorities must come first.
And actually, if you think about it. Linear searches are actually the fastest in some way, due to the fact you do not have to preprocess the data in order to find a match. (Exception: If an exact match was found earlier, in the case of compression, and caching those data matches (don't care where they are) in a hash table... Repetition finding) However... Linear is still considered fastest due to the fact that you cannot put together that data you are processing quicker in a way other than linear. The output is linear, therefore, any data structure dealing with it outputs the data linear. Therefore, there is no way to access it any faster than in linear form.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
This is what I meant by 'real world' Paul. Not knowing the innards of what makes STL successful in what it is good at (speed in general, because of iterators), handicaps you (people in general) Paul.
Yes, after 10 years of using it, and actually being taught by the person who created STL, I don't know STL. Shame on me.
So when are we going to get a real code to compare? That's all I care to see or be interested in.
Regards,
Paul McKenzie
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by Paul McKenzie
Yes, after 10 years of using it, and actually being taught by the person who created STL, I don't know STL. Shame on me.
So when are we going to get a real code to compare? That's all I care to see or be interested in.
Regards,
Paul McKenzie
Paul,
I would have thought by now you would have figured it out? :eek: ;)
Seriously James just seems to like to debate his point of view, despite the fact it contradicts just about every accepted premise in the industry.
Unfortunately too many developers focus on issues which are horribly outdated. Machines have grown tremendously (I remember working on a system that ran a School District with over 10,000 students...All on 16K of memory and the LARGEST disk was 128KW with a clock speed of 500Khz) is nearly every aspect.
The TRUE cost of software these days is both in time to market and post release support. Custom implementations (when a standard implementation will accomplish the job) can increase this cost significantly.
I hate to think what the impact of James' approach is to target companies who must support the software after he (and any staff trained directly by him) is gone. [Actually I book 3-5 contracts a year directly based on this type of situation].
He continualy bemoans modern (even 10+ year old) development platforms, and techniques. My professional experience has been that my current .Net applications actually perform faster [on a 1-2 year old machine], than many "Highly optimized" programs written in C++ did 5-7 years ago (again on machines that were 1-2 years old at the time the application was deployed). Additionally the current releases of these applications cost less to develop (even though hourly development rates have tripled), and technical support (including defect management) is less than 25% of what it was (based on ratio of new development hours to maintenance hours).
Re: [RESOLVED] Why does this code leak memory?
I thought that my descriptions of what I do was enough? I thought that I made myself perfectly clear in what I end up doing and how I write my code.
Here is the left button down handler code.
CListBox::OnLeftButtonDown
I will not post any more. The only reason that I made the decision to post it, is it may in fact help show somewhat how it is constructed, and for someone to better judge the quality of the control.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by TheCPUWizard
Seriously James just seems to like to debate his point of view, despite the fact it contradicts just about every accepted premise in the industry.
Now quotes from this thread are now being used on another forum.
James, if you're going to use quotes by Hermit, myself, or CPUWizard on another forum, the proper decorum is to let us know beforehand, or at the very least, "ping" the people you are refering to on the other forums.
I don't want to see a list box class. I want to see two sets of source code, one using your vector, and another using std::vector, with timing tests. I don't do list boxes (and many in the non-Visual C++ forum don't deal with this), as most of my work deals with imaging and back end issues. So a list box class means nothing to me, as a vector of system information won't mean much to a GUI designer.
This is what is asked for by any C++ programmers when confronted with the claims that "my x is faster than you're 'x'). No one is saying you didn't do a lot of work, but up to this point, there is no definitive data as to your claims.
You're lucky this is CodeGuru and not one of the established C++ newsgroups such as comp.lang.c++ The people there wouldn't take it this far without seeing an actual self-made vector class.
For example, these people have worked hard on providing test cases, reasons, actual code, timing tests, etc. when comparing their implementation of the standard containers for gaming as opposed to using the generic containers (and a lot of the 'C' library routines).
http://www.open-std.org/jtc1/sc22/wg...007/n2271.html
If you're going to make claims, what you see at that link, or even a subset of what you see there, is what is expected as far as evidence is concerned.
Regards,
Paul McKenzie
Re: [RESOLVED] Why does this code leak memory?
You know this seems to be a very complicated way of reinventing the wheel.
The Tree and list common controls have offered the dispinfo callbacks which allows a user to back the control to a container.
These allow the controls to not store string data in the control, just like James is touting his approach to the listbox. These controls have had the dispinfo support since NT3.51 and Win95 (both released in 1995).
For a nice little sample, see the LVSelState.zip file in the reply to the 'How to implement a virtual listview' post.
This sample uses a std::list that is used to back a virtual list control. The sample by default fills the list with 60,000 virtual items. The user can go up to a million by changing a spin control.
This sample uses multithreading and creates a secondary thread that fills the std::list. It's done for fun and the std::list is made thread-safe with the help of a couple of simple sync wrapper classes.
Since everyone is displaying code, I figured I'd add some too. Here's an example of attaching the std::list to the virtual list control and 'populating' the control.
Code:
// Method: Populate
// Purpose: Called within the Retrieve button to add items to
// the list control.
// NOTE: It can be argued that this type of data manager class shouldn't
// include any UI interaction. As such I could have cycled through the
// CLVItemDataList in the dialog itself and populated the list control.
// However I chose not to because that would have mean I would have had
// to publicly expose the CLVItemDataList member and force users to
// lock the list in the dialog class.
HRESULT Populate( CListCtrl& ctlList )
{
HRESULT hr = S_OK;
// Add a wait cursor
CWaitCursor wait;
long lIndex = 0;
LVITEM lvitem = { 0 };
int iItem = 0, iActualItem = 0;
lvitem.mask = LVIF_TEXT | LVIF_IMAGE | LVIF_PARAM;
// Lock the List Control window so it doesn't look
// like crap while filling
ctlList.LockWindowUpdate();
// Lock the list
CAutoLockT< CLVItemDataList > lock( &m_LVItemDataList );
// Cycle through the list and add each item into the list control
for(CLVItemDataList::iterator it = m_LVItemDataList.begin();
it != m_LVItemDataList.end();
it++)
{
CLVItemData* pLVItemData = (*it);
lvitem.iItem = iItem;
lvitem.iSubItem = 0;
lvitem.lParam = reinterpret_cast<LPARAM>(pLVItemData);
// Tell the list control to become 'Virtual' and request display
// data from the lvitem.lParam.
// See CLVSelStateDlg::OnLvnGetdispinfoList for where this happens
lvitem.iImage = I_IMAGECALLBACK;
lvitem.pszText = LPSTR_TEXTCALLBACK;
lvitem.cchTextMax = MAX_LVITEMLEN;
// Insert the item
ctlList.InsertItem(&lvitem);
// Set the text for each column (all virtual callbacks)
ctlList.SetItemText(iItem, 0, LPSTR_TEXTCALLBACK);
ctlList.SetItemText(iItem, 1, LPSTR_TEXTCALLBACK);
ctlList.SetItemText(iItem, 2, LPSTR_TEXTCALLBACK);
iItem++;
// Check if user cancelled operation
if( WAIT_OBJECT_0 == WaitForSingleObject( GetShutdownEvent(), 0 ) )
{
return hr;
}
// Pump messages to allow cancel button click
// during control filling. Populating the control
// with thousands of records takes time, so we need
// to process messages to handle the cancel button
// and close messages
if(!ProcessMessages( ))
{
return hr;
}
// Restore the wait cursor
// (if you don't do this, the wait cursor may go
// away after processing messages)
wait.Restore();
}
// Unlock the List control window
ctlList.UnlockWindowUpdate();
return hr;
}
I could post the code that shows how the control displays its data when a dispinfo message is received, but I'm sure you've seen enough of my code for the day. If you are that interested, just follow the link.
Lastly, I make no claims about performance. Certainly the virtual list control is faster than it's non-virtual counter part. And it's fast enough to be used within windows (like the right pane of explorer). That's good enough for me.
Re: [RESOLVED] Why does this code leak memory?
I was aware of this. Just as I am aware that there is a mechanism for an edit control that allows you to allocate the memory it uses.
Anyways... My point here is this. If we need something minor, and that fits better than something has loads of overhead for the job at hand (vector does no matter how you slice it), and a basic collection class is not that hard to 'roll your own' if you know what you are doing.
We could all be doing this instead
Instead of arguing over little things.
Re: [RESOLVED] Why does this code leak memory?
Quote:
Originally Posted by JamesSchumacher
and a basic collection class is not that hard to 'roll your own' if you know what you are doing.
I can imagine handing off code to clients where I would have to justify hand written linked lists and other container classes. That isn't going to fly.