Re: Any way to improve this?
Try using a list instead of a vector, I'm not sure if this is true or not but every time you call push_back with the vector it has to allocate a new array and copy all the elements into it again. Since you don't really need the random access of the vector, it will function the same way if you use a list and the list doesn't take a lot of time when it grows.
Re: Any way to improve this?
Hi.
Quote:
Originally Posted by IllegalCharacter
every time you call push_back with the vector it has to allocate a new array and copy all the elements into it again
Not 'every time'... but, usually (some implementations) when you reach the end of a vector it's resized to twice the original size. That's why iterators to vector can become invalid. Note, however, that a list will actually allocate memory for 'every' push back.
Anyway, if the file is always around this big, a first (and quick) attempt to improve performance is to initialize the vector with 400.000 elements. Try this and let us know.
Re: Any way to improve this?
IllegalCharacter has a good point here.
Vectors of this size don't always work out so well, and as you push the limits of RAM you run into a RAM fragmentation problem.
That is, unless you have a 64bit OS and gobs of RAM.
Certainly we know that the string will allocate it's own space, and that will be a small amount, but the vector is 'an array of string objects' that require contiguous RAM. As you fill the vector, the problem IllegalCharacter identified causes RAM to be fragmented, among all those small string objects.
Eventually you reach a 'straw that breaks the camel's back' situation, where you have available RAM, with VM to back it up, but you get an out of memory error because a contiguous block of RAM can no longer be found to expand the vector, for it's array of string objects.
The list all but eliminates this problem, so long as you don't have OTHER large sized objects requiring contiguous RAM for allocation.
The copying of the vector is also a speed issue that is alleviated with 'size' for the projected volume - and, too, if it succeeds, it succeeds. That helps avoid the fragmentation because, since it no longer copies or moves the vector - things stay fairly organized in RAM (give or take).
If you're working on example code, this is all fine - you get it working, move on to the next assignment.
On the other hand, if you're preparing to write 'production code' - this is one of those issues that can require understanding. It's also a reason C# is becoming to popular - you don't have to think of this stuff so much.
This whole problem was much worse 'back in the day' - when 32Mbytes of RAM was $2,000. These days, though - when the problem sizes increase, it works out to be the same thing all over again.
Re: Any way to improve this?
Code:
list<string>::iterator i;
list<string>::iterator j;
for (i = wordList.begin(); i != wordList.end(); ++i)
{
const string & word1 = *i;
for (j = ++i; j != wordList.end(); ++j)
{
int count = 0;
const string & word2 = *j;
for (int k = 0; k < 5; ++k)
{
if (word1[k] != word2[k])
{
if( ++count > 1) break;
}
}
if (count == 1)
{
g.addEdge(word1, word2, 1);
g.addEdge(word2, word1, 1);
}
}
}
Ok, so I tried implementing it as a list and I am trying to use iterators to search through the list, but I am having trouble trying to figure out how to use them so they still compare all pairs of words.
Re: Any way to improve this?
Your new code should work, except for when you assign j to ++i. This will correctly assign j, however it will also increment your i. Then when the end of the first for loop is reached, your i will be incremented as well, so you'll be jumping over every second word. I assume this is not what you want, since your first program did not do this.
Quote:
Originally Posted by ltcmelo
Anyway, if the file is always around this big, a first (and quick) attempt to improve performance is to initialize the vector with 400.000 elements. Try this and let us know.
This should work just fine, also without much worries about memory fragmentation as you are not allocating several small chunks. If there is enough memory free (not necessarily contiguous) then I believe the OS will defragment the memory to make a block big enough for your vector and all will be nice. Lists will also get the job done, they DO allocate memory each time but its not much and there is no copying of data each time which is the real slow-down of a vector.
Re: Any way to improve this?
How can I change the assignment to j so that it works correctly and is always one place ahead of i? I'm having trouble with this.
Re: Any way to improve this?
Quote:
If there is enough memory free (not necessarily contiguous) then I believe the OS will defragment the memory to make a block big enough for your vector
This is true in managed C++, if using managed types - but this isn't the managed C++ forum, so I assume it's not a managed target.
In native (unmanaged, as in the way it's been for decades) - there is no de-fragmentation of RAM by the standard CRT.
It can 'psuedo' de-fragment, if you release all memory :)
That's why I pointed this out in the earlier post in this thread.
Re: Any way to improve this?
Quote:
Originally Posted by Ehump20
How can I change the assignment to j so that it works correctly and is always one place ahead of i? I'm having trouble with this.
When you initialize j, you set it equal to i, and then before you do anything to it you put j++, like this:
Code:
for(j = i; j < wordList.end(); j++){
j++;
//... etc...
Quote:
Originally Posted by JVene
there is no de-fragmentation of RAM by the standard CRT.
I've never actually used managed C++, so I don't know how it handles memory allocation. Doesn't Windows handle memory defragmentation? It's silly that it doesn't (if it doesn't), however I suppose Windows isn't exactly the most efficient software out there...
Re: Any way to improve this?
It's not so much a 'Windows' problem, but an issue of how memory is represented.
Say you have a few thousand objects allocated from the heap, all held by pointers.
Ask yourself, how could the operating system move the memory around without invalidating the pointers?
It can't.
In managed code, it's expected that physical locations can change because the allocated memory isn't held by a pointer, but by a handle. The execution of the program is stopped while .NET's runtime moves memory around. The definition of the language (even the extensions to C++) allow this to happen, such that when the defrag is over, execution can continue without having invalidated handles identifying allocated material.
To say it's silly that Windows doesn't do this for unmanaged code is, well....a bit naive, if you'll pardon my saying that (I don't mean to be rude about that). This is true for all C and C++ (Pascal and a host of other languages, too) - whether in Linux, Mac, DOS, Windows - anywhere.
The entire concept of garbage collection (which can be read as de-fragmentation of the heap) is a computer science classic. There may be C or C++ 'like' languages that support it, there may even be a garbage collected manager for C and C++ you can use (I'm not aware of any).
Garbage collection imposes a performance penalty. For certain targets it can be a reasonable performance penalty, favoring the management of the heap over speed. There are entire categories of application targets for which the performance penalty is inappropriate, and those targets are exactly where C++ is most suitable. For that, C and C++ developers must take the issue into account. It's obviously viable to do so, because some of the world's most ambitious, reliable and high performance applications are written in C++ (3DS Max and Photoshop, for example).
This issue is exactly why some developers prefer C# over C++ (and, conversely, why C++ programmers prefer C++ - when they know how to handle the issue).
Re: Any way to improve this?
The reason I believe that defragmentation isn't done in native C++ is because it's too low level.
Managed C++ is another layer of abstraction on top of C++, where it includes a higher level memory allocation and garbage collection scheme.
At the native C++ level, all memory is flat. On a 32-bit PC with up to 4GB of ram, all that memory is addressable with a 32-bit pointer. When calling malloc/new you are returned a pointer for that 4GB span of addressable space. It's static. If you move the memory block, the pointer becomes invalid.
All Intel-based processors from 386's and up provided the 'protected mode' that officially made the 640k barrier obsolete. (remember that?) When in protected mode, a 32-bit pointer addresses up to 4GB of RAM, instead of the segment:index pair that could address only 1MB. However, the 32-bit pointers are still relative to a base offset that's defined in the descriptor table. Code references a descriptor by referencing the index (aka selector) in the descriptor table of the element and places it in the appropriate 'ds' or 'es' selector register (no longer a segment register).
It would be feasible then to create descriptors for separate memory blocks, where you could defragment silently by moving the memory and updating the descriptor. The selector and pointer will remain valid.
Great idea, but it is not always feasible for code to create a descriptor for every block of memory ever allocated. The local descriptor table can only hold 8191 elements. Therefore, we must utilize static pointers on a common selector. Usually, this is a local descriptor table associated with the running process. Therefore, you have only access to memory in your own process and cannot corrupt memory outside, or you get a 'segmentation fault' or 'page fault exception' depending on your platform.
Sorry for the ramble. :)
Re: Any way to improve this?
I love how this topic went from helping somebody speed something up to a discussion of memory fragmentation, etc. You guys should teach a course ;)
Quote:
Originally Posted by JVene
Ask yourself, how could the operating system move the memory around without invalidating the pointers?
It can't.
Actually it can, since the memory addresses used in a program are in logical memory space, whereas the OS has access to supervisor mode and can manipulate memory pages in machine space. I don't think this has an effect on what we are talking about, but anyway.
I myself am not a huge fan of managed languages, as the garbage collection system can be a bit clunky and you can't do pointer tricks! I love the beauty of a good smart pointer.
Re: Any way to improve this?
Quote:
Originally Posted by JVene
Vectors of this size don't always work out so well, and as you push the limits of RAM you run into a RAM fragmentation problem.
That is, unless you have a 64bit OS and gobs of RAM.
What "that size"? What "gobs of RAM"?
400,000 5-characters words take 2MB, 2.5MB if you count zero-terminators.
This amount of memory is not a problem for years now.
If all words are 5 characters long, there is a very little benefit of keeping them in strings. You would use "string" class when you do some string manipulations, specifically if you need to change its size.
A simple 6-char buffer will do just fine.
You can get the size of the file, divide by 6 to get maximum number of 5-characters words and reserve() that space in your vector. Then, you don't even need a vector. With a fixed word length, a char array will do too.
Also, you need to do some sort of profiling before talking about performance. Do you know WHAT takes 30 seconds? Put a few call to GetTickCount() between your blocks of code to narrow down the problem.
Looks like your triple-nested loop is a suspect. Also, how much time does it take to add things to your Graph? Comment all (well, all three) calls to it and see if it saves any time (or most of it).
BTW, using list for a large number of small objects is NOT a good idea. The overhead will kill you. The benefits of the list over vector are cheap insertions / additions / deletes. The code here doesn't insert or delete, and the additions are taken care of by a call to reserve(). The drawback of the list is more expensive access, which we have plenty of in this example.
Re: Any way to improve this?
How do I create a 6 char buffer for this?
Re: Any way to improve this?
Some things:
1) What is the number of 5 letter words ? If there are only 400000
total words in the file, I suspect the number of 5 letter words is
a lot less.
2) Even if all the words are 5 letters, it should not take long to read
them and push them onto a vector. Have you timed this part only ?
I doubt that this will take much over a second. I doubt changing to a list
or a char[6] buffer will decrease your total time by much.
3) The main problem is the algorithm is O(N*N). Unfortunately, off the
top of my head, I can not think of any way of reducing this do
O(N*LOGN) or better.