Click to See Complete Forum and Search --> : Not repeating an operation to cutdown runtime


chris_t_morin
June 1st, 2008, 02:56 PM
Hi,

In my code, I have make an array a size dependent on the amount of numbers that meet two conditions and then fill that array with those numbers. The following code work, but I have to do the exact same operation twice, once to find the size of the array and the second to fill it. Is there a way to avoid this because if NUM is very big this process becomes very long (it's basically prime factorization).

Also, in the if statements, when the computer finds that the first one is false, does it bother checking the second or does it continue on?


class survivor
{
public:
int * pps;
int sizepps;
survivor()
{
sizepps = 0;
for(long long i = 3; i<NUM;i+=2)
{
if(checkPrime(i)&& NUM%i==0)
++sizepps;
}
pps = new int[sizepps]
int curloc=0;
for(long long i=3; i<NUM; i+=2)
{
if(checkPrime(i)&& NUM%i==0)
pps[curloc] = i;
}
}
};



Thanks for the help

Lindley
June 1st, 2008, 03:02 PM
I'm first going to explain the theoretical way you should handle this, then I'll tell you the practical way it should be done.

When you're filling an array and you don't know its size beforehand, you begin by assuming it has some size, and you set a capacity variable to this and allocate it.

If at any time you run out of room, you double the capacity, allocate a new array, copy the old values into it, and free the old one. Continue until you're done. Optionally, you can do one final allocate/copy/free to cut down your array size, but usually it doesn't matter too much that you're over-allocated at the end.

In C, the realloc() function helps with this process.

C++ doesn't have such a thing; but it *does* have std::vector, which essentially does precisely this process internally as you push_back() each element. So I'd suggest just using a std::vector<int> here.

TheCPUWizard
June 1st, 2008, 03:13 PM
Lindley, Good Post... :wave:

A couple of things...

1) The "doubling" works very well for small to medium sizes, but can get "nasty" (consider if you hage a 1GB vertor, and need to add one element. :eek:

Granted, you probable already have other problems...but one solution is to "cap" the size increase. Some classes I have seen use a hard cap of a page-size, other use a logarythmic rolloff.

2) Again with larg objects..It may be better to wrap the structure to use segments. [side note: if using CLR this is critical at the 80KB boundary!!!!].

Internally you use a List<Vector<T>> (or equiv), and perform div/mod on the provided flat index....


Again, both of these are only worth considering for "large" allocations....

Hermit
June 1st, 2008, 03:36 PM
2) Again with larg objects..It may be better to wrap the structure to use segments. [side note: if using CLR this is critical at the 80KB boundary!!!!].

Internally you use a List<Vector<T>> (or equiv), and perform div/mod on the provided flat index....
Wouldn't an STL deque be suitable for this purpose (in most cases)?

spoon!
June 2nd, 2008, 12:47 AM
1) The "doubling" works very well for small to medium sizes, but can get "nasty" (consider if you hage a 1GB vertor, and need to add one element. :eek:

Granted, you probable already have other problems...but one solution is to "cap" the size increase. Some classes I have seen use a hard cap of a page-size, other use a logarythmic rolloff.
From a CS point of view, the "doubling" is precisely what's good about it at large sizes. Increasing by a fixed percentage gives us the amortized constant time complexity of adding elements to a vector (which is required by the C++ standard for vectors). If you "cap" the size increase, you will make adding elements an amortized linear time operation.

DreamShore
June 2nd, 2008, 03:10 AM
Problem here is you need either remember the result, or calculate them again. And recalculating is seemingly more expensive.

A most simple way is a constant sized buffer, that you know the work won't exceed that buffer.

If you are not sure about the working size, dynamic-sized buffer.

A chain of buffers is a little complex, but could be more efficent.

If you are handling something too large, write it to file. Files are good dynamic-sized buffers if you don't access them often.

p.s. maybe just loop until NUM / the_first_prime_you_found, or simpler NUM / 2, which is less efficient.

chris_t_morin
June 3rd, 2008, 01:54 AM
It's more complicated than I thought it was. The vector thing looks promising but I'll have to read up on them, I only started programming a couple weeks ago.

@ Dreamshore: Is a separate file the only kind of buffer, cause It really would have to be a dynamic sized buffer.

Thanks for your help people.

DreamShore
June 3rd, 2008, 02:33 AM
Is a separate file the only kind of buffer, cause It really would have to be a dynamic sized buffer.

I don't quite get it. Buffer can be anything the data can be stored. You can write down the result on a paper, and the paper is the buffer.

Use file when the data is too large that beyond the capacity of your RAM. I don't really think it would happen in your case anyway.

The mechanism of file is similar to a chained buffer, though. Maybe there is the implementation of chained buffer in STL. Others here are more familiar with STL than I.

--

Oh, forgot to say. If I were you, I'll just use a constant sized buffer if I only need to deal with int. Primes below 2 ^ 31 can't be that many.

*EDIT... well I found that the number of primes is far beyond my imagination... But anyway, number below 2 ^ 32 is at most a product of 32 numbers - exactly the size of buffer you need.

Well edit again, not exactly, but at most. I forget that you are not listing all prime factors.

DreamShore
June 3rd, 2008, 03:06 AM
The conclusion: number below 2 ^ 32 (not included) can not be made up of more than 9 different prime numbers. You don't need to worry about memory at all.