-
October 29th, 2005, 08:54 AM
#1
Maximum size of an array
Hi,
I have a problem, and I searched the CodeGuru forum but could not find the answer.
I am trying to generate a non overlapped uniform random number between 1 and 2*23999976. Below is my code.
#include <iostream>
#include <fstream>
#include <cstdlib> // for rand() and RAND_MAX
using namespace std;

int main()
{
    long int n = 2*23999976;
    long int j[n];
    ofstream out("rand.dat");
    for (long int i = 0; i < n; i++) {
found:
        j[i] = 1 + (long int)(2.0*23999976.0*rand()/(RAND_MAX+1.0));
        for (long int k = 0; k < i; k++) {
            if (j[k] == j[i]) goto found;
        }
        out << j[i] << endl;
    }
    out.close();
    return 0;
}
It compiles fine, but when I run it, it says "Segmentation fault". If I use a small number instead of 2*23999976, then it works fine. So, I guess it is due to a limit on the size of the array that I have used. What is the maximum size of a 1-D array? Is there any other way to do this?
Thank you very much.
Last edited by manojg; October 29th, 2005 at 09:33 AM.
-
October 29th, 2005, 09:36 AM
#2
Re: Maximum size of an array
The array you defined (a VLA) contains 2*23999976*4 bytes = about 184 MB.
The stack is not big enough to contain that.
You can allocate it on the heap instead, or choose a larger stack size in the .DEF file.
However, your program will probably need a lot of time to do its work!
Last edited by SuperKoko; October 29th, 2005 at 04:49 PM.
Reason: Correction of the array size calculation.
"inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
Club of lovers of the C++ typecasts cute syntax: Only recorded member.
Out of memory happens! Handle it properly!
Say no to g_new()!
-
October 29th, 2005, 10:13 AM
#3
Re: Maximum size of an array
Hi SuperKoko,
Thank you very much for your reply. Could you please give an example of how to do this?
-
October 29th, 2005, 10:30 AM
#4
Re: Maximum size of an array
Instead of
Code:
long int n = 2*23999976;
long int j[n];
do
Code:
long int n = 2*23999976;
long int *j = new long int[n];
// clean up when done
delete[] j;
- petter
-
October 29th, 2005, 10:41 AM
#5
Re: Maximum size of an array
-
October 29th, 2005, 12:18 PM
#6
Re: Maximum size of an array
You can also use a std::vector.
Just replace
Code:
long int j[n];
with:
Code:
std::vector<long int> j(n);
A few things:
- If RAND_MAX is not sufficient, (2.0*23999976.0*rand()/(RAND_MAX+1.0)) will not return a random number between 0 and 2.0*23999976.0, but only RAND_MAX different numbers. So, if RAND_MAX is 32767 (as in the BC++ implementation), your program will perform an infinite loop.
- Your algorithm is extremely slow. To choose the last integer, it will need about 2*23999976 tests, and each test will need about 2*23999975 operations: 2,303,995,296,002,400 operations! If the computer is able to perform 1,000,000,000 operations each second, it will need about 2,303,995 seconds, i.e. 26 days. The previous integer would need about 13 days to be found. A much faster algorithm is possible: you start by creating a big table containing all ordered integers between 0 and 2*23999976, and then you shuffle the table. You just need to swap each element at table[i] with a random element at another place in the table.
- Choose better variable names; one letter is generally not sufficient. For example, j is generally used for an insignificant index, not for an array.
- Don't use magic numbers (numbers appearing directly in the code, and whose value is arbitrary). Instead, it is preferable to define constants. For example:
Code:
const std::size_t RandomArraySize=2*23999976L;
Last edited by SuperKoko; October 29th, 2005 at 12:27 PM.
"inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
Club of lovers of the C++ typecasts cute syntax: Only recorded member.
Out of memory happens! Handle it properly!
Say no to g_new()!
-
October 29th, 2005, 01:50 PM
#7
Re: Maximum size of an array
Originally Posted by Superkoko
Your algorithm is extremely slow.
To choose the last integer, it will need about 2*23999976 tests, and each test will need about 2*23999975 operations: 2,303,995,296,002,400 operations! If the computer is able to perform 1,000,000,000 operations each second, it will need about 2,303,995 seconds, i.e. 26 days.
Well, if we want to compute it more exactly (writing Z for sigma, the sum over n from 1 to N = 2*23999976):
in the worst case it would take Z[n(n-1)] operations.
Z[n(n-1)] = Z[n^2 - n] = Z[n^2] - Z[n] = N(N+1)(2N+1)/6 - N(N+1)/2
With N = 2*23999976 that comes to 368,627,731,252,483,959,720 operations, which is about 11,689 years for a 1 GHz CPU!!
-
October 29th, 2005, 04:36 PM
#8
Re: Maximum size of an array
Oh, a goto statement in a C program?!
Last edited by olivthill; October 31st, 2005 at 04:28 AM.
-
October 29th, 2005, 04:44 PM
#9
Re: Maximum size of an array
Originally Posted by mehdi62b
Well, if we want to compute it more exactly (writing Z for sigma, the sum over n from 1 to N = 2*23999976):
in the worst case it would take Z[n(n-1)] operations.
Z[n(n-1)] = Z[n^2 - n] = Z[n^2] - Z[n] = N(N+1)(2N+1)/6 - N(N+1)/2
With N = 2*23999976 that comes to 368,627,731,252,483,959,720 operations, which is about 11,689 years for a 1 GHz CPU!!
Actually, if we don't assume anything about the pseudo-random number generator, it may take an infinite time.
It is comparable to this problem: choose binary random numbers (0 or 1) and wait until the returned number is zero.
It may still not have returned after 1000 loops (with a probability equal to 1/2^1000).
It may also never return zero (with a probability of 0, but the event still exists), and in that case the program is blocked indefinitely.
However, since this is only a pseudo-random generator, it is possible that all numbers are generated, and in that case an infinite loop is impossible.
It is also possible that an infinite loop has a non-null probability!
"inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
Club of lovers of the C++ typecasts cute syntax: Only recorded member.
Out of memory happens! Handle it properly!
Say no to g_new()!
-
October 29th, 2005, 06:12 PM
#10
Re: Maximum size of an array
However, since that is only a pseudo-random generator, it is possible that all numbers are generated, and in that case, an infinite loop is impossible
Yes, I assumed that, for generating the nth item, the probability would reach 1 after calling the rand function n times.
-
October 30th, 2005, 04:54 PM
#11
Re: Maximum size of an array
For problems related to printing a range of numbers in random order, random_shuffle is usually the tool of choice. In this case however, the large amount of numbers themselves take a lot of memory. Fortunately, one doesn't have to store the numbers.
An alternative solution would be to store one bit of information about every number (whether it has been already printed or not). The next value will always be chosen randomly from the count of remaining values. Finding out the number corresponding to a value is rather slow, if almost the whole array must be read every time in order to count the preceding unprinted numbers. This can be sped up by making seek blocks, which contain the count of unprinted numbers in a specific section.
The end result seems to be rather fast and only takes a reasonable amount of memory.
Code:
#include <cstdlib>
#include <ctime>
#include <vector>
#include <fstream>

// Print out a range of integer values in random order
int main() {
    int target_count = 2*23999976; // number of values
    int target_offset = 1;         // starting from value
    // divide source range into blocks with 2^n items in each
    // where 2^n is approx. sqrt(target_count)
    // each block is initialised with the number of unprinted values
    int seek_block = 0;
    for (int b = 1; b < target_count; b <<= 1) ++seek_block;
    seek_block >>= 1;
    std::vector<unsigned short> seek_index(target_count / (1 << seek_block) + 1, (1 << seek_block));
    // main table: might require one byte per item or
    // just one bit, depending on implementation
    std::vector<bool> numbers_sent(target_count, false);
    // initialise rand() with seed value
    srand((unsigned)time(NULL));
    std::ofstream out("rand.dat");
    for (int i = 0; i < target_count; ++i) {
        int random_source = rand();
        // note: assumes RAND_MAX is small (e.g. 32767);
        // RAND_MAX + 1 would overflow if RAND_MAX == INT_MAX
        if (RAND_MAX + 1 < target_count - i) {
            // combine two random values if the range is too large
            random_source = (random_source * (RAND_MAX + 1)) + rand();
        }
        // the range of produced values is reduced after every value
        int random_target = random_source % (target_count - i);
        int j = 0, k = 0;
        // seek to approximate location of Nth unprinted value
        for (; k <= random_target; k += seek_index[j], ++j);
        --j;
        k -= seek_index[j];
        j <<= seek_block;
        // find out the exact location of Nth unprinted value
        for (; k <= random_target; k += !numbers_sent[j], ++j);
        --j;
        // mark value as printed
        numbers_sent[j] = true;
        --seek_index[j >> seek_block];
        out << j + target_offset << "\n";
    }
    out.close();
    return 0;
}
-
October 30th, 2005, 07:01 PM
#12
Re: Maximum size of an array
Code:
std::vector<bool> numbers_sent(target_count, false);
Use bitset instead of vector<bool>. This saves a lot more space:
Code:
#include <bitset>
//...
std::bitset<2*23999976> numbers_sent;
Now you are dealing with actual bits, making the number of bytes allocated around 5,999,994.
Regards,
Paul McKenzie
-
October 31st, 2005, 01:39 AM
#13
Re: Maximum size of an array
Originally Posted by Paul McKenzie
Use bitset instead of vector<bool>. This saves a lot more space
Now you are dealing with actual bits, making the number of bytes allocated around 5,999,994.
In theory, there doesn't seem to be much of a difference between vector<bool> and bitset, except that vector is more common and its syntax can often be recited from memory. Also, it doesn't require a compile-time constant number of entries. In practice, though, not all vector<bool> implementations are memory optimised.
-
October 31st, 2005, 05:44 AM
#14
Re: Maximum size of an array
vector<bool> is a specialisation but should generally be avoided.
You could use a std::set and add each number as it's generated. You would have the advantage of saving space, but would have an O(log N) lookup as you generated more numbers. You'd also have to generate another random number whenever you hit a previously generated one.
Another alternative is to use std::map< int, int > (or unsigned ints) to simulate a shuffled table without storing all of it. You must keep a count of the remaining "size", i.e. you count down. Each time you generate a random index within the remaining size, you do the following:
- Look it up in the map. If it exists, your picked value is the mapped ("second") value; if it does not exist, the picked value is the index itself.
- Reduce your size by 1.
- Move your last element into the slot of the one you just picked: if the last element is mapped, move its mapped value into the picked slot and delete the last element's entry from the map; otherwise store the last index itself in the picked slot, overwriting any old value.
This would probably be the most efficient method if you are picking a medium-sized number of elements and you want to be more efficient in memory space (i.e. not cache a large table).
-
October 31st, 2005, 08:11 AM
#15
Re: Maximum size of an array
Originally Posted by NMTop40
vector<bool> is a specialisation but should generally be avoided.
"Vector<bool> Considered Harmful"? Are there any other problems than the obvious performance/efficiency-related?
...
This would probably be the most efficient method if you are picking a medium-sized number of elements and you want to be more efficient in memory space (i.e. not cache a large table).
Perhaps I didn't understand your cunning plan, but how does this reduce the peak memory consumption, especially after map/set container overhead? If we decide there's enough memory to keep every value in memory at some point, then we might as well go down the random_shuffle route, which would be O(n).