CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Page 1 of 2 12 LastLast
Results 1 to 15 of 17
  1. #1
    Join Date
    Mar 2005
    Posts
    55

    Maximum size of an array

    Hi,

    I have a problem and I searched codeguru forum but I could not find the answer.

    I am trying to generate a non overlapped uniform random number between 1 and 2*23999976. Below is my code.


    #include<iostream>
    #include<fstream>
    using namespace std;

    main()
    {
    long int n = 2*23999976;
    long int j[n];

    ofstream out("rand.dat");

    for(long int i = 0; i < n; i++){
    found:
    j[i] = 1 + (long int) (2.0*23999976.0*rand()/(RAND_MAX+1.0));
    for(long int k = 0; k < i; k++){
    if(j[k] == j[i])goto found;}

    out << j[i] << endl;}

    out.close();
    return 0;
    }


    It compiles fine but when I run it, it says Segmentation fault. If I use small number instead of 2*23999976 then it works fine. So, I guess it is due to limit on the size of the array that I have used. what is the max. size of an 1-d array? Is there any other way to do this problem?

    Thank you very much.
    Last edited by manojg; October 29th, 2005 at 09:33 AM.

  2. #2
    Join Date
    Feb 2005
    Location
    Normandy in France
    Posts
    4,590

    Re: Maximum size of an array

    The array you defined (a VLA) contains 2*23999976*4 bytes = 184 MB.
    The stack is not big enough to contain that.
    You can allocate that on the heap, or choose a greater stack in the .DEF file.
    However, your program will probably needs much time to make its work!
    Last edited by SuperKoko; October 29th, 2005 at 04:49 PM. Reason: Correction of the array size calculus.
    "inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
    Club of lovers of the C++ typecasts cute syntax: Only recorded member.

    Out of memory happens! Handle it properly!
    Say no to g_new()!

  3. #3
    Join Date
    Mar 2005
    Posts
    55

    Re: Maximum size of an array

    Hi SuperKoko,

    Thank you very much for you reply. Could you please give an example how to do this.

  4. #4
    Join Date
    Apr 2005
    Location
    Norway
    Posts
    3,934

    Re: Maximum size of an array

    Instead of
    Code:
    long int n = 2*23999976;
    long int j[n];
    do
    Code:
    long int n = 2*23999976;
    long int *j = new long int[n];
    
    // clean up when done
    delete[] j;
    - petter

  5. #5
    Join Date
    Mar 2005
    Posts
    55

    Re: Maximum size of an array

    Thanks a lot guys.

  6. #6
    Join Date
    Feb 2005
    Location
    Normandy in France
    Posts
    4,590

    Re: Maximum size of an array

    You can also use a std::vector:
    Just replace
    Code:
    long int j[n];
    With:
    Code:
    std::vector<long int> j(n);
    A few things:
    • if RAND_MAX is not sufficient (2.0*23999976.0*rand()/(RAND_MAX+1.0)) will not return a random number between 0 and 2.0*23999976.0, but only RAND_MAX different numbers.
      So, if RAND_MAX is 32767 (as in BC++ implementation), your program will perform an infinite loop.
    • Your algorithm is extermely slow.
      To choose the last integer, it will need about 2*23999976 tests. And each test will need 2*23999975 operations. 2,303,995,296,002,400 operations! If the computer is able to perform 1000,000,000 operations each second, it will need about 2,303,995 seconds=26 days.
      The previous integer, would need about 13 days to be found.

      A much faster algorithm is possible:
      You start by creating a big table containing all ordered integers between 0 and 2*23999976.
      And than you shuffle the table.
      You just need to swap each element at table[i] with a random element at another place in the table.
    • Choose better variable names. one-letter is generally not sufficient.
      For example, j is generally not used for an array, but for an insignificant index.
    • Don't use magic numbers (numbers appearing directly in the code, and whose value is arbitraty).
      Instead, it is preferable to define constants.
      For example
      Code:
      const std::size_t RandomArraySize=2*23999976L;
    Last edited by SuperKoko; October 29th, 2005 at 12:27 PM.
    "inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
    Club of lovers of the C++ typecasts cute syntax: Only recorded member.

    Out of memory happens! Handle it properly!
    Say no to g_new()!

  7. #7
    Join Date
    Sep 2004
    Location
    Tehran(Ir)
    Posts
    469

    Re: Maximum size of an array

    Quote Originally Posted by Superkoko
    Your algorithm is extermely slow.
    To choose the last integer, it will need about 2*23999976 tests. And each test will need 2*23999975 operations. 2,303,995,296,002,400 operations! If the computer is able to perform 1000,000,000 operations each second, it will need about 2,303,995 seconds=26 days.
    well,if we want to compute it more exactly,
    Z=zigma
    in its worst case it would take : z[n(n-1)](from 1 to 2*23999976)

    z[n(n-1)] =z[n^2-n] = z[n^2]-z[n] = n(n+1)(2n+1)/6 - n(n+1)/2
    if n = 2*23999976 it would be 368627731252483959720 seconds which is 11689 years for a 1000 Mhz cpu !!

  8. #8
    Join Date
    Jun 2005
    Posts
    1,255

    Smile Re: Maximum size of an array

    Oh, a goto statement in a C program?!.
    Last edited by olivthill; October 31st, 2005 at 04:28 AM.

  9. #9
    Join Date
    Feb 2005
    Location
    Normandy in France
    Posts
    4,590

    Re: Maximum size of an array

    Quote Originally Posted by mehdi62b
    well,if we want to compute it more exactly,
    Z=zigma
    in its worst case it would take : z[n(n-1)](from 1 to 2*23999976)

    z[n(n-1)] =z[n^2-n] = z[n^2]-z[n] = n(n+1)(2n+1)/6 - n(n+1)/2
    if n = 2*23999976 it would be 368627731252483959720 seconds which is 11689 years for a 1000 Mhz cpu !!
    Actually, if we don't assume anything about the pseudo-random number generators it may take an infinite time.
    It is comparable to that problem : choosing binary random numbers (0 or 1), and wait until the returned number is zero.
    It may return after 1000 loops (with a probability equal to 1/2^1000).
    It may also never return zero (with a probability=0, but the event still exists), and in that case the program is indefinitely blocked.
    However, since that is only a pseudo-random generator, it is possible that all numbers are generated, and in that case, an infinite loop is impossible.
    It is also possible that an infinite loop has a non-null probability!
    "inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
    Club of lovers of the C++ typecasts cute syntax: Only recorded member.

    Out of memory happens! Handle it properly!
    Say no to g_new()!

  10. #10
    Join Date
    Sep 2004
    Location
    Tehran(Ir)
    Posts
    469

    Re: Maximum size of an array

    However, since that is only a pseudo-random generator, it is possible that all numbers are generated, and in that case, an infinite loop is impossible
    yes,I assumed for generating nth item,after calling Rand function for n times the probabilty would be 1.

  11. #11
    Join Date
    Feb 2004
    Location
    Helsinki, Finland
    Posts
    31

    Re: Maximum size of an array

    For problems related to printing a range of numbers in random order, random_shuffle is usually the tool of choice. In this case however, the large amount of numbers themselves take a lot of memory. Fortunately, one doesn't have to store the numbers.

    An alternative solution would be to store one bit of information about every number (whether it has been already printed or not). The next value will always be chosen randomly from the count of remaining values. Finding out the number corresponding to a value is rather slow, if almost the whole array must be read every time in order to count the preceding unprinted numbers. This can be sped up by making seek blocks, which contain the count of unprinted numbers in a specific section.

    The end result seems to be rather fast and only takes a reasonable amont of memory.

    Code:
    #include <cstdlib>
    #include <ctime>
    #include <vector>
    #include <fstream>
    
    // Print out a range of integer values in random order
    int main() {
    	int target_count = 2*23999976; // number of values
    	int target_offset = 1; // starting from value
    	
    	// divide source range into blocks with 2^n items in each
    	// where 2^n is approx. sqrt(target_count)
    	// each block is initialised with the number of unprinted values
    	int seek_block = 0;
    	for(int b = 1; b < target_count; b <<= 1) ++seek_block;
    	seek_block >>= 1;
    	std::vector<unsigned short> seek_index(target_count / (1 << seek_block) + 1, (1 << seek_block));
    
    	// main table: might require one byte per item or
    	// just one bit, depending on implementation
    	std::vector<bool> numbers_sent(target_count, false);
    
    	// initialise rand() with seed value
    	srand((unsigned)time(NULL));
    
    	std::ofstream out("rand.dat");
    
    	for(int i = 0; i < target_count; ++i) {
    		int random_source = rand();
    		if(RAND_MAX + 1 < target_count - i) {
    			// combine two random values if the range is too large
    			random_source = (random_source * (RAND_MAX + 1)) + rand();
    		}
    		// the range of produced values is reduced after every value
    		int random_target = random_source % (target_count - i);
    		int j = 0, k = 0;
    		// seek to approximate location of Nth unprinted value
    		for(; k <= random_target; k += seek_index[j], ++j);
    		--j;
    		k -= seek_index[j];
    		j <<= seek_block;
    		// find out the exact location of Nth unprinted value
    		for(; k <= random_target; k += !numbers_sent[j], ++j);
    		--j;
    		// mark value as printed
    		numbers_sent[j] = true;
    		--seek_index[j >> seek_block];
    
    		out << j + target_offset << "\n";
    	}
    	out.close();
    	return 0;
    }

  12. #12
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Maximum size of an array

    Code:
    std::vector<bool> numbers_sent(target_count, false);
    Use bitset instead of vector<bool>. This saves a lot more space:
    Code:
    #include <bitset>
    //...
    std::bitset<2*23999976> numbers_sent;
    Now you are dealing with actual bits, making the number of bytes allocated around 5,999,994.

    Regards,

    Paul McKenzie

  13. #13
    Join Date
    Feb 2004
    Location
    Helsinki, Finland
    Posts
    31

    Re: Maximum size of an array

    Quote Originally Posted by Paul McKenzie
    Use bitset instead of vector<bool>. This saves a lot more space

    Now you are dealing with actual bits, making the number of bytes allocated around 5,999,994.
    vector<bool>
    bitset

    In theory, there doesn't seem to be much of a difference, except that vector is more common and its syntax can often be recited from memory. Also, it doesn't require a constant number of entries. In practice though, not all vector<bool> implementations are memory optimised.

  14. #14
    Join Date
    Oct 2000
    Location
    London, England
    Posts
    4,773

    Re: Maximum size of an array

    vector<bool> is a specialisation but should generally be avoided.

    You could use a std::set and add each number as it's generated. You would have the advantage of saving space but would have an O(log N) lookup as you generated more numbers. You'd also have to generate another random number whenever you hit a previously generated one.

    Another alternative is to use std::map< int, int > (or unsigned ints).
    Whenever you generate a number you do this:
    - Look up in the map. If it exists then your new entry is the "second" value.
    - If it does not exist then it is the first value.

    You must keep a count of the remaining "size", i.e. you count down.
    Each time you generate a number you do the following:
    - Reduce your size by 1.
    - Move your last element into the space of the one you last picked. (If your last element is mapped, move its contents into the cell you picked and delete the previous entry from the map). If the last space you picked was mapped, overwrite the old value with the new one.

    This would probably be the most efficient method if you are picking a medium-sized number of elements and you want to be more efficient in memory space (i.e. not cache a large table).

  15. #15
    Join Date
    Feb 2004
    Location
    Helsinki, Finland
    Posts
    31

    Re: Maximum size of an array

    Quote Originally Posted by NMTop40
    vector<bool> is a specialisation but should generally be avoided.
    "Vector<bool> Considered Harmful"? Are there any other problems than the obvious performance/efficiency-related?

    ...
    This would probably be the most efficient method if you are picking a medium-sized number of elements and you want to be more efficient in memory space (i.e. not cache a large table).
    Perhaps I didn't understand your cunning plan, but how does this reduce the peak memory consumption, especially after map/set container overhead? If we decide there's enough memory to keep every value in memory at some point, then we might as well go down the random_shuffle route, which would be O(n).

Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured