Generating big random numbers in C

I want to generate big random numbers in C (not C++ please). By "big" I mean integers much bigger than the 32767 limit of the rand() function (seeded with srand(time(NULL))).

I tried writing this (note: I am not able to see a "code" tag button in this editor, so I am not using it):

//****

int randomnumber;

srand( time(NULL) );

randomnumber = ((rand() % 33) * (rand() % 33) * (rand() % 33) * (rand() * 33) * (rand() % 33)) + 1;

//****

But I have doubts about its randomness quality. There is also another problem: the program can't know the maximum random number it should use before user input, so it may need to fall back to a much smaller maximum random number depending on what the user enters.

Is there a better algorithm to create quality big random numbers in C?

Re: Generating big random numbers in C

My recommendation is to read this excellent article and come back here with any questions: http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx

gg

Re: Generating big random numbers in C

Quote:

Originally Posted by

**AwArEnEsS**
Is there a better algorithm to create quality big random numbers in C?

How about this? http://www.cs.hmc.edu/~geoff/mtwist.html

Re: Generating big random numbers in C

Quote:

Originally Posted by

**AwArEnEsS**
Is there a better algorithm to create quality big random numbers in C?

If you want better quality than rand() you must use a better random number generator as has been suggested. But if you're satisfied with the statistical properties of rand() it's easy to use rand() to generate bigger numbers with the same quality.

rand() generates uniformly distributed random numbers (each number appears with equal probability) and the bigger numbers should follow the same distribution. You're on the right track by using rand() repeatedly, but unfortunately simply multiplying (or adding) the numbers skews the distribution so it's no longer uniform. Do this instead,

Code:

int random = rand() & 255;

random = random<<8 | (rand() & 255);

random = random<<8 | (rand() & 255);

random = random<<7 | (rand() & 127);

// or crammed

int random = (((rand() & 255)<<8 | (rand() & 255))<<8 | (rand() & 255))<<7 | (rand() & 127);

This assumes ints are 32 bits and that rand() generates numbers no smaller than 8 bits (a byte). All in all, 31 bits are extracted from four rand() calls and concatenated to form an int. That will give you a uniformly distributed random number within the full range of a positive 32 bit int.

To change the size you just use the modulo operator. Say you want to limit the random number to somewhere from 0 up to and including N; you add this after the code above,

Code:

random = random % (N+1);
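Putting the pieces together, a self-contained sketch of this approach might look as follows (rand31 and rand31_range are names invented for this example; it assumes 32-bit ints and RAND_MAX >= 255):

```c
#include <stdlib.h>

/* Concatenate 8+8+8+7 bits from four rand() calls into a
 * non-negative 31-bit int, exactly as described above. */
static int rand31(void)
{
    int r = rand() & 255;
    r = r << 8 | (rand() & 255);
    r = r << 8 | (rand() & 255);
    r = r << 7 | (rand() & 127);
    return r;
}

/* Limit the result to [0, n] as suggested; note this carries the
 * (small) bias that comes with capping by remainder. */
static int rand31_range(int n)
{
    return rand31() % (n + 1);
}
```

Seed once with srand(time(NULL)) before use; the quality is exactly that of the underlying rand().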

Re: Generating big random numbers in C

I was a bit busy and didn't work on programming in the last few days.

Thank you all for your help, I will try them.

Re: Generating big random numbers in C

The problem with using rand() (or any other PRNG function) to generate bigger random numbers than it can really handle on its own is twofold:

- You will probably reduce the periodicity of the PRNG. So instead of the RNG pattern repeating over X, it will repeat over X/2, X/4, X/8, ...

- The resultant numbers will probably not be properly distributed. It is somewhat unlikely that you will end up generating EVERY possible value in the range; you will likely end up with some values that can never occur, or with some numbers occurring significantly more frequently than others.

These issues may or may not be a problem for you.

So it depends what you really need out of this, but the only "true" way is tailoring an existing PRNG function towards the number of bits you need.
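The distribution problem can be seen directly with a small experiment: histogram the product of two six-valued rand() draws. Products such as 7 or 13 can never occur, and 12 (four factorizations) turns up far more often than 25 (one) -- nowhere near uniform. A sketch only; product_histogram is a name invented for this example.

```c
#include <stdlib.h>

/* Count how often each product of two "dice" built from rand()
 * occurs. Products lie in [1, 36], so counts needs 37 slots. */
static void product_histogram(long counts[37], int trials)
{
    for (int i = 0; i < trials; i++) {
        int p = (rand() % 6 + 1) * (rand() % 6 + 1);
        counts[p]++;
    }
}
```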

Re: Generating big random numbers in C

>> **That will give you a uniformly distributed random number within the full range of a positive 32 bit int.**

That doesn't seem correct to me. For example, to get a number between 0 and 127, rand() would have to return 0 twice (3 times to get "0"). If you see bits being thrown away, or RAND_MAX not being considered, then that should raise a red flag for distribution.

gg

Re: Generating big random numbers in C

Quote:

Originally Posted by

**Codeplug**
That doesn't seem correct to me. For example, to get a number between 0 and 127, rand() would have to return 0 twice (3 times to get "0"). If you see bits being thrown away, or RAND_MAX not being considered, then that should raise a red flag for distribution.

I don't quite understand the nature of your objection.

I've assumed that 8 uniformly distributed bits can be extracted from each rand() call using a modulo 256 operation. And they approximately will be, if the random numbers delivered by rand() are drawn from a range reasonably larger than 0-255.

It doesn't matter how high quality the random number is (say it's generated with a Mersenne twister). If a modulo operation is used to limit it, a non-uniformity will sneak into the bit distribution regardless.

I didn't say using rand() to generate big random numbers is perfect, but it's simple and the quality will be fairly good. If rand() is good enough in the first place, it's good enough for bigger random numbers.

Re: Generating big random numbers in C

>> **I don't quite understand the nature of your objection.**

>> **... **__uniformly distributed number__ within the full range of a positive 32 bit int.

My objection is that it's not a uniform distribution - ie. there is *not* an equal probability of seeing every value in the range [0, 2147483647). Or even [0, N], when you do "random % (N+1)".

BTW, I'm assuming you meant for the "rand() & 127" byte to be the most-significant byte instead of the least. And the shifting in the "// or crammed" part is wrong.

But yeah, it may be "good enough".

gg

Re: Generating big random numbers in C

Quote:

Originally Posted by

**Codeplug**
>> **I don't quite understand the nature of your objection.**

>> **... **__uniformly distributed number__ within the full range of a positive 32 bit int.

My objection is that it's not a uniform distribution - ie. there is *not* an equal probability of seeing every value in the range [0, 2147483647). Or even [0, N], when you do "random % (N+1)".

Well, as I said, it doesn't matter how high quality the random integer is. If you use modulo N to "cap" it, a small non-uniformity will sneak in. But the severity will be limited as long as N is small in relation to the total range of the random numbers. And the non-uniformity is nowhere near what you will get by multiplying (or adding) random numbers (like the OP did initially). That totally destroys the uniform distribution, rendering the big random numbers useless. The modulo bias problem is small in comparison.

But to be sure, the OP could check RAND_MAX and only use, say, half of the random bits rand() delivers to create the bigger random number. The lowest value of RAND_MAX I've ever seen is 32767. That corresponds to 15 bits, so using 8 bits like I suggested is likely to be fine. But putting in an assert to check it is never wrong, of course.
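The check could be sketched like this (rand_bits is a helper name invented here; it assumes RAND_MAX is one less than a power of two, as it commonly is, and otherwise overstates the number of fully uniform bits):

```c
#include <stdlib.h>

/* Count how many bits RAND_MAX spans, so a caller can verify that
 * extracting 8 bits per rand() call is safe. */
static int rand_bits(void)
{
    int bits = 0;
    unsigned long m = RAND_MAX;
    while (m > 0) {
        bits++;
        m >>= 1;
    }
    return bits;
}
```

The C standard guarantees RAND_MAX >= 32767, so rand_bits() is at least 15 on a conforming library; an assert(rand_bits() >= 8) then documents the requirement in the calling code.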

Quote:

BTW, I'm assuming you meant for the "rand() & 127" byte to be the most-significant byte instead of the least. And the shifting in the "// or crammed" part is wrong.

I meant to extract the 7 rightmost bits. That gives a total of 8+8+8+7 = 31 random bits, composing a random positive 32 bit int.

How is the "crammed" version wrong?

Quote:

But yeah, it may be "good enough".

That's right. Optimally you use a random number generator that delivers a random integer within the wanted range. But just lifting some algorithm from the net is risky too. It may not be as good as the author believes or claims, and it may have bugs. At least rand() is time-tested.

It's not that hard to fix the modulo bias problem. This is how it's done in Java. See the nextInt(int n) method of the Random class,

http://docs.oracle.com/javase/7/docs/api/

But Java aside, in my view, if the C application isn't important enough to warrant a commercial generator AND to employ an expert to review the statistics, then using rand() the way I suggested definitely is "good enough". And regardless of how much effort is put in, random numbers generated on digital computers will always be "pseudo" anyway.
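Translated to C, the rejection idea behind Java's Random.nextInt(n) could be sketched as follows (rand_nomodbias is an invented name; it assumes rand() itself is uniform on [0, RAND_MAX] and that 0 < n <= RAND_MAX):

```c
#include <stdlib.h>

/* Whenever a draw lands in the incomplete final "bucket" of size n,
 * throw it away and draw again, so every value in [0, n) stays
 * equally likely. r - val is the start of r's bucket; the bucket is
 * complete only if it fits entirely below RAND_MAX. */
static int rand_nomodbias(int n)
{
    int r, val;
    do {
        r = rand();
        val = r % n;
    } while (r - val > RAND_MAX - (n - 1));  /* r fell in the short tail */
    return val;
}
```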

Re: Generating big random numbers in C

>> **If you use modulo N to "cap" it, **__a small non-uniformity will sneak in__.

This statement is only true under the following assumptions:

- the PRNG returns uniformly distributed numbers within some range, "[0, RAND_MAX]" for rand()

- N is sufficiently smaller than RAND_MAX (*you covered this one...*)

- for smaller N's, the PRNG should have good randomness in the low-order bits

- the range is a multiple of N, "(RAND_MAX+1) % N == 0"

http://c-faq.com/lib/randrange.html
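The FAQ entry linked above suggests dividing instead of taking the remainder, so the high-order bits are used. A hedged sketch (rand_range_div is a name made up here):

```c
#include <stdlib.h>

/* Range reduction using the high-order bits, in the style of the
 * c-faq entry: divide instead of computing rand() % n. Still only
 * approximately uniform unless (RAND_MAX+1) is a multiple of n, but
 * it avoids relying on the low-order bits, which are weak in some
 * rand() implementations. */
static int rand_range_div(int n)   /* returns a value in [0, n) */
{
    return rand() / (RAND_MAX / n + 1);
}
```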

>> **... so using 8 bits like I suggested is likely to be fine.**

Maybe, maybe not. It's not hard to eliminate the "maybe" by utilizing all the bits from rand().

>> **[use rand] to create the bigger random number.**

Right, rand() can be used to generate a random string of bits of any length. My original objection was the resulting integers taken from those bits are not uniformly distributed. Only the string of 0's and 1's are uniformly distributed, assuming rand() was used properly to generate the string of bits.

>> **... and compose a random positive 32 bit int**

My train of thought was that "positive" implies that the most significant bit is always 0. Masking with 127/x7F ensures that the most significant bit in the resulting byte is always 0. So that byte needs to be the most significant byte in the signed integer to ensure the result is positive.

Code:

int random = rand() & 255;

random = random<<8 | (rand() & 255);

random = random<<8 | (rand() & 255);

random = random<<7 | (rand() & 127);

Here, the most significant bit will always be 0 because there are only 31 bit shifts - so a 1 will never be shifted into that position. So the 127 mask is an error since it causes the 8th bit to always be 0 - not random at all :)

>> **How is the "crammed" version wrong?**

Only the first 2 bytes of the int are ever set. You overlooked the "<<=" nature of the original code I think.

Here's how I would do "random bits in an int"

Code:

double uniform_deviate(int seed)

{

return seed * (1.0 / (RAND_MAX + 1.0));

}

int rand_byte()

{

return (int)(uniform_deviate(rand()) * 255 + 0.5); **// wrong! - just use 256, see post #19**

}

int r = rand_byte() << 24 | rand_byte() << 16 | rand_byte() << 8 | rand_byte();

// throw away msb if you want a positive signed int

r &= ~(1 << 31);

gg

Re: Generating big random numbers in C

Quote:

Originally Posted by

**Codeplug**

It's not hard to eliminate the "maybe" by utilizing all the bits from rand().

That's not enough. You can only assume all bits are equally likely if RAND_MAX+1 is a power of two. Maybe it is, or maybe it isn't. So even if the random number you get from rand() is uniformly distributed, the individual bits may not be for certain. They could be afflicted with a modulo bias.

Quote:

My train of thought was that "positive" implies that the most significant bit is always 0. Masking with 127/x7F ensures that the most significant bit in the resulting byte is always 0. So that byte needs to be the most significant byte in the signed integer to ensure the result is positive.

Maybe there's some major oversight on my part but I just don't get what you mean. If I do

int x, y;

x = x<<7 | (y & 127);

then x is shifted 7 bits to the left and its 7 rightmost bits are replaced with the 7 rightmost bits of y?

The "crammed" version of my formula is, due to the parentheses, evaluated Horner style. The 8 bits extracted from the leftmost rand() call end up shifted 23 bits to the left and just miss the sign bit of a 32 bit integer. The sign bit will be zero, resulting in a positive random integer.

Quote:

Here's how I would do "random bits in an int"

Well, you're assuming RAND_MAX is no less than 8 bits, aren't you? That's what I did too.

How do the calculations you perform on rand() remove a potential modulo bias? From what I can see, you have an implicit modulo 256 operation hidden in the conversion from double back to int. In principle you could just as well perform it on the rand() number directly, as I did, skipping the foggy detour over double. I don't really see what your method improves over my suggestion.

I avoided the "masking the sign bit" approach (to turning a signed random number into a positive one) because it's not generally applicable. Say you generate random numbers in the {-1, 0, 1} range, and each number has a 1/3 chance of coming up. If you "mask the sign bit" you will indeed get numbers in the {0, 1} range; it's just that 1 will come up with 2/3 probability and 0 with 1/3. The distribution got skewed.

Finally, after the 32 bit positive random int is generated there's still the problem of limiting it to a range. If modulo N is used we're back where we started with the modulo bias problem.

Re: Generating big random numbers in C

>> **So even if the random number you get from rand() is uniformly distributed the individual bits may not be for certain.**

I'm no mathematician, and I couldn't find an authoritative answer online. My rand_byte() does return a uniformly distributed number from [0,255]. If those bytes are concatenated, are the resulting bits in that bit string also uniformly distributed? I assumed yes at first, but now I'm not sure.

>> **... due to the parentheses ...**

Doh - missed the paren grouping completely. My bad.

>> **Well, you're assuming RAND_MAX is no less than 8 bits aren't you?**

No. The code generates a uniform distribution from [0,255] regardless of the value of RAND_MAX - making it portable.

>> **I don't see what your method improves over my suggestion really.**

The method I've shown preserves the uniform distribution (UD) of rand by mapping the [0, RAND_MAX] int UD, to a [0, 1) double UD. Your code simply takes the low order bits and throws the rest away - which destroys the UD.

The technique is described here: http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx (from post #2)

>> **Finally, after the 32 bit positive random int is generated there's still the problem of limiting it to a range. If modulo N is used we're back where we started with the modulo bias problem.**

There is __no uniform distribution__ in either of our methods for generating the 31 bit int. The "modulo bias problem" is a problem of breaking the distribution. No distribution, then no problem.

gg

Re: Generating big random numbers in C

>> **If those bytes are concatenated, are the resulting bits in that bit string also uniformly distributed?**

Got an answer from sci.stat.math - https://groups.google.com/forum/#!to..._-E/discussion

Quote:

If the integers are uniform on [0, 2^n - 1], where n is a positive integer, then YES.

That makes sense.

I should also point out that even the uniform_deviate() method isn't *perfectly uniform* in all cases:

https://groups.google.com/d/msg/comp...A/m7CckPudJb8J

For reference, here's the "recommended" code for mapping the [0, RAND_MAX] uniform distribution to a [0, N) distribution that maintains *perfect* uniformity (where N <= RAND_MAX - so it doesn't help the OP, but it's good to know):

https://groups.google.com/d/msg/comp...A/WevBqpEv7QEJ
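One common shape for such a perfectly uniform [0, N) mapping (a sketch under the same N <= RAND_MAX restriction; rand_uniform is a name invented here, not necessarily the code behind the link):

```c
#include <stdlib.h>

/* Compute the largest multiple of n that fits in the rand() range
 * and reject draws beyond it; the surviving draws fill every
 * remainder class equally often. */
static int rand_uniform(int n)
{
    unsigned long usable = ((unsigned long)RAND_MAX + 1) / n * n;
    int r;
    do {
        r = rand();
    } while ((unsigned long)r >= usable);
    return r % n;  /* exactly uniform: usable is a multiple of n */
}
```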

gg

Re: Generating big random numbers in C

Quote:

Originally Posted by

**Codeplug**
The code generates a uniform distribution from [0,255] regardless of the value of RAND_MAX - making it portable.

No it doesn't. Let's assume for the sake of discussion that rand() only generates one random bit per call. This bit cannot be used to generate 256 random numbers. In order to do that, rand() must produce no fewer than 8 random bits. That is, RAND_MAX must not be smaller than 255.

Quote:

The method I've shown preserves the uniform distribution (UD) of rand by mapping the [0, RAND_MAX] int UD, to a [0, 1) double UD. Your code simply takes the low order bits and throws the rest away - which destroys the UD.

That's not correct. As I said, your method makes a detour over double but it still involves an implicit modulo 256 operation. When it finally arrives at the 8 bit integer, the random qualities haven't improved on what it started with. The quality is no better than just using 8 random bits directly from the rand() call as I did. The potential modulo bias is still there.

---

In both these cases you're making a common mistake. You're hoping that some hocus-pocus and black magic performed on a random number will improve its "randomness". It generally doesn't. To achieve that you need to take the number into a chaotic domain (as random number generators do) or generate additional random numbers that can then be used to improve on the first one. An example of the latter is shown in the Java link I supplied, where the modulo bias is reduced in nextInt(int n) of Random.