
# Thread: Generating big random numbers in C

1. Member
Join Date
Oct 2009
Posts
40

## Generating big random numbers in C

I want to generate big random numbers in C (not C++, please). By "big" I mean integers much bigger than the rand() function's limit (32767).

I tried writing this (note: I can't see a "code" tag button in this editor, so I'm not using it):

//****

int randomnumber;
srand( time(NULL) );
randomnumber = (( rand() % 33 ) * ( rand() % 33 ) * ( rand() % 33 ) * ( rand() % 33 ) * ( rand() % 33 )) + 1;

//****

But I have doubts about its randomness quality. There is also another problem: the program can't know the maximum random number it should use before user input, so it may need to use a much smaller maximum random number depending on that input.

Is there a better algorithm to create quality big random numbers in C?
Last edited by AwArEnEsS; February 15th, 2013 at 08:48 PM.

2. ## Re: Generating big random numbers in C

My recommendation is to read this excellent article and come back here with any questions: http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx

gg

3. ## Re: Generating big random numbers in C

Originally Posted by AwArEnEsS
Is there a better algorithm to create quality big random numbers in C?

4. Elite Member
Join Date
May 2009
Posts
2,413

## Re: Generating big random numbers in C

Originally Posted by AwArEnEsS
Is there a better algorithm to create quality big random numbers in C?
If you want better quality than rand() you must use a better random number generator as has been suggested. But if you're satisfied with the statistical properties of rand() it's easy to use rand() to generate bigger numbers with the same quality.

rand() generates uniformly distributed random numbers (each number appears with equal probability) and the bigger numbers should follow the same distribution. You're on the right track by using rand() repeatedly, but unfortunately simply multiplying (or adding) the numbers skews the distribution so it will no longer be uniform. Do this instead,

Code:
```int random =  rand() & 255;
random = random<<8 | (rand() & 255);
random = random<<8 | (rand() & 255);
random = random<<7 | (rand() & 127);

// or crammed

int random = (((rand() & 255)<<8 | (rand() & 255))<<8 | (rand() & 255))<<7 | (rand() & 127);```
This assumes ints are 32 bits and that rand() generates numbers of no fewer than 8 bits (a byte). All in all, 31 bits are extracted from four rand() calls and concatenated to form an int. That will give you a random uniformly distributed number within the full range of a positive 32 bit int.

To change the size you just use the modulo operator. Say you want to limit the random number to be somewhere from 0 up to and including N you add this after the code above,

Code:
`random = random % (N+1);`
Last edited by nuzzle; February 17th, 2013 at 05:15 PM.

5. Member
Join Date
Oct 2009
Posts
40

## Re: Generating big random numbers in C

I was a bit busy and didn't work on programming in last days.

Thank you all for your help,I will try them.

6. Elite Member Power Poster
Join Date
Apr 2000
Location
Belgium (Europe)
Posts
4,626

## Re: Generating big random numbers in C

The problem with using rand() (or any other PRNG function) to generate bigger random numbers than it can really handle on its own is twofold.

- You will probably reduce the periodicity of the PRNG. So instead of repeating the RNG pattern over X, it will repeat over X/2, X/4, X/8....
- The resulting numbers will probably not be properly distributed. It is somewhat unlikely that you will end up generating EVERY possible value in the range; you will likely end up with numbers where some values can never occur, or where some values occur significantly more frequently than others.

These issues may or may not be a problem for you.

So it depends what you really need out of this, but the only "true" way is tailoring an existing PRNG function towards the number of bits you need.
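The "missing values" point above is easy to check by brute force. This sketch (mine, not from the thread, and simplified to two factors instead of the OP's five) enumerates every product that (rand() % 33) * (rand() % 33) can produce and counts how many distinct values are reachable; any prime above 32, for instance, can never appear.

Code:
```c
#include <string.h>

/* Count the distinct values (a % 33) * (b % 33) can take.
   The products range from 0 to 32*32 = 1024, yet many values in
   between (any prime above 32, for instance) are unreachable. */
int count_reachable_products(void)
{
    static char seen[32 * 32 + 1];
    memset(seen, 0, sizeof seen);
    int count = 0;
    for (int a = 0; a <= 32; ++a)
        for (int b = 0; b <= 32; ++b)
            if (!seen[a * b]) {
                seen[a * b] = 1;
                ++count;
            }
    return count;
}
```

The count comes out well below the 1025 values in the range, which is the distribution hole OReubens describes.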

7. ## Re: Generating big random numbers in C

>> That will give you a random uniformly distributed number within the full range of a positive 32 bit int.
That doesn't seem correct to me. For example, to get a number between 0 and 127, rand() would have to return 0 twice (3 times to get "0"). If you see bits being thrown away, or RAND_MAX not being considered, then that should raise a red flag for distribution.

gg

8. Elite Member
Join Date
May 2009
Posts
2,413

## Re: Generating big random numbers in C

Originally Posted by Codeplug
That doesn't seem correct to me. For example, to get a number between 0 and 127, rand() would have to return 0 twice (3 times to get "0"). If you see bits being thrown away, or RAND_MAX not being considered, then that should raise a red flag for distribution.
I don't quite understand the nature of your objection.

I've assumed that 8 uniformly distributed bits can be extracted from each rand() call using a modulo 256 operation. And they approximately will if the random numbers delivered by rand() are drawn from a range reasonably larger than 0-255.

It doesn't matter how high quality the random number is (say it's generated with a Mersenne twister). If a modulo operation is used to limit it, a non-uniformity will sneak into the bit distribution regardless.

I didn't say using rand() to generate big random numbers is perfect, but it's simple and the quality will be fairly good. If rand() is good enough in the first place, it's good enough for bigger random numbers.

9. ## Re: Generating big random numbers in C

>> I don't quite understand the nature of your objection.
>> ... uniformly distributed number within the full range of a positive 32 bit int.
My objection is that it's not a uniform distribution - ie. there is not an equal probability of seeing every value in the range [0, 2147483647). Or even [0, N], when you do "random % (N+1)".

BTW, I'm assuming you meant for the "rand() & 127" byte to be the most-significant byte instead of the least. And the shifting in the "// or crammed" part is wrong.

But yeah, it may be "good enough".

gg

10. Elite Member
Join Date
May 2009
Posts
2,413

## Re: Generating big random numbers in C

Originally Posted by Codeplug
>> I don't quite understand the nature of your objection.
>> ... uniformly distributed number within the full range of a positive 32 bit int.
My objection is that it's not a uniform distribution - ie. there is not an equal probability of seeing every value in the range [0, 2147483647). Or even [0, N], when you do "random % (N+1)".
Well, as I said, it doesn't matter how high quality the random integer is. If you use modulo N to "cap" it, a small non-uniformity will sneak in. But the severity will be limited as long as N is small in relation to the total range of the random numbers. And the non-uniformity is nowhere near what you get by multiplying (or adding) random numbers (like the OP did initially). That totally destroys a uniform distribution, rendering the big random numbers useless. The modulo bias problem is small in comparison.

But to be sure the OP could check RAND_MAX and only use say half of the random bits rand() delivers to create the bigger random number. The lowest value of RAND_MAX I've ever seen is 32767. This corresponds to 15 bits so using 8 bits like I suggested is likely to be fine. But putting in an assert to check it is never wrong of course.
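The assert suggestion can be folded straight into the bit-concatenation code from post #4. A minimal sketch (the function name rand31 is mine), assuming only the standard library:

Code:
```c
#include <assert.h>
#include <stdlib.h>

/* 31 random bits from four rand() calls, with a guard that rand()
   really delivers the 8 bits we extract per call. */
long rand31(void)
{
    assert(RAND_MAX >= 255);     /* need at least 8 bits per call */
    long r = rand() & 255;
    r = r << 8 | (rand() & 255);
    r = r << 8 | (rand() & 255);
    r = r << 7 | (rand() & 127); /* 8+8+8+7 = 31 bits; sign bit stays 0 */
    return r;
}
```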

BTW, I'm assuming you meant for the "rand() & 127" byte to be the most-significant byte instead of the least. And the shifting in the "// or crammed" part is wrong.
I mean to extract the 7 rightmost bits. That will give a total of 8+8+8+7 = 31 random bits and compose a random positive 32 bit int.

How is the "crammed" version wrong?

But yeah, it may be "good enough".
That's right. Optimally you use a random number generator that delivers a random integer within the wanted range. But just lifting some algorithm from the net is risky too. It may not be as good as the author believes or claims, and it may have bugs. At least rand() is time-tested.

It's not that hard to fix the modulo bias problem. This is how it's done in Java. See the nextInt(int n) method of the Random class,

http://docs.oracle.com/javase/7/docs/api/

But Java aside, in my view, if the C application isn't important enough to warrant a commercial generator AND an expert to review the statistics, then using rand() the way I suggested definitely is "good enough". And regardless of how much effort is put in, random numbers generated on digital computers will always be "pseudo" anyway.
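For readers who want that Java-style fix in C: the idea is to reject the few top rand() values that would otherwise map unevenly onto the buckets, then reduce. This is my own sketch of the rejection idea, not code from the thread or from Java:

Code:
```c
#include <stdlib.h>

/* Uniform draw from [0, n) with no modulo bias: discard rand() values
   at or above the largest multiple of n not exceeding RAND_MAX, then
   reduce. (When RAND_MAX + 1 is already a multiple of n this rejects a
   few values unnecessarily, but the result stays uniform.) */
int rand_range(int n)  /* requires 0 < n <= RAND_MAX */
{
    int limit = RAND_MAX - RAND_MAX % n;  /* a multiple of n */
    int r;
    do
        r = rand();
    while (r >= limit);
    return r % n;
}
```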
Last edited by nuzzle; February 21st, 2013 at 05:32 AM.

11. ## Re: Generating big random numbers in C

>> If you use modulo N to "cap" it, a small non-uniformity will sneak in.
This statement is only true under the following assumptions:
- the PRNG returns uniformly distributed numbers within some range, "[0, RAND_MAX]" for rand()
- N is sufficiently smaller than RAND_MAX (you covered this one...)
- for smaller N's, the PRNG should have good randomness in the low-order bits
- the range is a multiple of N, "(RAND_MAX+1) % N == 0"
http://c-faq.com/lib/randrange.html
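The recipe in that FAQ entry is worth showing concretely: instead of rand() % (N+1), which keeps the low-order bits, it divides so the result is built from the high-order bits. A minimal sketch (function name mine):

Code:
```c
#include <stdlib.h>

/* c-faq.com recipe: a value in [0, n) taken from rand()'s high-order
   bits. The divisor RAND_MAX / n + 1 is large enough that the quotient
   can never reach n. */
int rand_range_hibits(int n)  /* n > 0 */
{
    return rand() / (RAND_MAX / n + 1);
}
```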

>> ... so using 8 bits like I suggested is likely to be fine.
Maybe, maybe not. It's not hard to eliminate the "maybe" by utilizing all the bits from rand().

>> [use rand] to create the bigger random number.
Right, rand() can be used to generate a random string of bits of any length. My original objection was the resulting integers taken from those bits are not uniformly distributed. Only the string of 0's and 1's are uniformly distributed, assuming rand() was used properly to generate the string of bits.

>> ... and compose a random positive 32 bit int
My train of thought was that "positive" implies that the most significant bit is always 0. Masking with 127/x7F ensures that the most significant bit in the resulting byte is always 0. So that byte needs to be the most significant byte in the signed integer to ensure the result is positive.
Code:
```int random =  rand() & 255;
random = random<<8 | (rand() & 255);
random = random<<8 | (rand() & 255);
random = random<<7 | (rand() & 127);```
Here, the most significant bit will always be 0 because there are only 31 bit shifts - so a 1 will never be shifted into that position. So the 127 mask is an error since it causes the 8th bit to always be 0 - not random at all

>> How is the "crammed" version wrong?
Only the first 2 bytes of the int are ever set. You overlooked the "<<=" nature of the original code I think.

Here's how I would do "random bits in an int"
Code:
```double uniform_deviate(int seed)
{
    return seed * (1.0 / (RAND_MAX + 1.0));
}

int rand_byte()
{
    return (int)(uniform_deviate(rand()) * 255 + 0.5); // wrong! - just use 256, see post #19
}

int r = rand_byte() << 24 | rand_byte() << 16 | rand_byte() << 8 | rand_byte();

// throw away msb if you want a positive signed int
r &= ~(1 << 31);```
gg
Last edited by Codeplug; February 25th, 2013 at 12:24 PM. Reason: bad code

12. Elite Member
Join Date
May 2009
Posts
2,413

## Re: Generating big random numbers in C

Originally Posted by Codeplug
>> It's not hard to eliminate the "maybe" by utilizing all the bits from rand().
That's not enough. You can only assume all bits are equally likely if RAND_MAX+1 is a power of two. Maybe it is or maybe it isn't. So even if the random number you get from rand() is uniformly distributed, the individual bits may not be for certain. They could be afflicted with a modulo bias.

My train of thought was that "positive" implies that the most significant bit is always 0. Masking with 127/x7F ensures that the most significant bit in the resulting byte is always 0. So that byte needs to be the most significant byte in the signed integer to ensure the result is positive.
Maybe there's some major oversight on my part but I just don't get what you mean. If I do

int x, y;
x = x << 7 | (y & 127);

then x is shifted 7 bits to the left and its 7 rightmost bits are replaced with the 7 rightmost bits of y?

The "crammed" version of my formula is, due to the parentheses, evaluated Horner style. The 8 bits extracted from the leftmost rand() call are shifted 23 bits to the left and just miss the sign bit of a 32 bit integer. The sign bit will be zero, resulting in a positive random integer.

Here's how I would do "random bits in an int"
Well, you're assuming RAND_MAX is no less than 8 bits aren't you? That's what I did too.

How do the calculations you perform on rand() remove a potential modulo bias? From what I can see you have an implicit modulo 256 operation hidden in the conversion from double back to int. In principle you could just as well perform it on the rand() number directly, as I did, skipping the foggy detour over double. I don't see what your method really improves over my suggestion.

I avoided the "mask the sign bit" approach (to turning a signed random number into a positive one) because it's not generally applicable. Say you generate random numbers in the {-1, 0, 1} range and each number has a 1/3 chance of coming up. If you drop the sign you will indeed get numbers in the {0, 1} range, it's just that 1 will come up with 2/3 probability and 0 with 1/3. The distribution gets skewed.
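That {-1, 0, 1} example can be made concrete in a few lines (this tally code is mine, not from the thread):

Code:
```c
#include <stdlib.h>

/* Fold the three equally likely values -1, 0, 1 to non-negative by
   dropping the sign: 1 inherits the hits -1 used to get, so it now
   occurs twice as often as 0. */
void fold_counts(int counts[2])
{
    int values[3] = { -1, 0, 1 };
    counts[0] = counts[1] = 0;
    for (int i = 0; i < 3; ++i)
        counts[abs(values[i])]++;
}
```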

Finally, after the 32 bit positive random int is generated there's still the problem of limiting it to a range. If modulo N is used we're back where we started with the modulo bias problem.
Last edited by nuzzle; February 23rd, 2013 at 11:45 AM.

13. ## Re: Generating big random numbers in C

>> So even if the random number you get from rand() is uniformly distributed the individual bits may not be for certain.
I'm no mathematician, and I couldn't find an authoritative answer online. My rand_byte() does return a uniformly distributed number from [0,255]. If those bytes are concatenated, are the resulting bits in that bit string also uniformly distributed? I assumed yes at first, but now I'm not sure.

>> ... due to the parentheses ...
Doh - missed the paren grouping completely. My bad.

>> Well, you're assuming RAND_MAX is no less than 8 bits aren't you?
No. The code generates a uniform distribution from [0,255] regardless of the value of RAND_MAX - making it portable.

>> I don't see what your method improves over my suggestion really.
The method I've shown preserves the uniform distribution (UD) of rand by mapping the [0, RAND_MAX] int UD, to a [0, 1) double UD. Your code simply takes the low order bits and throws the rest away - which destroys the UD.

The technique is described here: http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx (from post #2)

>> Finally, after the 32 bit positive random int is generated there's still the problem of limiting it to a range. If modulo N is used we're back where we started with the modulo bias problem.
There is no uniform distribution in either of our methods for generating the 31 bit int. The "modulo bias problem" is a problem of breaking the distribution. No distribution, then no problem.

gg
Last edited by Codeplug; February 22nd, 2013 at 11:41 AM.

14. Elite Member
Join Date
May 2009
Posts
2,413

## Re: Generating big random numbers in C

Originally Posted by Codeplug
The code generates a uniform distribution from [0,255] regardless of the value of RAND_MAX - making it portable.
No it doesn't. Let's assume for the sake of discussion that rand() only generates one random bit per call. That bit cannot be used to generate 256 distinct random numbers. In order to do that, rand() must produce no fewer than 8 random bits; that is, RAND_MAX must not be smaller than 255.

The method I've shown preserves the uniform distribution (UD) of rand by mapping the [0, RAND_MAX] int UD, to a [0, 1) double UD. Your code simply takes the low order bits and throws the rest away - which destroys the UD.
That's not correct. As I said, your method makes a detour over double but it still involves an implicit modulo 256 operation. When it finally arrives at the 8 bit integer, the random qualities haven't improved on what they started with. The quality is no better than just using 8 random bits directly from the rand() call as I did. The potential modulo bias is still there.

---

In both these cases you're making a common mistake. You're hoping that some hocus pocus and black magic performed on a random number will improve its "randomness". It generally doesn't. To achieve that you need to take the number into a chaotic domain (as random number generators do) or generate additional random numbers that can then be used to improve on the first one. An example of the latter is shown in the Java link I supplied, where the modulo bias is reduced in nextInt(int n) of Random.
Last edited by nuzzle; February 23rd, 2013 at 12:15 PM.

15. Elite Member Power Poster
Join Date
Apr 2000
Location
Belgium (Europe)
Posts
4,626

## Re: Generating big random numbers in C

Originally Posted by Codeplug
I'm no mathematician, and I couldn't find an authoritative answer online. My rand_byte() does return a uniformly distributed number from [0,255].
It doesn't. It has an under-bias for 0: because of the +0.5 correction, 0 will appear about half as often as any other number.
There is also an under-bias for 255, because the +0.5 shift leaves 255 only half an interval at the top of the range; again, 255 occurs about half as often as any other number.

Using your code, to properly return a uniform 0-255 distribution, rand_byte should be written as
Code:
```int rand_byte()
{
    return (int)(uniform_deviate(rand()) * 256);
}```

Besides, if RAND_MAX is a power of 2 minus 1 (which it is for about 99% of PRNGs) you don't "need" the inefficient detour through floating point.
Solve things in fixed-point integer math by treating the PRNG output as a fixed-point value with an N-bit mantissa (where RAND_MAX = (2^N)-1), in the range 0 to 1 exclusive: [0, 1).
Code:
```// returns a number in the range 0 to max-1
short rng(short max)
{
    return (short)(((long)rand() * max) >> RAND_BITS); // RAND_BITS is 15 for C's rand()
}```
a version that returns a byte is easy enough to work out from that.

I find it strange that so few sources on RNGs explain this fixed-point approach; most go for the inefficient floating-point solution.
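The byte version left as an exercise might look like this (my sketch, not OReubens' code; it uses an integer divide, which reduces to the fixed-point shift whenever RAND_MAX + 1 is a power of two):

Code:
```c
#include <stdlib.h>

/* Scale rand() into [0, 256) using only integer math. Works for any
   RAND_MAX >= 255; when RAND_MAX + 1 is a power of two the division
   is just the fixed-point shift described in the post above. */
int rand_byte_fixed(void)
{
    return (int)(((unsigned long long)rand() * 256) /
                 ((unsigned long long)RAND_MAX + 1));
}
```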

If those bytes are concatenated, are the resulting bits in that bit string also uniformly distributed? I assumed yes at first, but now I'm not sure.
"probably not"
this will get worse as you extract more bits at a time to chunk into the bigger one.
fewer bits at a time will help, but has the adverse effect of lowering the periodicity of the resulting PRNG even more.

edit: Yes, I realise that by definition of "random" it should, but we're dealing with PSEUDO random sequences here. Uniform distribution only counts over a large enough sequence of numbers. Most PRNGs have localised "hotspots" or "artifacts", and when combining those hotspots into a single larger random number, you make the hotspot/artifact worse. This is also why some PRNGs aren't suitable for cryptography, or why some aren't good for a Monte Carlo simulation.

So it "might" work, or might be "good enough" depending on the input PRNG (rand() is just awful), but blanket statements that the resulting larger-number PRNG will work well don't hold.

No. The code generates a uniform distribution from [0,255] regardless of the value of RAND_MAX - making it portable.
Incorrect. If RAND_MAX is less than 8 bits, you cannot ever produce a uniform distribution that has 2^8 unique values. This is elementary math. The "detour" to a double changes nothing in this, the double will at most evaluate to RAND_MAX unique values.

If you COULD... then the whole problem the OP asks about would be pointless: you could just use rand() and assume a new RAND_MAX of whatever size the OP needs to get his bigger range. It works value-wise, it's just not uniform; some values will occur multiple times, some will never occur.

So your code can only work if you want to extract fewer bits at a time than RAND_MAX can hold.

>> I don't see what your method improves over my suggestion really.
The method I've shown preserves the uniform distribution (UD) of rand by mapping the [0, RAND_MAX] int UD, to a [0, 1) double UD. Your code simply takes the low order bits and throws the rest away - which destroys the UD.
Wrong, see above.

Your technique does have the merit of using the HIGH bits rather than the LOW bits, which is better. The high bits are more likely to produce a shortened uniform distribution than the lower bits. Explaining this is beyond the scope of this forum. The short version of the story is that while a PRNG is uniform over a large number of values, there tend to be localized "hotspots" over short ranges.

Ignoring the "low bits" vs "high bits" issue, which you can work around, both modulo and your approach have the same problem, just localized differently. And it manifests itself if N is not a power of 2.

Suppose you had a rand() that returned numbers in the 0-7 range (3 bits). And you wanted a random number in the range 0-2 inclusive
With modulo
0 will result for rand() values 0, 3, 6
1 will result for rand() values 1, 4, 7
2 will result for rand() values 2, 5
The modulo 'clips' part of the rand() range and piles the extra hits onto the lower result values, so some results occur one extra time per pass through the rand() range.

Your solution has the same problem (it biases some resulting values by one extra hit), but instead of biasing the lower range of results, the bias is spread out over the entire range.

Either of the 2 systems could have advantages depending on what you use the rand() for. One method is not 'better' than the other.
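The 3-bit example above can be tabulated mechanically. This sketch (mine) enumerates all eight source values and tallies both mappings; since 8 is not a multiple of 3, each ends up with a 3/3/2 split, just drawn from different source values:

Code:
```c
/* Map a 3-bit generator (values 0..7) into [0, 3) by modulo and by
   fixed-point scaling (r * 3 / 8), counting how often each result
   appears. Neither split can be even, because 8 % 3 != 0. */
void tally_3bit(int by_mod[3], int by_scale[3])
{
    for (int i = 0; i < 3; ++i)
        by_mod[i] = by_scale[i] = 0;
    for (int r = 0; r <= 7; ++r) {
        by_mod[r % 3]++;          /* 0,3,6 -> 0; 1,4,7 -> 1; 2,5 -> 2 */
        by_scale[(r * 3) >> 3]++; /* 0,1,2 -> 0; 3,4,5 -> 1; 6,7 -> 2 */
    }
}
```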
Last edited by OReubens; February 25th, 2013 at 06:18 AM.
