
October 31st, 2021, 06:11 AM
#1
Finding Doubles
If I have a piece of working as per below:
Code:
default_random_engine engine
{
static_cast<unsigned int>(time(0))
};
uniform_int_distribution<unsigned int> randomInt{ 1, 500 };
for (int i = 1; i < 100; i++) //first loop
{
for (int j = 1; j < 50; j++) //second loop
{
cout << randomInt(engine) << " ";
}
cout << endl;
}
How do I find doubles generated by the second loop?

October 31st, 2021, 11:39 AM
#2
Re: Finding Doubles
What you mean by 'find doubles'? Can you give an example.
All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only  not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums  and not via private messages!
C++23 Compiler: Microsoft VS2022 (17.2.1)

October 31st, 2021, 12:50 PM
#3
Re: Finding Doubles
Originally Posted by rockz
How do I find doubles generated by the second loop?
Maybe you mean duplicates? Do you want to avoid them? One method is to add each new random number to an std::vector after making sure it is not already in there. Another method is to add the random numbers to a data structure that rejects duplicates, for example, an std::set or an std:: unordered_set. This is fast but the random order of the random numbers gets lost. It can be fixed using std::random_shuffle.
Note that for your first loop to iterate 100 times and the second 50 times, either the loop variables must start at 0, or the < signs must be <=.
Last edited by wolle; October 31st, 2021 at 01:20 PM.

October 31st, 2021, 07:14 PM
#4
Re: Finding Doubles
Originally Posted by wolle
Maybe you mean duplicates? Do you want to avoid them? One method is to add each new random number to an std::vector after making sure it is not already in there. Another method is to add the random numbers to a data structure that rejects duplicates, for example, an std::set or an std:: unordered_set. This is fast but the random order of the random numbers gets lost. It can be fixed using std::random_shuffle.
Note that for your first loop to iterate 100 times and the second 50 times, either the loop variables must start at 0, or the < signs must be <=.
Yes duplicates. I don't want to avoid them, but I want set a counter variable to determine how many duplicates are generated.

November 1st, 2021, 02:34 AM
#5
Re: Finding Doubles
Originally Posted by rockz
Yes duplicates. I don't want to avoid them, but I want set a counter variable to determine how many duplicates are generated.
Then I suggest you use an std::unordered_set,
https://en.cppreference.com/w/cpp/co.../unordered_set
You insert each new random integer into the unordered_set using the insert function. You can detect duplicates in two ways. Either you check the return value of insert. If it indicates a reject, you know the random number is a duplicate. Or you use the contains function before the insert function to check whether the random number is already there. If it is, it is a duplicate.
Another more indirect way is to just insert all random numbers into the unordered_set. Say you insert 50 random numbers. Afterwards, you check how many have actually been inserted using the size function. If it says 48 you know there were 2 duplicates.
Last edited by wolle; November 1st, 2021 at 02:54 AM.

November 1st, 2021, 03:00 AM
#6
Re: Finding Doubles
I'm actually doing this for demonstration purpose. I need to collect statistics from the output. So my question is, can it be possible to allow the method to run and then traverse through to find out if there are duplicates? And if there are any duplicates (or even triplicates), have a counter variable to update on the number of instances. I know it can be easily achieved using arrays, but I'm trying my best to avoid arrays. I would like to employ as much features as possible from C++11 and beyond.

November 1st, 2021, 05:09 AM
#7
Re: Finding Doubles
Originally Posted by rockz
I know it can be easily achieved using arrays, but I'm trying my best to avoid arrays.
Duplicity is a relative property between two entities. To compare the current entitity in a sequence with previous entities, one must store them somewhere. Unless, of course, duplicity is an intrinsic property of the entities. In that case, each entity knows it will appear X times in the course of the next N entities of the sequence. That is possible to achieve with computergenerated random numbers because they are deterministic. I am, however, not aware of any standard random number generator that offers this property. It would go against the independence rule stating that the current number should have no relation to previous numbers.
There seems to be a theoretical divide here. Either duplicity is intrinsic to your entities, or you need to store them.
Last edited by wolle; November 3rd, 2021 at 05:55 AM.

November 1st, 2021, 05:28 AM
#8
Re: Finding Doubles
As a possible starter, perhaps:
Code:
#include <random>
#include <unordered_set>
#include <iostream>
std::mt19937 engine {std::random_device {}()};
int main() {
constexpr unsigned l1sze {100};
constexpr unsigned l2sze {50};
std::uniform_int_distribution<unsigned> randomInt {1, 500};
std::unordered_set<unsigned> nos;
unsigned nodup {};
for (int i = 1; i < l1sze; ++i)
for (int j = 1; j < l2sze; ++j)
nodup += !nos.insert(randomInt(engine)).second;
std::cout << "There were " << nodup << " duplicate numbers\n";
}
All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only  not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums  and not via private messages!
C++23 Compiler: Microsoft VS2022 (17.2.1)

November 2nd, 2021, 02:29 PM
#9
Re: Finding Doubles
Originally Posted by rockz
I'm actually doing this for demonstration purpose. I need to collect statistics from the output. So my question is, can it be possible to allow the method to run and then traverse through to find out if there are duplicates?
Say you draw numbers at random from a range of equally likely numbers. At some point, all numbers have come up at least once. After that, all drawn numbers will be duplicates. With N numbers in the range, that happens after approximately N * ln(N) drawn numbers. See the Coupon collector's problem,
https://en.wikipedia.org/wiki/Coupon...or%27s_problem
In your example code, the random range contains N=500 numbers. So after approximately 500 * ln(500) = 3107 drawn numbers on average, all 500 numbers have come up at least once. That is a kind of saturation point or phase shift. From then on, the number of duplicates is not random but deterministic. It is the number of drawn numbers minus the size of the random number range. In your example code, (1001)*(501) = 4851 random numbers are drawn. It exceeds the phase shift point of 3107 with some margin. It means the phase shift is quite likely to take place. And this explains why 2kaud's code very often ends up with 4851500 = 4351 duplicates.
My point is that it does not hurt to be aware of the N*ln(N) phase shift when collecting statistics about duplicates. If it continues beyond the phase shift you do not need a simulation to learn the result. It is known beforehand.
Last edited by wolle; November 3rd, 2021 at 03:18 AM.

November 4th, 2021, 11:23 AM
#10
Re: Finding Doubles
You could use a std::map<unsigned int, int> with the key the generated number and the value the occurence counter like in this example:
https://en.cppreference.com/w/cpp/numeric/random

November 7th, 2021, 04:54 AM
#11
Re: Finding Doubles
Originally Posted by wolle
Say you draw numbers at random from a range of equally likely numbers. At some point, all numbers have come up at least once. After that, all drawn numbers will be duplicates. With N numbers in the range, that happens after approximately N * ln(N) drawn numbers. See the Coupon collector's problem,
https://en.wikipedia.org/wiki/Coupon...or%27s_problem
Thanks for the kind words, really appreciate it. My analogy is not as discussed in the coupoun problem but more of in the Birthday problem, So calculating the probability of finding a double in individual set over mulitple iterations.
Mathematically it's 50.7% at a size of 23, but i would like to experiment computationally.

November 8th, 2021, 02:24 AM
#12
Re: Finding Doubles
Originally Posted by rockz
the Birthday problem
I posted a solution strategy to this problem quite some time ago now,
https://forums.codeguru.com/showthre...rthday+problem
Today, I can just barely understand my post, but it seems to involve hashing .
Last edited by wolle; November 8th, 2021 at 02:30 AM.

November 8th, 2021, 08:58 PM
#13
Re: Finding Doubles
Originally Posted by wolle
Oh wow!!!. Just Wow!!
I don't know if anyone managed to solve the problem, but I found a solution done in Pyton on YT. According to the creator of the video, after a 1000 runs, there is a probablity of >50% that there will be a double in each transaction. I found it to be inexact, so I thought to recode it in cpp.
I know if I dumped all the data into a <map> it will give me the results I expect, but using <map> may not be the optimal way to achieve the results, minimizing the complexities.
Posting Permissions
 You may not post new threads
 You may not post replies
 You may not post attachments
 You may not edit your posts

Forum Rules

Click Here to Expand Forum to Full Width
