CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 13 of 13

Thread: Finding Doubles

  1. #1
    Join Date
    Oct 2021
    Posts
    15

    Finding Doubles

    If I have a piece of working as per below:
    Code:
    default_random_engine engine
    	{ 
    		static_cast<unsigned int>(time(0)) 
    	};
    
    	uniform_int_distribution<unsigned int> randomInt{ 1, 500 };
    
    	for (int i = 1; i < 100; i++) //first loop
    	{
    		for (int j = 1; j < 50; j++) //second loop
    		{
    			cout << randomInt(engine) << " ";
    		}
    		cout << endl;
    	}
    How do I find doubles generated by the second loop?

  2. #2
    2kaud's Avatar
    2kaud is offline Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: Finding Doubles

    What you mean by 'find doubles'? Can you give an example.
    All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!

    C++23 Compiler: Microsoft VS2022 (17.6.5)

  3. #3
    Join Date
    Feb 2017
    Posts
    677

    Re: Finding Doubles

    Quote Originally Posted by rockz View Post
    How do I find doubles generated by the second loop?
    Maybe you mean duplicates? Do you want to avoid them? One method is to add each new random number to an std::vector after making sure it is not already in there. Another method is to add the random numbers to a data structure that rejects duplicates, for example, an std::set or an std:: unordered_set. This is fast but the random order of the random numbers gets lost. It can be fixed using std::random_shuffle.

    Note that for your first loop to iterate 100 times and the second 50 times, either the loop variables must start at 0, or the < signs must be <=.
    Last edited by wolle; October 31st, 2021 at 01:20 PM.

  4. #4
    Join Date
    Oct 2021
    Posts
    15

    Re: Finding Doubles

    Quote Originally Posted by wolle View Post
    Maybe you mean duplicates? Do you want to avoid them? One method is to add each new random number to an std::vector after making sure it is not already in there. Another method is to add the random numbers to a data structure that rejects duplicates, for example, an std::set or an std:: unordered_set. This is fast but the random order of the random numbers gets lost. It can be fixed using std::random_shuffle.

    Note that for your first loop to iterate 100 times and the second 50 times, either the loop variables must start at 0, or the < signs must be <=.
    Yes duplicates. I don't want to avoid them, but I want set a counter variable to determine how many duplicates are generated.

  5. #5
    Join Date
    Feb 2017
    Posts
    677

    Re: Finding Doubles

    Quote Originally Posted by rockz View Post
    Yes duplicates. I don't want to avoid them, but I want set a counter variable to determine how many duplicates are generated.
    Then I suggest you use an std::unordered_set,

    https://en.cppreference.com/w/cpp/co.../unordered_set

    You insert each new random integer into the unordered_set using the insert function. You can detect duplicates in two ways. Either you check the return value of insert. If it indicates a reject, you know the random number is a duplicate. Or you use the contains function before the insert function to check whether the random number is already there. If it is, it is a duplicate.

    Another more indirect way is to just insert all random numbers into the unordered_set. Say you insert 50 random numbers. Afterwards, you check how many have actually been inserted using the size function. If it says 48 you know there were 2 duplicates.
    Last edited by wolle; November 1st, 2021 at 02:54 AM.

  6. #6
    Join Date
    Oct 2021
    Posts
    15

    Re: Finding Doubles

    I'm actually doing this for demonstration purpose. I need to collect statistics from the output. So my question is, can it be possible to allow the method to run and then traverse through to find out if there are duplicates? And if there are any duplicates (or even triplicates), have a counter variable to update on the number of instances. I know it can be easily achieved using arrays, but I'm trying my best to avoid arrays. I would like to employ as much features as possible from C++11 and beyond.

  7. #7
    Join Date
    Feb 2017
    Posts
    677

    Re: Finding Doubles

    Quote Originally Posted by rockz View Post
    I know it can be easily achieved using arrays, but I'm trying my best to avoid arrays.
    Duplicity is a relative property between two entities. To compare the current entitity in a sequence with previous entities, one must store them somewhere. Unless, of course, duplicity is an intrinsic property of the entities. In that case, each entity knows it will appear X times in the course of the next N entities of the sequence. That is possible to achieve with computer-generated random numbers because they are deterministic. I am, however, not aware of any standard random number generator that offers this property. It would go against the independence rule stating that the current number should have no relation to previous numbers.

    There seems to be a theoretical divide here. Either duplicity is intrinsic to your entities, or you need to store them.
    Last edited by wolle; November 3rd, 2021 at 05:55 AM.

  8. #8
    2kaud's Avatar
    2kaud is offline Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: Finding Doubles

    As a possible starter, perhaps:

    Code:
    #include <random>
    #include <unordered_set>
    #include <iostream>
    
    std::mt19937 engine {std::random_device {}()};
    
    int main() {
    	constexpr unsigned l1sze {100};
    	constexpr unsigned l2sze {50};
    
    	std::uniform_int_distribution<unsigned> randomInt {1, 500};
    	std::unordered_set<unsigned> nos;
    	unsigned nodup {};
    
    	for (int i = 1; i < l1sze; ++i)
    		for (int j = 1; j < l2sze; ++j)
    			nodup += !nos.insert(randomInt(engine)).second;
    
    	std::cout << "There were " << nodup << " duplicate numbers\n";
    }
    All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!

    C++23 Compiler: Microsoft VS2022 (17.6.5)

  9. #9
    Join Date
    Feb 2017
    Posts
    677

    Re: Finding Doubles

    Quote Originally Posted by rockz View Post
    I'm actually doing this for demonstration purpose. I need to collect statistics from the output. So my question is, can it be possible to allow the method to run and then traverse through to find out if there are duplicates?
    Say you draw numbers at random from a range of equally likely numbers. At some point, all numbers have come up at least once. After that, all drawn numbers will be duplicates. With N numbers in the range, that happens after approximately N * ln(N) drawn numbers. See the Coupon collector's problem,

    https://en.wikipedia.org/wiki/Coupon...or%27s_problem

    In your example code, the random range contains N=500 numbers. So after approximately 500 * ln(500) = 3107 drawn numbers on average, all 500 numbers have come up at least once. That is a kind of saturation point or phase shift. From then on, the number of duplicates is not random but deterministic. It is the number of drawn numbers minus the size of the random number range. In your example code, (100-1)*(50-1) = 4851 random numbers are drawn. It exceeds the phase shift point of 3107 with some margin. It means the phase shift is quite likely to take place. And this explains why 2kaud's code very often ends up with 4851-500 = 4351 duplicates.

    My point is that it does not hurt to be aware of the N*ln(N) phase shift when collecting statistics about duplicates. If it continues beyond the phase shift you do not need a simulation to learn the result. It is known beforehand.
    Last edited by wolle; November 3rd, 2021 at 03:18 AM.

  10. #10
    Join Date
    May 2001
    Location
    Germany
    Posts
    1,158

    Re: Finding Doubles

    You could use a std::map<unsigned int, int> with the key the generated number and the value the occurence counter like in this example:
    https://en.cppreference.com/w/cpp/numeric/random

  11. #11
    Join Date
    Oct 2021
    Posts
    15

    Re: Finding Doubles

    Quote Originally Posted by wolle View Post
    Say you draw numbers at random from a range of equally likely numbers. At some point, all numbers have come up at least once. After that, all drawn numbers will be duplicates. With N numbers in the range, that happens after approximately N * ln(N) drawn numbers. See the Coupon collector's problem,

    https://en.wikipedia.org/wiki/Coupon...or%27s_problem
    Thanks for the kind words, really appreciate it. My analogy is not as discussed in the coupoun problem but more of in the Birthday problem, So calculating the probability of finding a double in individual set over mulitple iterations.

    Mathematically it's 50.7% at a size of 23, but i would like to experiment computationally.

  12. #12
    Join Date
    Feb 2017
    Posts
    677

    Re: Finding Doubles

    Quote Originally Posted by rockz View Post
    the Birthday problem
    I posted a solution strategy to this problem quite some time ago now,

    https://forums.codeguru.com/showthre...rthday+problem

    Today, I can just barely understand my post, but it seems to involve hashing .
    Last edited by wolle; November 8th, 2021 at 02:30 AM.

  13. #13
    Join Date
    Oct 2021
    Posts
    15

    Re: Finding Doubles

    Quote Originally Posted by wolle View Post
    I posted a solution strategy to this problem quite some time ago now,

    https://forums.codeguru.com/showthre...rthday+problem

    Today, I can just barely understand my post, but it seems to involve hashing .
    Oh wow!!!. Just Wow!!

    I don't know if anyone managed to solve the problem, but I found a solution done in Pyton on YT. According to the creator of the video, after a 1000 runs, there is a probablity of >50% that there will be a double in each transaction. I found it to be inexact, so I thought to re-code it in cpp.

    I know if I dumped all the data into a <map> it will give me the results I expect, but using <map> may not be the optimal way to achieve the results, minimizing the complexities.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured