  1. #1
    Join Date
    Mar 2018
    Posts
    3

    Filling an array with random lines from a file

    Hi

    I currently have a program that reads the first 10 lines from a text file and adds them to a vector. How would I instead choose, let's say, 100 random lines from the text file and place those into the vector?

    Code:
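    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    int main() {
    	ifstream file("rockyou.txt");
    	vector<string> arr;
    	// Read up to the first 10 lines; getline keeps lines that contain spaces intact,
    	// whereas operator>> would stop at the first whitespace.
    	string temp;
    	for (int i = 0; i < 10 && getline(file, temp); i++) {
    		arr.push_back(temp);
    	}
    	// Print only what was actually read, in case the file has fewer than 10 lines.
    	for (const auto& line : arr) {
    		cout << line << endl;
    	}
    }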
    Thanks for any help you can provide.

  2. #2
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: Filling an array with random lines from a file

    Do you know how many lines there are in the text file? If you do, then you can number the lines from 0 to n-1, where n is the number of lines, and then generate 100 unique random integers within this range. After that, it becomes a matter of reading line by line and copying the corresponding lines to the vector while discarding the rest.

    If you don't know how many lines there are, then one way is to do a two-pass solution, i.e., first you read the text file line by line to count the number of lines, then you proceed as described above.

  3. #3
    Join Date
    Feb 2017
    Posts
    677

    Re: Filling an array with random lines from a file

    Quote Originally Posted by laserlight
    If you don't know how many lines there are, then one way is to do a two-pass solution
    Character positions may be used instead of line numbers. A certain number of random character positions within the file are generated and kept in sorted order.

    A random character position will most likely fall somewhere within a line, so that line is skipped and the next line is used (if there is no next line because EOF was reached, the first line of the file is used instead). This can all be accomplished using the standard functions of std::istream such as seekg, tellg and getline.

    This should be a fast solution since essentially only the randomly selected lines are read and there's just one pass. It may be somewhat more involved to ensure that lines are unique (if that's wanted).
    Last edited by wolle; March 25th, 2018 at 03:30 AM.

  4. #4
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,825

    Re: Filling an array with random lines from a file

    If the file is small enough to fit entirely into memory, then an alternative is to read the whole file into a vector, randomly shuffle the vector and then use the first n elements as required. Consider:

    Code:
    #include <fstream>
    #include <vector>
    #include <string>
    #include <iostream>
    #include <algorithm>
    #include <random>
    using namespace std;
    
    int main()
    {
    	const string fnam = "rockyou.txt";
    	const size_t lines = 20;
    
    	ifstream ifs(fnam);
    
    	if (!ifs.is_open()) {
    		cout << "Cannot open file " << fnam << endl;
    		return 1;
    	}
    
    	vector<string> fildata;
    
    	for (string l; getline(ifs, l); fildata.push_back(l));
    	shuffle(fildata.begin(), fildata.end(), mt19937((random_device())()));
    
    	vector<string> data(fildata.begin(), fildata.begin() + min(lines, fildata.size()));
    
    	for (const auto& v : data)
    		cout << v << endl;
    }
    The vector fildata contains the whole of the file, which is then randomly shuffled using the Mersenne Twister 19937 random generator. The first 'lines' elements are then copied to the vector data for use, if needed - otherwise the shuffled vector fildata could be used.
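    If the shuffled order of the whole vector is not needed, a possible variation (an assumption on my part, not part of the code above) is std::sample from C++17, which copies a random selection without reordering the source vector. The helper name sampleLines is made up for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: copies up to `wanted` randomly chosen elements of `all`
// without modifying it. With vector iterators, std::sample keeps the selected
// elements in their original relative order.
std::vector<std::string> sampleLines(const std::vector<std::string>& all, std::size_t wanted, std::mt19937& rng)
{
    std::vector<std::string> out;
    out.reserve(std::min(wanted, all.size()));
    std::sample(all.begin(), all.end(), std::back_inserter(out), wanted, rng);
    return out;
}
```

    Because the selection keeps file order, the result would still need a shuffle if a random order of the sample matters.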
    All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!

    C++23 Compiler: Microsoft VS2022 (17.6.5)

  5. #5
    Join Date
    Mar 2018
    Posts
    3

    Re: Filling an array with random lines from a file

    Hi

    Thanks for the reply. The file currently contains over 30 million passwords, so I was aiming to randomly select around a hundred of these each time the program starts.

  6. #6
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,825

    Re: Filling an array with random lines from a file

    Quote Originally Posted by Ryan1590
    Hi

    Thanks for the reply. The file currently contains over 30 million passwords, so I was aiming to randomly select around a hundred of these each time the program starts.
    For what purpose? I hope these passwords are not stored 'plaintext' - but are encrypted!

  7. #7
    Join Date
    Mar 2018
    Posts
    3

    Re: Filling an array with random lines from a file

    Hi

    It's for a program I have been building. Currently it's a brute-force password cracker that attempts to crack any password the user enters. However, I want to expand it so that the cracker attempts to find passwords within the 100-line sample stored in the array, taken from the rockyou.txt file, based on parameters the user can change.

  8. #8
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,825

    Re: Filling an array with random lines from a file

    Quote Originally Posted by Ryan1590
    Hi

    It's for a program I have been building. Currently it's a brute-force password cracker that attempts to crack any password the user enters. However, I want to expand it so that the cracker attempts to find passwords within the 100-line sample stored in the array, taken from the rockyou.txt file, based on parameters the user can change.
    As that is not within the ethics of this forum, I'm closing this thread.
