Determining the number of unique words in a .txt file
Results 1 to 15 of 22

Thread: Determining the number of unique words in a .txt file

  1. #1
    Join Date
    Oct 2008
    Posts
    25

    Determining the number of unique words in a .txt file

    Hey, I have to write a program that reads a text file that contains a list of words, one word per line. I have to store the unique words and count the occurrences of each unique word. When the file is completely read, I have to print the words and the number of occurrences to a text file. The output should be the words in alphabetical order along with the number of times they occur. Then print some statistics to the file:

    I have to use character arrays instead of strings.
    I must use the linear search (something that looks like this)
    Code:
    int search (int list [], int size, int key)
    {
        int pos = 0;
        while (pos < size && list[pos] != key)
            pos++;
        if (pos == size)
            pos = -1;
        return pos;
    }
    to determine if a word is in the array. The array is an array of structures and the key is a char array, so a string comparison must be used.
    The search must be a separate function that returns an integer value. I can't use a for loop, and the function must have only one return statement.

    I'm starting off by trying to read and store the words of the text file. My textbook has code for reading a text file line by line and then displaying it, so I figured I should start with that. Here's the code:
    Code:
    #include <iostream>
    #include <fstream>
    #include <cstdlib>   // needed for exit()
    #include <string>
    using namespace std;
    
    int main()
    {
      string filename = "text.dat";  // put the filename up front
      string line;
      ifstream inFile;
      
      inFile.open(filename.c_str());
    
      if (inFile.fail())  // check for successful open
      {
        cout << "\nThe file was not successfully opened"
    	 << "\n Please check that the file currently exists."
    	 << endl;
        exit(1);
      }
    
      // read and display the file's contents
      while (getline(inFile,line))
        cout << line << endl;
    
      inFile.close(); 
    
      cin.ignore();  // this line is optional
    
      return 0;
    }
    Here's my code split into two functions; however, it does not compile. I get an error at the while statement:

    Code:
    #include <iostream>
    #include <fstream>
    #include <cstdlib>
    #include <string>
    #include <iomanip>
    using namespace std;
    
    void displayFile( char []);
    
    void main ()
    {
    	int const wordLength = 21;
    	int const Num = 101;
    	int const fileSize = 255;
    	char filename[fileSize];
    	
    	cout << "Please enter the name of the file you wish to open: "<< endl;
    	cin.getline(filename,fileSize);
    	
    	displayFile (filename);
    	cin.ignore();
    }
    
    void displayFile (char fileName[] )
    {
        ifstream inFile;
    	
        char line [101];
    	
    	inFile.open(fileName);
    
     while (getline(inFile, line))
        cout << line << endl;
    
    	inFile.close(); 
    
    	return 0;
    }
    My code is nearly identical except that I use character arrays instead of strings, so I'm not sure why it's not compiling.
    Also, is this a good way to start off the program? Or should I try something else?

  2. #2
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Fairfax, VA
    Posts
    10,885

    Re: Determining the number of unique words in a .txt file

    getline(inFile, line) doesn't exist for character arrays; that's a free function overloaded for std::string. The one you want for a char array is the stream's member function, which looks like inFile.getline(line, length). Reference here:
    http://www.cplusplus.com/reference/i...m/getline.html

    I have to use character arrays instead of strings.
    I must use the linear search (something that looks like this)
    Shame, without these restrictions you could write the program in about 10 lines using a std::map.

  3. #3
    Join Date
    Nov 2003
    Posts
    1,405

    Re: Determining the number of unique words in a .txt file

    The simplest solution is to use a counter for each word.

    The easiest way to implement this is with an associative array (a map). You simply match,

    word -> counter

    So you walk through all the words and increment the counter for each occurrence of a word using a map.

  4. #4
    Join Date
    Oct 2008
    Posts
    25

    Re: Determining the number of unique words in a .txt file

    Thanks. However, I can't use maps or vectors. I'm still having trouble storing the words I find in the text file. Here's my function:
    Code:
    void displayFile (char fileName[], words array[] )
    {
    	int i = 0;
        ifstream inFile;
    	
        char line [101];
    	
    	inFile.open(fileName);
    
     while (inFile.getline(line,101))
     {   
    	cout << line << endl;
        array[i].word = line;
        i++;
     }   
    	inFile.close(); 
    
    }
    Where array is an array of structs, each made up of a character array "word" and an integer "count". Any help on storing the words and the number of words in the array would be helpful.
    Last edited by matt_570; December 3rd, 2008 at 08:02 PM. Reason: New post

  5. #5
    Join Date
    Nov 2003
    Posts
    1,405

    Re: Determining the number of unique words in a .txt file

    Quote: Originally posted by matt_570
    Thanks. However, I can't use maps or vectors.
    Yes you can. You may have to implement them yourself though.

  6. #6
    Join Date
    Oct 2008
    Posts
    25

    Re: Determining the number of unique words in a .txt file

    I might be using the wrong words, but I can't use strings at all; I need to use character arrays instead. So I can't use std::map or any std::
    I'm not sure what std::map does. I'd never heard of maps (they're not in my textbook at all). The first time I heard of them was when asking this question on another board.

  7. #7
    Join Date
    Nov 2003
    Posts
    1,405

    Re: Determining the number of unique words in a .txt file

    Quote: Originally posted by matt_570
    I might be using the wrong words, but I can't use strings at all; I need to use character arrays instead. So I can't use std::map or any std::
    I'm not sure what std::map does. I'd never heard of maps (they're not in my textbook at all). The first time I heard of them was when asking this question on another board.
    Well, then listen to what I already told you.

    You need to be able to associate each word with a counter.

    It's this association,

    word -> counter

    Now implement it. Are you stupid or what?

  8. #8
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Fairfax, VA
    Posts
    10,885

    Re: Determining the number of unique words in a .txt file

    Perhaps you'd better show us what you're passing it for the array parameter. As you have it you'll be limited by a maximum number of words----are you given such an allowance? Or is the teacher expecting you to use a dynamic-sized array here?

  9. #9
    Join Date
    Oct 2008
    Posts
    25

    Re: Determining the number of unique words in a .txt file

    Quote: Originally posted by _uj
    Well, then listen to what I already told you.

    You need to be able to associate each word with a counter.

    It's this association,

    word -> counter

    Now implement it. Are you stupid or what?
    Thanks for the encouragement. I bet you never had to take classes on this stuff, you just knew it.
    I'm trying to associate line 1 with array[0].count, where count is an integer, but I'm running into the same problem when trying to store the line in a character array. I get a syntax error ("=": left operand must be an l-value).

    The maximum number of words is 100, with a maximum word length of 20.
    Here's my whole code:

    Code:
    #include <iostream>
    #include <fstream>
    #include <cstdlib>
    #include <string>
    #include <iomanip>
    using namespace std;
    
    	struct words
    	{
    		char word[21];
    		int count;	
    	};
    
    void displayFile( char [], words []);
    
    void main ()
    {
    	int const wordLength = 21;
    	int const Num = 101;
    	int const fileSize = 255;
    	char filename[fileSize];
    	
    	cout << "Please enter the name of the file you wish to open: "<< endl;
    	cin.getline(filename,fileSize);
    	
    
    	
    	words array[101];
    	
    	displayFile (filename, array);
    	cin.ignore();
    }
    
    void displayFile (char fileName[], words array[] )
    {
    	int i = 0;
        ifstream inFile;
        char line [101];
    	
    	inFile.open(fileName);
    
     while (inFile.getline(line,101))
     {   
    	cout << line << endl;
        line = array[i].word;
        i++;
     }   
    	inFile.close(); 
    
    }

  10. #10
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Fairfax, VA
    Posts
    10,885

    Re: Determining the number of unique words in a .txt file

    Code:
        line = array[i].word;
    Two problems here: It's left-right swapped (assignment goes from right to left), and arrays are not assignable in this way. You should be using strncpy() here, with n = 20 and an explicit "array[i].word[20] = 0" statement *just* to make sure that the thing is zero-terminated. Your code will of course break (but not horribly) if any line contains more than 20 characters, or if there are more than 101 lines.

    "Breaking" by dropping some of the input is far preferable to "breaking" by overwriting random memory, of course. So always code with that in mind. One of the many reasons why std::string is awesome is because it doesn't have such limitations.
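
    A minimal sketch of that copy, assuming the words struct from the previous post; storeWord is a hypothetical helper name, and setting count to 1 is an assumption about how a newly stored word would be initialised.

```cpp
#include <cstring>

const int WORD_SIZE = 21; // 20 characters plus the terminating '\0'

struct words
{
    char word[WORD_SIZE];
    int count;
};

// Hypothetical helper: copies `line` into entry.word, dropping anything
// past the 20th character instead of writing beyond the array.
void storeWord(words& entry, const char line[])
{
    std::strncpy(entry.word, line, WORD_SIZE - 1);

    // strncpy does not add a terminator when it truncates, so set it explicitly.
    entry.word[WORD_SIZE - 1] = '\0';

    entry.count = 1; // assumption: a freshly stored word has been seen once
}
```

    Inside the read loop this would replace the assignment with storeWord(array[i], line).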

  11. #11
    GCDEF is offline Elite Member Power Poster
    Join Date
    Nov 2003
    Posts
    11,980

    Re: Determining the number of unique words in a .txt file

    Quote: Originally posted by _uj

    Now implement it. Are you stupid or what?
    That's not appropriate.

  12. #12
    Join Date
    Oct 2008
    Posts
    25

    Re: Determining the number of unique words in a .txt file

    Thank you, that worked. I'm now trying to use strncmp to find out the number of unique words (or lines) in the text file. I tried a couple of things, but none seemed to work. I'm given these guidelines:


    You must use the linear search algorithm to determine if a word is in the array. Remember that the array is an array of structures and that the key is a string (char array), so a string comparison must be used. The search task should be a separate function.
    The search must be a separate function that returns an integer value. Do not use a for loop, and the function must have only one return statement.

    Here's the instructor's linear search:
    Code:
    int search (int list [], int size, int key)
    {
        int pos = 0;
        while (pos < size && list[pos] != key)
            pos++;
        if (pos == size)
            pos = -1;
        return pos;
    }
    I'm really having trouble on this, any help would be appreciated.

  13. #13
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Fairfax, VA
    Posts
    10,885

    Re: Determining the number of unique words in a .txt file

    It seems like that function would work just fine, with the type of list altered and the != check replaced by the appropriate strncmp call.

  14. #14
    Join Date
    Oct 2008
    Posts
    25

    Re: Determining the number of unique words in a .txt file

    I'm running into the same problem as I did before. My array is an array of structures consisting of a character array and an integer.

    Code:
    int search (int list [], int size, int key)
    {
        int pos = 0;
        while (pos < size && list[pos] != key)
            pos++;
        if (pos == size)
            pos = -1;
        return pos;
    }
    I assume the integer in the struct is size; he tells me the array is an array of structs and that the key is the character array. So it seems like I'm supposed to be comparing an array of structs to an array of characters. I tried to code it:
    Code:
        int pos = 0;
        while (pos < 8 && (strncmp(array[pos].words, array[pos])  != 0))
            pos++;
        if (pos == size)
            pos = -1;
        cout<< pos<< endl;
    but I get a syntax error. Like before, I'm running into problems with the comparison.

  15. #15
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Fairfax, VA
    Posts
    10,885

    Re: Determining the number of unique words in a .txt file

    Those three parameters are:
    list, which is your array of structs. (Type will be whatever your struct is called with []).
    size, which is the length of that array. (Type will be int.)
    key, which is what you're looking for. (Type will be char[]).

    The teacher's code compares something in list against key. Yours does not.
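
    Put together, a sketch of the adapted search might look like the following. It assumes the words struct posted earlier, uses strcmp rather than strncmp on the assumption that every stored word and every key is null-terminated, and adds a hypothetical recordWord helper to show how the returned position could drive the counting. Keys are assumed to fit in 20 characters.

```cpp
#include <cstring>

const int WORD_SIZE = 21;  // 20 characters plus the terminating '\0'
const int MAX_WORDS = 100; // maximum number of words stated in the thread

struct words
{
    char word[WORD_SIZE];
    int count;
};

// Linear search with a while loop and a single return statement.
// Returns the index of the entry whose word equals key, or -1 if absent.
int search(const words list[], int size, const char key[])
{
    int pos = 0;
    while (pos < size && std::strcmp(list[pos].word, key) != 0)
        pos++;
    if (pos == size)
        pos = -1;
    return pos;
}

// Hypothetical helper: counts one occurrence of key. If the word is new and
// there is room, it is appended with a count of 1. Returns the updated
// number of entries in use.
int recordWord(words list[], int used, const char key[])
{
    int pos = search(list, used, key);
    if (pos != -1)
    {
        list[pos].count++;
    }
    else if (used < MAX_WORDS)
    {
        std::strncpy(list[used].word, key, WORD_SIZE - 1);
        list[used].word[WORD_SIZE - 1] = '\0';
        list[used].count = 1;
        used++;
    }
    return used;
}
```

    Calling recordWord once per line read leaves the number of unique words in the returned value and the per-word totals in each count field.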
