-
Count word frequency with linked list
I'm supposed to code a program in C++ that queries a user for a file name. That file is then to be scanned and each word is to be placed into a node within a linked list. If the word is already in the list the counter of the respective node should increase by one and there should be no duplicate nodes.
I have coded the program to the best of my ability and got it to compile, however, it crashes when I run it... I've been scratching my head over it for hours and cant seem to get anywhere with it.
Code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
using namespace std;
struct node
{ char word[50]; // word being inputted
int count; // word frequency
node *next; // Pointer to next node
};
node *start = NULL;
void add_node(char nWord[50]);
int main()
{ char cWord[50];
char cFile[30];
FILE *pfTextfile;
node *temp; //temporary node to store data
node *current; //node to traverse with
int TravFlag = 0; //Flag to help traverse
cout << "Enter the file you wish to have searched:\n";
cin >> cFile;
// open file
pfTextfile = fopen(cFile, "r");
while(!feof(pfTextfile))
{ TravFlag=0;
//empty cWord
for (int i=0; i<50; i++)
{ cWord[i]='\0';
}
current = start; //node that points to beginning
// make temp first node
if (start == NULL)
{start = temp;
start->word[50]=cWord[50];
start->count=1;
}
else if(start!=NULL) //if theres already a node
{ while(TravFlag==0)//Loop to Traverse
{if(current->word==cWord)//If current node is same word
{current->count=current->count+1;//count increases by 1
TravFlag=1; //set flag to end loop
}
else if(current->word!=cWord&¤t->next==NULL)
{//end of the list
add_node(cWord);
TravFlag=1;
}
else
current = current->next; //move to next node
}
}
}
//Print em
current=start;
do
{if(current==NULL)
cout<<"End of list."<<endl;
else
{ cout<< "The word "<<current->word<<" appears "<<current->count<<" times."<<endl;
current=current->next;
}
}
while(current!=NULL);
}
void add_node(char nWord[50])
{ node *temp, *temp2; // Temporary pointers
temp = new node;
temp->word[50]=nWord[50];
temp->count=1;
temp->next = NULL;
// Set up link to this node
if (start == NULL)
start = temp;
else
{ temp2 = start;
while (temp2->next != NULL)
{ temp2 = temp2->next; // Move to next node
}
temp2->next = temp;
}
}
I'm pretty new to this stuff so any help at all would be fantastic!
Thanks!
-
Re: Count word frequency with linked list
Do you have to make the list yourself. This is c++ not c and in c++ we have std::list<type> which suffices for almost all linked list needs.
Personally i think a list is a bad idea because the check for duplicates is very slow as you must start at the head of the list and compare each node in turn until you find duplicate or reach the end of the list. This is far better done with a tree structure which can be searched a lot quicker.
#include <stdio.h>
#include <stdlib.h>
are C headers and have no place in a c++ program. <cstdio> and <cstdlib> are the c++ versions.
-
Re: Count word frequency with linked list
No, I dont need to make the list myself as long as it holds the word and the amount of times it appears. I guess if it also includes similar or appropriate functions I may even be better off...
Did you see any mistakes in my logic that would make the program crash though? I am likely to still use similar logic even when changing the list to the standard C++ version.
-
Re: Count word frequency with linked list
If you run it in the debugger it will show you exactly where it crashes and most likely why.
-
Re: Count word frequency with linked list
Ah, I see... I've never done anything like that before. Thanks!
It says that I have a segmentation fault at the line of code that reads
Code:
start->word[50]=cWord[50];
Does a segmentation fault mean that there is no value being assigned?
I also added this line in because I realized I never stored data into cWord.
Code:
fscanf(pfTextfile, "%s", cWord);
-
Re: Count word frequency with linked list
For an array of size 50, valid indexes are between 0 and 49.
-
Re: Count word frequency with linked list
Never use feof() to test for end of file. Its a bug to do that. You should use the return value from your file reading to control the loop.
-
Re: Count word frequency with linked list
Ok, I should have put a loop in to assign the different chars to the array which means I should have a word size count...
I tried this and still have the same error on the same line.
Code:
//read the next word from file, store as cWord
fscanf(pfTextfile, "%s", cWord);
current = start; //node that points to beginning
WordSize = 0;
//get the size of the word
while (cWord[WordSize]!='\0')
{
WordSize++;
}
// make temp first node
if (start == NULL)
{start = temp;
for(int wd=0;wd<WordSize;wd++)
{ start->word[wd]=cWord[wd];
}
start->count=1;
}
And to get around the feof() issue, all I need to do is set a condition to terminate the while loop if I get a '\0' character into cWord?
-
Re: Count word frequency with linked list
-
Re: Count word frequency with linked list
To get around the feof issue doesn't take much....
do something like..
while ( fscanf( pfTextfile, "%s" , cWord ) != EOF )
as I said control the loop with your file read function and not feof.
As an aside, try to work out exactly why feof is a bug!
-
Re: Count word frequency with linked list
Thanks, that saves me a few lines of code. I just need to get past the segmentation error now...
Could it be that a comma ',' as the last character in the array of chars is making the system throw this error?
-
Re: Count word frequency with linked list
Quote:
Originally Posted by
arbor33
Thanks, that saves me a few lines of code. I just need to get past the segmentation error now...
Could it be that a comma ',' as the last character in the array of chars is making the system throw this error?
I told you what the error is in post 6.
-
Re: Count word frequency with linked list
Oh, sorry. I did change that to a loop that I mentioned above so that I could actually assign the complete value of cWord and not just the last char.
It now looks like this and still gives the same error.
Code:
if (start == NULL)
{start = temp;
for(int wd=0;wd<strlen(cWord);wd++)
{ start->word[wd]=cWord[wd];
}
start->count=1;
}
-
Re: Count word frequency with linked list
Quote:
Originally Posted by
arbor33
Oh, sorry. I did change that to a loop that I mentioned above so that I could actually assign the complete value of cWord and not just the last char.
It now looks like this and still gives the same error.
Code:
if (start == NULL)
{start = temp;
for(int wd=0;wd<strlen(cWord);wd++)
{ start->word[wd]=cWord[wd];
}
start->count=1;
}
You're doing too much work here. Just use strcpy to copy a null terminated string.
-
Re: Count word frequency with linked list
Okay, I've changed it to this
Code:
strcpy(start->word,cWord);
I'm still getting the segmentation problem but now it doesnt tell me where. What a headache this has become...
-
Re: Count word frequency with linked list
Quote:
Originally Posted by
arbor33
Okay, I've changed it to this
Code:
strcpy(start->word,cWord);
I'm still getting the segmentation problem but now it doesnt tell me where. What a headache this has become...
It has to give you a line causing the error. You need to look around and see which of your variables is out of whack.
-
Re: Count word frequency with linked list
Quote:
No, I dont need to make the list myself as long as it holds the word and the amount of times it appears.
Then consider using std::list.
Also, you are using C++'s std::cin & std::cout, but you're ignoring std::ifstream and std::string! Using these would simplify much of the code you have written.
The method that most C++ coders would probably use in 'real life' would be to use std::map where the key is the word in question and the value is the count.
Code:
std::map<std::string, int> word_counts;
ifstream file("somefile.txt");
std::string word;
file >> word;
++word_counts[word];
-
Re: Count word frequency with linked list
Try posting again your latest revision to your code so we can check it over for you.
Personally i would start with a std::list<std::string> and not a homegrown implementation of a list and char*/char[]. Once you have that running properly, then maybe you can look at substituting a homegrown list for the std::list and when thats running fine, finally work out how to turn your std::strings into old c style strings. Its pretty easy to write using the standard library lists and strings and really if you are learning c++ these should be introduced long before you learn to roll your own implementations or play with c style strings.
I suggest you try get a copy of accelerated c++ which tries to teach the language the right way instead of trying to teach C then teach how C++ differs from C.
-
Re: Count word frequency with linked list
Code:
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <string.h>
using namespace std;
struct node
{ char word[50]; // word being inputted
int count; // word frequency
node *next; // Pointer to next node
};
node *start = NULL;
void add_node(char nWord[50]);
int main()
{ char cWord[50];
char cFile[30];
FILE *pfTextfile;
node *temp; //temporary node to store data
node *current; //node to traverse with
int TravFlag = 0;
cout << "Enter the file you wish to have searched:\n";
cin >> cFile;
// open file
pfTextfile = fopen(cFile, "r");
// while ( fscanf(filepoint, "%s" , cWord ) != EOF ) was skipping the first word
while(cWord[strlen(cWord)]!=EOF)
{if(cWord[strlen(cWord)]==EOF)
break;
TravFlag=0;
//clear out cWord
for (int i=0; i<50; i++)
{ cWord[i]='\0';
}
//read from file store in cWord
fscanf(pfTextfile, "%s", cWord);
current = start; //node that points to beginning
//Get rid of punctuation
while (cWord[strlen(cWord)-1]=='.' || cWord[strlen(cWord)-1]=='\"' || cWord[strlen(cWord)-1]=='!' ||
cWord[strlen(cWord)-1]==',' || cWord[strlen(cWord)-1]=='?')
{cWord[strlen(cWord)-1]='\0';
}
cout<<cWord<<endl;
// make temp first node
if (start == NULL)
{start = temp;
strcpy(start->word,cWord);
start->count=1;
}
else if(start!=NULL) //if theres already a node
{ while(TravFlag==0)//Loop to Traverse
{if(current->word==cWord)//If current node is same word
{current->count=current->count+1;//count increases by 1
TravFlag=1; //set flag to end loop
}
else if(current->word!=cWord&¤t->next==NULL)
{//end of the list
add_node(cWord);
TravFlag=1;
}
else
current = current->next; //move to next node
}
}
}
fclose(pfTextfile);
//Print em
current=start;
do
{if(current==NULL)
cout<<"End of list."<<endl;
else
{ cout<< "The word "<<current->word<<" appears "<<current->count<<" times."<<endl;
current=current->next;
}
}
while(current!=NULL);
}
void add_node(char nWord[50])
{ node *temp, *temp2; // Temporary pointers
temp = new node;
strcpy(temp->word,nWord);
temp->count=1;
temp->next = NULL;
// Set up link to this node
if (start == NULL)
start = temp;
else
{ temp2 = start;
while (temp2->next != NULL)
{ temp2 = temp2->next; // Move to next node
}
temp2->next = temp;
}
}
I really appreciate everyones advice and support. I'm surprised my professor never mentioned these standard lists though...
-
Re: Count word frequency with linked list
Quote:
Originally Posted by
arbor33
I really appreciate everyones advice and support. I'm surprised my professor never mentioned these standard lists though...
Just for the record, here's what the standard template library can do for you. (This should be equivelent to your code.)
Code:
#include <iostream>
#include <map>
#include <fstream>
#include <string>
using namespace std;
typedef map<string, int> word_count_list;
int main()
{
word_count_list word_count;
string filename;
// Get the filename.
cout << "Enter the file you wish to have searched:\n";
cin >> filename;
// Open file.
ifstream file(filename.c_str());
// Read in all the words.
string word;
while (file >> word)
{
// Remove punctuation.
int index;
while ((index = word.find_first_of(".,!?\\")) != string::npos)
{
word.erase(index, 1);
}
++word_count[word];
}
// Print out the word counts.
word_count_list::const_iterator current(word_count.begin());
while (current != word_count.end())
{
cout << "The word '" << current->first << "' appears " << current->second <<" times." << endl;
++current;
}
}
-
Re: Count word frequency with linked list
Oh wow. Thanks so much! I'm going to have to read into all this on my own or just ask my teacher to cover things that are so useful like this.
-
Re: Count word frequency with linked list
From my experience on these forums, it's quite clear that the teaching of C++ in colleges and universities is not always very good. It's no wonder that students often come away thinking that Java is 'really cool' and C++ is just a 'legacy' language.
The STL has been around for over 10 years, though it's taken a while for compilers to catch up properly.
When Java is being taught, I bet they don't expect you to ignore the standard containers and strings, so why should you in C++?
I think that the STL should be introduced early, showing the concepts of containers, iterators, algorithms and functors and then later, once the concepts have been understood, exploring how they may be implemented. I think it's much better to learn that pointers are a type of iterator, albeit reduced functionality, rather than learn the intricacies of pointers first and then trying to get your head around the idea that STL iterators are not pointers!
-
Re: Count word frequency with linked list
Quote:
Originally Posted by
JohnW@Wessex
When Java is being taught, I bet they don't expect you to ignore the standard containers and strings,
In Java, it is required, both as a student and a professional to not use hand-coded solutions when libraries from Sun are available. It's the exact opposite with C++, where for some reason, you're considered an "ace programmer" (usually by other programmers suffering from NIH syndrome) when you do everything by hand.
For a description of NIH:
http://en.wikipedia.org/wiki/Not_Invented_Here
Regards,
Paul McKenzie