-
June 29th, 2009, 01:29 AM
#1
problem with text parsing function
I am writing a program to search through some text for information. It looks through a line like this
<div class="lvPrice">$50,000</div>
and pulls out the information contained in that line (it transforms the whole line to a cstring, goes through and finds the "<div class="lvPrice">" part, then finds the "<" immediately after this, and then prints the cstring elements in between them).
The code works when the line is not preceded by any spaces, but when the line has some leading spaces, it somehow throws off the printing of the value. For example, if the line is indented 1 space as such
 <div class="lvPrice">$50,000</div>
then the value printed out will start 1 character earlier in the sequence. So instead of printing "$50,000" like it does for the line with no spaces, the indented line would print ">$50,00". Why is this happening and what can I do to fix it? Thanks in advance for your help.
Here is the code of the function that is doing the processing:
Code:
// function to parse a file and add it to a new file
void parseandwrite (string writefile, string searchword)
{
    string line;
    ifstream myfile (writefile.c_str());
    if (myfile.is_open())
    {
        cout << "VIEW OF FILE " << writefile << endl;
        while (! myfile.eof() )
        {
            getline (myfile,line);
            //convert the line to a cstr for easier use
            char * linechar, * swordchar;
            linechar = new char [line.size()+1];
            strcpy(linechar, line.c_str());
            swordchar = new char [searchword.size()+1];
            strcpy(swordchar, searchword.c_str());
            //search through the line for the term and end or info and write info
            int linesize, index = 0, found, end;
            char read;
            linesize = line.size();
            for (int k=0; k<linesize; k++)
            {
                // search for the word using the char by char check and incrementing index
                if (linechar[k] == swordchar[index])
                {
                    index++;
                }
                else index = 0;
                //check to see if the word has been found
                if (index == searchword.size())
                {
                    found = index;
                    for (int r=found; r<linesize; r++)
                    {
                        if (linechar[r] == '<')
                        {
                            end = r;
                            break;
                        }
                    }
                    cout << "Price:";
                    for (int g=found; g<end; g++)
                    {
                        cout<< linechar[g];
                    }
                    cout << endl;
                }
            }
        }
        myfile.close();
    }
    else cout << "Error opening the file" << endl;
}
-
June 29th, 2009, 05:43 AM
#2
Re: problem with text parsing function
Originally Posted by hojoff79
I am writing a program to search through some text for information. It looks through a line like this
<div class="lvPrice">$50,000</div>
and pulls out the information contained in that line (it transforms the whole line to a cstring, goes through and finds the "<div class="lvPrice">" part, then finds the "<" immediately after this, and then prints the cstring elements in between them).
Why C-Strings? Use std::string::find and std::string::substr instead.
-
June 29th, 2009, 06:48 AM
#3
Re: problem with text parsing function
As treuss mentioned, it will be easier to use std::string functions.
If you stay with your current setup: you are leaking memory within
the loop (two new's ... no delete's).
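A minimal sketch of what staying with a character-by-character search could look like without the leak. The helper name extractAfter and its signature are assumptions, not something from the posted code. std::string can be indexed with [] just like a char array, so the new char[] copies are not needed at all. The sketch also takes the start of the value from the position in the line (start + index) rather than from the match length. In the posted code, found = index always equals searchword.size(), which would explain why leading spaces shift the printed text.

```cpp
#include <string>

// Hypothetical helper (name and signature are assumptions).
// Looks for `searchword` in `line` by comparing characters directly on the
// std::string, then returns the text between the end of the match and the
// next '<'. Returns an empty string if either part is missing.
std::string extractAfter(const std::string& line, const std::string& searchword)
{
    if (searchword.empty())
        return "";

    for (std::size_t start = 0; start + searchword.size() <= line.size(); ++start)
    {
        // Count how many characters of searchword match at this start position.
        std::size_t index = 0;
        while (index < searchword.size() && line[start + index] == searchword[index])
            ++index;

        if (index == searchword.size())
        {
            // Position in the line just past the match, not the match length.
            const std::size_t found = start + index;
            for (std::size_t r = found; r < line.size(); ++r)
            {
                if (line[r] == '<')
                    return line.substr(found, r - found);
            }
            return "";   // no '<' after the match on this line
        }
    }
    return "";           // searchword does not occur in the line
}
```

Called with the full tag as the search word, this should give the same value whether or not the line is indented, and nothing is allocated with new, so there is nothing to delete.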
-
June 29th, 2009, 06:57 AM
#4
Re: problem with text parsing function
Do you know what a finite state machine (automaton) is? You could use one to parse the string.
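To make the idea concrete, here is a rough sketch of a three-state machine for this kind of line. The state names, the function name captureAfterTag, and the choice to match on the tag contents (the part between '<' and '>') are all assumptions made for illustration, not a definitive design.

```cpp
#include <string>

// Hypothetical states: reading ordinary text, reading inside <...>,
// and collecting the value that follows the tag we are looking for.
enum ParseState { InText, InTag, InValue };

// Returns the text that follows the first tag whose contents equal `tagBody`,
// up to the next '<'. Returns an empty string if no complete match is seen.
std::string captureAfterTag(const std::string& input, const std::string& tagBody)
{
    ParseState state = InText;
    std::string tag;     // contents of the tag currently being read
    std::string value;   // characters collected after the matching tag

    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const char c = input[i];
        switch (state)
        {
        case InText:
            if (c == '<') { tag.clear(); state = InTag; }
            break;
        case InTag:
            if (c == '>')
                state = (tag == tagBody) ? InValue : InText;
            else
                tag += c;
            break;
        case InValue:
            if (c == '<')
                return value;      // the next tag starts, so the value is complete
            value += c;
            break;
        }
    }
    return "";   // input ended before the value was closed by a '<'
}
```

Because the machine only reacts to '<' and '>', leading spaces are simply skipped in the InText state, so indentation does not affect the result.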
Last edited by andrey_zh; June 29th, 2009 at 06:59 AM.
-
June 29th, 2009, 12:40 PM
#5
Re: problem with text parsing function
Originally Posted by hojoff79
I am writing a program to search through some text for information. It looks through a line like this
<div class="lvPrice">$50,000</div>
You might want to consider using an XML parser. There are some good, free C++ libraries available.
Cheers, D Drmmr
-
June 29th, 2009, 08:17 PM
#6
Re: problem with text parsing function
I have been using cstrings so that I can use char[index] indexing to find the index where the > ends and the index where the next < starts, and then make a string from everything in between (i.e., the actual information). I would not know how to do this with string functions (i.e., how to single out the information between the <div class="lvPrice"> and the next <).
-
June 29th, 2009, 09:25 PM
#7
Re: problem with text parsing function
As treuss mentioned, you would use substr to get the string
between the ">" and the "<" ...
Code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
// function to parse a file and add it to a new file
void parseandwrite (const string & writefile, const string & searchword)
{
    string line;
    ifstream myfile (writefile.c_str());
    if (myfile.is_open())
    {
        cout << "VIEW OF FILE " << writefile << endl;
        while ( getline (myfile,line) )
        {
            size_t pos1 = line.find(searchword); // look for the search word
            if (pos1 != string::npos) // if found ...
            {
                pos1 = line.find(">",pos1); // find the >
                if (pos1 != string::npos) // if found ...
                {
                    size_t pos2 = line.find("<",pos1); // find the <
                    if (pos2 != string::npos) // if found ...
                    {
                        cout << searchword << " : " << line.substr(pos1+1,pos2-pos1-1) << "\n";
                    }
                }
            }
        }
    }
    else
    {
        cout << "Error opening the file" << endl;
    }
}
int main()
{
    parseandwrite ("test.txt", "Price");
    return 0;
}
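If you would rather collect the values than print them, the same find/substr steps can be put behind any input stream. This is only a sketch: the name collectValues and the idea of returning a vector are assumptions, not part of the code above. Taking a std::istream also makes it possible to try the logic on a string through std::istringstream without needing a file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant: reads lines from any input stream and returns the
// text between the first '>' after `searchword` and the next '<' on each
// line where all three are found. Lines without a full match are skipped.
std::vector<std::string> collectValues(std::istream& in, const std::string& searchword)
{
    std::vector<std::string> values;
    std::string line;
    while (std::getline(in, line))
    {
        std::size_t pos1 = line.find(searchword);      // look for the search word
        if (pos1 == std::string::npos)
            continue;
        pos1 = line.find('>', pos1);                   // end of the opening tag
        if (pos1 == std::string::npos)
            continue;
        const std::size_t pos2 = line.find('<', pos1); // start of the next tag
        if (pos2 == std::string::npos)
            continue;
        values.push_back(line.substr(pos1 + 1, pos2 - pos1 - 1));
    }
    return values;
}
```

For a file, pass the std::ifstream directly. For a quick check, wrap the text in a std::istringstream and pass that instead.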