problem with text parsing function

Thread: problem with text parsing function

  1. #1
    Join Date
    Jun 2009
    Posts
    40

    problem with text parsing function

    I am writing a program to search through some text for information. It looks through a line like this

    <div class="lvPrice">$50,000</div>

    and pulls out the information contained in that line. It converts the whole line to a C-string, finds the "<div class="lvPrice">" part, then finds the "<" immediately after it, and prints the characters in between.

    The code works when the line is not preceded by any spaces, but when the line has leading spaces, the printed value is thrown off. For example, if the line is indented by 1 space, as in

     <div class="lvPrice">$50,000</div>

    then the printed value starts 1 character earlier in the sequence. So instead of printing "$50,000", as it does when the line has no leading spaces, the indented line prints ">$50,00". Why is this happening, and what can I do to fix it? Thanks in advance for your help.

    Here is the function that does the parsing:

    Code:
    // function to parse a file and add it to a new file
    void parseandwrite (string writefile, string searchword)
    {
         
         string line;
         
         ifstream myfile (writefile.c_str());
      if (myfile.is_open())
      {
      cout << "VIEW OF FILE " << writefile << endl;
        while (! myfile.eof() )
        {
          getline (myfile,line);
          
          //convert the line to a cstr for easier use
          char * linechar, * swordchar;
          linechar = new char [line.size()+1];
          strcpy(linechar, line.c_str());
          swordchar = new char [searchword.size()+1];
          strcpy(swordchar, searchword.c_str());
          
          
          
          //search through the line for the term and end or info and write info
          int linesize, index = 0, found, end;
          char read;
          
          linesize = line.size();
                
          for (int k=0; k<linesize; k++)
          {
              // search for the word using the char by char check and incrementing index
              if (linechar[k] == swordchar[index])
              {
                              index++;
              }
              else index = 0;
              
              //check to see if the word has been found
              if (index == searchword.size())
              {
                        found = index;
                        for (int r=found; r<linesize; r++)
                        {
                            if (linechar[r] == '<')
                            {
                                            end = r;
                                            break;
                            }
                        }
                        cout << "Price:";
                        for (int g=found; g<end; g++)
                        {
                            cout<< linechar[g];
                        }
                        cout << endl;
              }
          }
                
          
          
          
        }
        myfile.close();
      }
    
      else cout << "Error opening the file" << endl; 
    
    }

  2. #2
    Join Date
    Jan 2004
    Location
    Düsseldorf, Germany
    Posts
    2,401

    Re: problem with text parsing function

    Quote Originally Posted by hojoff79 View Post
    I am writing a program to search through some text for information. It looks through a line like this

    <div class="lvPrice">$50,000</div>

    and pulls out the information contained in that line. It converts the whole line to a C-string, finds the "<div class="lvPrice">" part, then finds the "<" immediately after it, and prints the characters in between.
    Why C-Strings? Use std::string::find and std::string::substr instead.

  3. #3
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,565

    Re: problem with text parsing function

    As treuss mentioned, it will be easier to use std::string functions.

    If you stay with your current setup: you are leaking memory within
    the loop (two new[] calls, no matching delete[] calls).
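
    If the C-string version is kept, the shifted output comes from the line `found = index;`. At that point `index` always equals the length of the search word, so printing starts at that fixed column no matter where in the line the match actually ended. A minimal sketch of one possible fix follows: it uses `k + 1` as the start position and releases both buffers. The helper name `extractValueCStr` and its one-line-in, one-string-out interface are assumptions made for illustration, not part of the original program.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: returns the text between the end of `searchword`
// and the next '<' in `line`, or an empty string if there is no match.
std::string extractValueCStr(const std::string& line, const std::string& searchword)
{
    if (searchword.empty()) return std::string();

    // Same C-string copies as in the original post.
    char* linechar = new char[line.size() + 1];
    std::strcpy(linechar, line.c_str());
    char* swordchar = new char[searchword.size() + 1];
    std::strcpy(swordchar, searchword.c_str());

    const std::size_t linesize = line.size();
    std::size_t index = 0;
    std::string result;

    for (std::size_t k = 0; k < linesize; k++)
    {
        if (linechar[k] == swordchar[index])
        {
            index++;
        }
        else
        {
            // Restart, letting the current character begin a new match.
            // This is a simple restart only, not a general substring search.
            index = (linechar[k] == swordchar[0]) ? 1 : 0;
        }

        if (index == searchword.size())
        {
            // The value starts right after the last matched character,
            // i.e. at k + 1, not at `index`.
            const std::size_t found = k + 1;

            // Default to the end of the line if no '<' follows.
            std::size_t end = linesize;
            for (std::size_t r = found; r < linesize; r++)
            {
                if (linechar[r] == '<')
                {
                    end = r;
                    break;
                }
            }
            result.assign(linechar + found, end - found);
            break;
        }
    }

    // Free both buffers so nothing leaks per line.
    delete[] linechar;
    delete[] swordchar;
    return result;
}
```

    Calling this once per line read by getline, with the full `<div class="lvPrice">` text as the search word, should give the same value whether or not the line is indented.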

  4. #4
    Join Date
    Mar 2009
    Location
    Riga, Latvia
    Posts
    128

    Re: problem with text parsing function

    Do you know what a finite state machine (automaton) is? You could use one to parse the string.
    Last edited by andrey_zh; June 29th, 2009 at 06:59 AM.
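
    As a rough illustration of the state machine idea, the sketch below walks a line one character at a time through three states. The state names, the function name `extractWithFsm`, and the simple restart rule are all assumptions chosen for this example; a real parser would need a more careful design.

```cpp
#include <cstddef>
#include <string>

// Hypothetical states for a tiny extraction automaton.
enum class State
{
    Matching,   // looking for the marker, e.g. <div class="lvPrice">
    Capturing,  // marker seen: collecting the characters of the value
    Done        // reached the '<' that ends the value
};

// Returns the text between `marker` and the next '<', or "" if not found.
std::string extractWithFsm(const std::string& line, const std::string& marker)
{
    if (marker.empty()) return std::string();

    State state = State::Matching;
    std::size_t matched = 0;   // number of marker characters matched so far
    std::string value;

    for (char c : line)
    {
        switch (state)
        {
        case State::Matching:
            if (c == marker[matched])
                ++matched;
            else
                matched = (c == marker[0]) ? 1 : 0;  // simple restart
            if (matched == marker.size())
                state = State::Capturing;
            break;

        case State::Capturing:
            if (c == '<')
                state = State::Done;
            else
                value += c;
            break;

        case State::Done:
            break;  // ignore the rest of the line
        }
    }

    // Accept only if the closing '<' was actually seen.
    return (state == State::Done) ? value : std::string();
}
```

    Because the machine only tracks how much of the marker has been matched and never a fixed column, leading spaces simply keep it in the Matching state a little longer.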

  5. #5
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    2,013

    Re: problem with text parsing function

    Quote Originally Posted by hojoff79 View Post
    I am writing a program to search through some text for information. It looks through a line like this

    <div class="lvPrice">$50,000</div>
    You might want to consider using an XML parser. There are some good, free C++ libraries available.
    Cheers, D Drmmr


  6. #6
    Join Date
    Jun 2009
    Posts
    40

    Re: problem with text parsing function

    I have been using C-strings so that I can use char[index] indexing to find the index where the > ends and the index where the < starts, and build a string from everything in between (i.e., the actual information). I would not know how to do this with string functions, i.e., how to single out the information between the <div class="lvPrice"> and the next <.

  7. #7
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,565

    Re: problem with text parsing function

    As treuss mentioned, you would use substr to get the string
    between the ">" and the "<" ...

    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    
    using namespace std;
    
    // function to parse a file and add it to a new file
    void parseandwrite (const string & writefile, const string & searchword)
    {
        string line;
         
        ifstream myfile (writefile.c_str());
    
        if (myfile.is_open())
        {
            cout << "VIEW OF FILE " << writefile << endl;
            while ( getline (myfile,line) )
            {
                size_t pos1 = line.find(searchword);        // look for the search word
    
                if (pos1 != string::npos)                   // if found ...
                {
                    pos1 = line.find(">",pos1);             // find the >
    
                    if (pos1 != string::npos)               // if found ... 
                    {
                        size_t pos2 = line.find("<",pos1);  // find the <
    
                        if (pos2 != string::npos)           // if found ...
                        {
                            cout << searchword << " : " << line.substr(pos1+1,pos2-pos1-1) << "\n";
                        }
                    }
                }
            }
        }
        else
        {
            cout << "Error opening the file" << endl; 
        }
    }
    
    
    int main()
    {
        parseandwrite ("test.txt", "Price");
    
        return 0;
    }
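
    To try the same logic without a file on disk, the three find calls can be wrapped in a helper that takes the line directly. This is only a sketch: the name `extractValue` and the bool-plus-output-parameter interface are assumptions, not something from the code above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: on success, stores the text between the first '>'
// after `searchword` and the next '<' in `value` and returns true.
bool extractValue(const std::string& line, const std::string& searchword, std::string& value)
{
    std::size_t pos1 = line.find(searchword);      // look for the search word
    if (pos1 == std::string::npos) return false;

    pos1 = line.find('>', pos1);                   // find the '>' that closes the tag
    if (pos1 == std::string::npos) return false;

    std::size_t pos2 = line.find('<', pos1);       // find the next '<'
    if (pos2 == std::string::npos) return false;

    value = line.substr(pos1 + 1, pos2 - pos1 - 1);
    return true;
}
```

    Since find reports the position of the match wherever it occurs, leading spaces on the line do not shift the extracted value.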
