  1. #1
    Join Date
    May 2008
    Posts
    18

    c++ - parsing windows textfile: how to strip extra characters?

    i've run into a problem reading a windows-generated textfile
    onto my linux (mandriva 2007.1) system using c++. it took me
    a long time and lots of help from the good folks here, but
    i've finally figured out that the issue i've run into seems
    to be one of extra, hidden characters in the original text file.

    i started out by processing one variable read from the textfile
    and had a lot of problems. i finally got around them by using
    substr to parse only the first three characters of the line
    read into my variable. that made things work.

    my thinking, however, is that every line in the text file
    probably has this same issue. that would argue for addressing
    it not at the individual variable level but at the file level:
    either when the text file is first read into my program, or by
    somehow preprocessing the text file before it is read.

    so, there i have two ideas to pursue: preprocessing the text
    file, or processing it as it is read.

    i'm reading the text file into a vector of strings with getline.
    how would i strip extra characters at that stage?

    alternately, how would i strip the extra characters before the
    text file comes into the program?

    the program is below.

    thanks,
    BabaG
    Code:
    #include <fstream>
    #include <iostream>
    #include <iomanip>
    #include <string>
    #include <vector>
    #include <assert.h>
    
    using namespace std;
    
    int main()
    {
       int count = 0;
    
       ifstream infile("file_to_be_parsed.txt");
    
       if (!infile)
       {
          cerr << "Could not open file." << endl;
    
          return 1;
       }
    
       vector<string> ScriptVariables;
       string line;
    
       while (getline(infile, line))
       {
          ScriptVariables.push_back(line);
       }
    
       infile.close();
    
    // lots of variables assigned from text file
    // this is the one that's been a problem in another thread
    
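       // make sure line 9 (index 8) exists before reading it
       if (ScriptVariables.size() <= 8)
       {
          cerr << "Expected at least 9 lines in the file." << endl;
          return 1;
       }

       string capformat = ScriptVariables[8];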
    
    // perform operations
    
       int cr2W = 4368; 
       int cr2H = 2912; 
    
       int nefW = 3872; 
       int nefH = 2592; 
    
       double CtrX = 0; 
       double CtrY = 0; 
    
       string capformatTrimmed = capformat.substr(0,3);
    
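       // assign to the CtrX/CtrY declared above; redeclaring them with
       // "double" inside the braces would create new local variables
       // and leave the outer ones at 0
       if (capformatTrimmed == "cr2")
          {
          CtrX = cr2W/2.0;
          CtrY = cr2H/2.0;
          }
       else if (capformatTrimmed == "nef")
          {
          CtrX = nefW/2.0;
          CtrY = nefH/2.0;
          }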
       else
          {
             cout << "something is wrong with cr2/nef line." << endl;
          }
    
       cout << CtrX << endl; 
       cout << CtrY << endl;
    
       return 0;
    }

  2. #2
    Lindley
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: c++ - parsing windows textfile: how to strip extra characters?

    Windows uses \r\n for line endings; Unix uses just \n. This is a well-known problem, and the reason I try never to open files in text mode on Windows: I'd prefer that it just write what I tell it to, and not try to insert extra \r characters all over the place.

    If you only have one word per line, the simplest thing would be to use operator>> rather than getline. Otherwise, you'll have to do the whitespace stripping yourself.

  3. #3
    Join Date
    May 2008
    Posts
    96

    Re: c++ - parsing windows textfile: how to strip extra characters?

    In my experience, cross-platform programs that cannot handle the difference between Win and Unix line endings (at bare minimum) always break when least desired.

    It gets me at home often enough that I wrote myself a little utility years ago that I'm still using that does the same as dos2unix, so my dumb windows programs can handle unix text files.

    You might just want to get lines with this little example
    http://www.codeguru.com/forum/showpo...0&postcount=11

    It handles LF (unix), CRLF (windows), and CR (mac). Enjoy!

  4. #4
    Join Date
    Aug 2005
    Location
    San Diego, CA
    Posts
    1,054

    Re: c++ - parsing windows textfile: how to strip extra characters?

    Quote Originally Posted by Duoas
    In my experience, cross-platform programs that cannot handle the difference between Win and Unix line endings (at bare minimum) always break when least desired.

    It gets me at home often enough that I wrote myself a little utility years ago that I'm still using that does the same as dos2unix, so my dumb windows programs can handle unix text files.

    You might just want to get lines with this little example
    http://www.codeguru.com/forum/showpo...0&postcount=11

    It handles LF (unix), CRLF (windows), and CR (mac). Enjoy!

    I find it hard to believe that you are trying to convince us that your switch example is not spaghetti code!

  5. #5
    Join Date
    Aug 2005
    Location
    San Diego, CA
    Posts
    1,054

    Re: c++ - parsing windows textfile: how to strip extra characters?

    When you store the text file on the Linux computer, there should be a built-in shell command to convert it for you.

    Are you doing this as an exercise, or do you actually need the program to be portable enough to read both types of files? On Windows, getline gives you the string without the \r\n, so I have no clue what getline would do on a Linux system if it encountered a \r\n. You'd think it would treat the \r as an ordinary character and just read it, since it doesn't have any special meaning to a Linux computer.

    I don't know Linux, so if I really wanted to write a program to do this, I'd first step into the code with a debugger and see what string is in memory after a getline operation. If the "\r" is there, you should be able to use find and erase to get rid of it after each getline.

    If it were me, though, I would convert the file from DOS to Unix using a shell command before running my C++ program.

  6. #6
    Join Date
    May 2008
    Posts
    96

    Re: c++ - parsing windows textfile: how to strip extra characters?

    No you twit, that's elegance.

    "A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away."

    If you think it is spaghetti code then you've never actually seen spaghetti code, nor do you have an appreciation for its actual drawbacks and lack of structure.

    I never said you had to like it, and I've made no attempt to force anyone to use it. Examples of useful goto are so few and far-between it just so happened that I had it handy.

    If you don't like it, don't read my posts. But keep your religious bigotry out of other people's threads. Either offer a better solution, or shut up.

    [edit] It occurs to me that you weren't being a jerk, but just joshing me. If you were, I'm sorry. I guess I'm a bit defensive... :-S
    Last edited by Duoas; May 19th, 2008 at 06:00 PM.
