Thread: Reading a CSV

  1. #1
    Join Date
    Dec 2000
    Posts
    129

    Reading a CSV

    I have a CSV file in the format "data", "more data". I found out how to use getline to read it in, which works fine, and I just remove the quotes. The problem is that some fields contain commas, e.g. "data1, data2", "more data", so splitting on commas breaks those fields apart.

    How can I read the data in more reliably? Or can I split each line on the ", sequence instead of on a bare comma?

    Thanks

  2. #2
    Join Date
    Nov 2003
    Location
    Belgium
    Posts
    8,150

    Re: Reading a CSV

    Several solutions:
    • Search the internet to find and use an already written CSV class reader.
    • Split each line before removing the " characters, as follows: find each pair of matching " characters, extract the data between them, and store it in an array of some sort. Pay attention to how a " character is encoded when it appears inside the data itself; it is probably escaped, e.g. as \" or as a doubled "".
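    The second suggestion can be sketched roughly as follows. This is only a minimal illustration: the function name extractQuotedFields is hypothetical, and it assumes that every field is wrapped in double quotes and that no quote character ever appears inside a field.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the text between each opening/closing pair of
// double quotes on one line. Assumes every field is quoted and that a quote
// never appears inside a field (no escape handling at all).
std::vector<std::string> extractQuotedFields(const std::string& line) {
    std::vector<std::string> fields;
    std::string::size_type open = line.find('"');
    while (open != std::string::npos) {
        std::string::size_type close = line.find('"', open + 1);
        if (close == std::string::npos) {
            break;  // unmatched opening quote: stop rather than guess
        }
        // Copy the characters strictly between the two quotes.
        fields.push_back(line.substr(open + 1, close - open - 1));
        open = line.find('"', close + 1);  // look for the next field
    }
    return fields;
}
```

    Because commas are never inspected, a comma inside a quoted field is simply kept as part of that field. Unquoted fields are skipped entirely, so this only suits files where every field is quoted.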
    Marc Gregoire - NuonSoft (http://www.nuonsoft.com)

  3. #3
    Join Date
    Dec 2000
    Posts
    129

    Re: Reading a CSV

    Found that Boost has a solution. Here is the info in case anyone else runs into this. I am just giving the gist of it.

    Code:
    #include <boost/tokenizer.hpp>
    #include <iostream>
    #include <string>
    ...
    typedef boost::tokenizer<boost::escaped_list_separator<char> > Tokenizer;
    std::string line;
    while (std::getline(in, line)) {  // Read the file line by line
    	Tokenizer tok(line);  // Defaults: ',' separator, '"' quote, '\' escape
    	for (Tokenizer::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
    		std::cout << *beg << '\n';  // Print one field per line
    	}
    }

  4. #4
    Join Date
    Oct 2000
    Location
    London, England
    Posts
    4,773

    Re: Reading a CSV

    The simplest way (if you write your own) is to load a line at a time from the file.

    Then read the line a character at a time.

    If you encounter a quote character " then your mode toggles.

    When you are reading in normal mode, a , indicates a new token. When you are reading in literal mode (having encountered a quote character), the comma is part of the token.

    I am not sure whether CSV allows quotes to be part of the actual text. In text formats that do, this is normally achieved with an escape sequence, which is generally 2 characters long. (Note that in XML, escape sequences begin with a & character and end with a semicolon. HTML uses the same sequences.)

    You should be prepared for potential parse errors.
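    The character-by-character approach described above might look something like this. It is a sketch under stated assumptions, not a complete CSV parser: the function name splitCsvLine is hypothetical, a quote inside a quoted field is assumed to be written as a doubled "" (the thread does not confirm which escape the file uses), and a line that ends while still in literal mode is reported as a parse error.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits one line into tokens using a quote-toggled mode.
// Assumption: a doubled quote ("") inside a quoted field stands for one
// literal quote character.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> tokens;
    std::string current;
    bool inQuotes = false;  // false = normal mode, true = literal mode
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == '"') {
            if (inQuotes && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';  // doubled quote: keep one, skip the other
                ++i;
            } else {
                inQuotes = !inQuotes;  // toggle between the two modes
            }
        } else if (c == ',' && !inQuotes) {
            tokens.push_back(current);  // comma in normal mode ends the token
            current.clear();
        } else {
            current += c;  // any other character belongs to the token
        }
    }
    if (inQuotes) {
        throw std::runtime_error("unterminated quoted field");
    }
    tokens.push_back(current);  // last token has no trailing comma
    return tokens;
}
```

    Whitespace outside the quotes is kept as part of the token, so a line written as "data1, data2", "more data" yields a second token with a leading space; trimming would be a separate step. Quoted fields that span several lines are not handled, since the input is a single line.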

  5. #5
    Join Date
    Dec 2000
    Posts
    129

    Re: Reading a CSV

    What I posted works 100%.
