problem in reading a csv file
Hi all,
I have a problem reading a CSV file. When I read a comma-separated text file as CSV,
I get extra columns because some of the text values themselves contain commas. For example, my text file looks like this:
/........................../
philipsco.ltd,usa,james
k.w.p. philipines,co.ltd, new jersey, Rosy
Samsung co,ltd, japan, rekhamithal
/....................../
When I read this file, the comma in "philipines,co.ltd" splits that value into two columns, but it has to stay in one column. Please help me, I need this urgently.
Re: problem in reading a csv file
Using my crystal ball, I can tell that you are doing something wrong.
Seriously, go back and re-read your post. Remember, we have NO other knowledge about what you are doing. How do you expect any of us to guess?
Re: problem in reading a csv file
Hi cpuwiz,
I'm guessing that what krish meant was something like splitting the following string:
k.w.p. philipines,co.ltd, new jersey, Rosy
into this:
k.w.p. philipines,co.ltd
new jersey
Rosy
but since it's a CSV, it's splitting it into this:
k.w.p. philipines
co.ltd
new jersey
Rosy
i've been wrong before, though....
cheers!
Re: problem in reading a csv file
Your CSV format isn't correct. In CSV files, text is usually enclosed in " marks. This prevents commas inside the text from messing up your field separation. A " inside the text is escaped by a second ". This is usually how CSV files are built.
So your code has to check whether a value starts with a ". If it does, it is a text value, and you keep reading until you hit another ". Then check whether another " follows it immediately; if so, drop one of the two, stay in text-reading mode, and go on to the next ".
When you find a " that doesn't have another " after it, the text is complete and you can expect a , to switch to the next field.
There are many variants on this idea, using ; instead of , to separate values, or . or tabs, etc. Sometimes the first row contains the column names, sometimes it is raw data. And of course, back in the days when CSV was invented, object-oriented coding and such was practically non-existent, so you had to rewrite the whole thing from scratch if you needed another kind of import engine (read: copy-paste coding).
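The reading rules described above could be sketched roughly like this in C#. This is only a minimal sketch: the class name CsvLine, the method Parse and its delimiter parameter are made up for illustration, and it assumes one record per line (no line breaks inside quoted text) and that an opening " sits directly at the start of a field.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CsvLine
{
    // Hypothetical helper: splits one CSV line into fields.
    // A field that starts with " is read until a " that is not doubled;
    // a doubled "" inside the field stands for one literal ".
    public static List<string> Parse(string line, char delimiter = ',')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c); // ordinary character, delimiters included
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote: keep one, skip the other
                    i++;
                }
                else
                {
                    inQuotes = false; // closing quote: text is complete
                }
            }
            else if (c == '"' && current.Length == 0)
            {
                inQuotes = true; // opening quote at the start of a field
            }
            else if (c == delimiter)
            {
                fields.Add(current.ToString()); // delimiter outside quotes ends the field
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing delimiter
        return fields;
    }

    static void Main()
    {
        // Prints three lines: "k.w.p. philipines,co.ltd", "new jersey", "Rosy" (without the quotes).
        foreach (string field in Parse("\"k.w.p. philipines,co.ltd\",new jersey,Rosy"))
            Console.WriteLine(field);
    }
}
```

Passing ';' or '\t' as the delimiter covers the variants mentioned above. A line with unquoted commas inside the text, like the ones in the original file, still cannot be split correctly by this or any other reader; the file has to be written with quotes first.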
Re: problem in reading a csv file
However, there is something wrong with the input CSV file. AFAIK, text columns must be surrounded by double quotes so that the commas inside them are not confused with separators:
"philipsco.ltd", "usa", "james"
"k.w.p. philipines,co.ltd", "new jersey", "Rosy"
"Samsung co,ltd", "japan", "rekhamithal"
Re: problem in reading a csv file
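Thread1 is correct. According to the CSV specification (RFC 4180), a field that contains commas should be enclosed in double quotes.
This code should split a single CSV line on the commas that lie outside double quotes. Note that it leaves the surrounding quotes in each value and does not handle quoted text that spans several lines: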
Code:
// Requires: using System.Text.RegularExpressions; each array element is one field of Line.
string[] Fields = Regex.Split(Line, ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
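To show what that split returns and how the values might be cleaned up afterwards, here is a small sketch. The class RegexSplitDemo and the method SplitLine are hypothetical names, and the cleanup step is an assumption of mine rather than part of the pattern: it trims whitespace around every value (so a space after a comma is dropped), removes one pair of surrounding quotes, and turns "" back into ".

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexSplitDemo
{
    // Same pattern as in the post: a comma followed by an even number of quotes,
    // which means the comma itself is outside any quoted text.
    const string Pattern = ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";

    // Hypothetical helper: split the line, then clean up each value.
    public static string[] SplitLine(string line)
    {
        string[] fields = Regex.Split(line, Pattern);
        for (int i = 0; i < fields.Length; i++)
        {
            string f = fields[i].Trim(); // assumption: outer whitespace is not significant
            if (f.Length >= 2 && f[0] == '"' && f[f.Length - 1] == '"')
            {
                // Drop the surrounding quotes and unescape doubled quotes.
                f = f.Substring(1, f.Length - 2).Replace("\"\"", "\"");
            }
            fields[i] = f;
        }
        return fields;
    }

    static void Main()
    {
        string line = "\"k.w.p. philipines,co.ltd\", \"new jersey\", \"Rosy\"";
        // Prints three lines; the first one still contains its comma.
        foreach (string f in SplitLine(line))
            Console.WriteLine(f);
    }
}
```

Without the cleanup loop, the values come back as "k.w.p. philipines,co.ltd", then "new jersey" and "Rosy" each with a leading space, and all three still wrapped in their quotes.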