Reading contents of a document file


ganeshsheshadri
January 11th, 2002, 01:16 PM
Hi,

I have the following code (in C++) to open and read a Word document (.doc file), but the getline method always returns junk characters. Why does this happen? The same code works for .txt files. Is there something else that I need to do when reading MS Word files?

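#include <fstream>
#include <windows.h>

char buf[80];
std::ifstream stream;
stream.open("C:\\test1.doc");
// getline returns the stream itself, which tests false once a read fails
while (stream.getline(buf, 80))
{
    MessageBox(NULL, buf, "", MB_OK);
}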

Thanks in advance,
GPS

hplmuc
January 11th, 2002, 01:30 PM
Well, the main problem is that an MS Word .doc file does not contain only the textual characters you would expect to see in a plain .txt file. The extra bytes are what make the difference between MS Word and Notepad: information such as the fonts used, font sizes, .dot template files, character formatting and much more is included in every Word .doc file as well (not to mention macros). So reading the file line by line as plain text, the way you did, will not work.

(Just as proof: try opening your test1.doc with notepad.exe, or even better with the good old debug.com, in case you know how to handle that program, and you will see your junk characters again.)
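The same check can be done in code. This is only a minimal sketch, and it assumes the file is a Word 97-2003 style .doc, which is stored in the OLE compound file container and begins with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1. The helper name looksLikeOleCompoundFile is made up for illustration and is not part of any library.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not from the original post): returns true if the file
// starts with the 8-byte OLE compound file signature, i.e. it is a binary
// container rather than plain text.
bool looksLikeOleCompoundFile(const std::string& path)
{
    static const unsigned char kSignature[8] =
        {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    // Open in binary mode so no newline translation touches the bytes.
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return false;               // file could not be opened

    char header[8];
    in.read(header, sizeof header);
    if (in.gcount() != 8)
        return false;               // file is shorter than the signature

    for (std::size_t i = 0; i < 8; ++i)
    {
        if (static_cast<unsigned char>(header[i]) != kSignature[i])
            return false;
    }
    return true;
}
```

If looksLikeOleCompoundFile("C:\\test1.doc") returns true, the file is a binary container, and reading it with getline will produce the junk characters described above.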

bye

hplmuc