May 12th, 2010, 06:52 PM
#9
Re: Great Big Can O' Worms (UNICΘDЭ)
You've just opened another can.
The standard library treats every file stream as a stream of chars, encoded according to the LC_CTYPE category of the current locale. The "w" in wfstream only means that the class interface uses wchar_t strings, whose encoding is implementation-defined (UTF-16LE on Windows). Wide characters are converted to and from chars on their way to and from the file.
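To make that concrete, here is a minimal sketch that writes a wide string through a default wofstream and returns the raw bytes that end up in the file. The function name and the file path are made up for illustration, and the classic "C" locale is imbued explicitly so the result doesn't depend on the global locale.

```cpp
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: write `text` with an ordinary wofstream, then read the
// file back as raw bytes to see what was actually stored.
std::string bytes_written_by_default_wofstream(const std::string& path,
                                               const std::wstring& text) {
    {
        std::wofstream out;
        out.imbue(std::locale::classic());  // default codecvt<wchar_t, char>
        out.open(path, std::ios::binary);
        out << text;
    }  // stream is flushed and closed here

    std::ifstream in(path, std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```

For plain ASCII text such as L"abc" the file holds one byte per character, not sizeof(wchar_t) bytes. Non-ASCII characters depend on the locale and may simply put the stream into a failed state.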
To use formatted I/O on wide file streams and have the files themselves be wchar_t-encoded as well, you have to open the file as binary and replace the default codecvt facet. There's an example here: http://www.codeguru.com/forum/showpo...09&postcount=8
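A rough sketch of the "binary plus replaced codecvt facet" idea follows. Note that this is not the hand-written facet from the linked post: it swaps in std::codecvt_utf16 from C++11's <codecvt> header (deprecated since C++17), which did not exist when this thread was written. The helper names are made up, and on platforms with a 16-bit wchar_t this facet only covers the BMP.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: the stream's current locale with its codecvt facet
// replaced by one that stores UTF-16LE in the file (no BOM is written).
std::locale utf16le_locale(const std::locale& base = std::locale::classic()) {
    return std::locale(
        base, new std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>);
}

// Formatted output: "<label> <value>\n", stored as UTF-16LE.
bool write_utf16le(const std::string& path, const std::wstring& label, int value) {
    std::wofstream out;
    out.imbue(utf16le_locale());       // replace the facet before opening
    out.open(path, std::ios::binary);  // binary: no newline translation
    out << label << L' ' << value << L'\n';
    out.flush();
    return out.good();
}

// Formatted input of the same record.
bool read_utf16le(const std::string& path, std::wstring& label, int& value) {
    std::wifstream in;
    in.imbue(utf16le_locale());
    in.open(path, std::ios::binary);
    return static_cast<bool>(in >> label >> value);
}
```

The locale that owns the facet also deletes it, so the bare new is not a leak.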
You also have to look out for the BOM (byte order mark) that may sit at the beginning of the file, which tells you how the file is encoded: http://unicode.org/faq/utf_bom.html#bom1
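As an illustration, BOM sniffing on the first few bytes of a file could look like the sketch below. The byte signatures are the ones listed in the Unicode FAQ; the enum, struct and function names are made up for this example.

```cpp
#include <cstddef>
#include <string>

enum class Bom { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

struct BomInfo {
    Bom kind;            // which signature was found
    std::size_t length;  // number of bytes to skip before the text starts
};

// Hypothetical helper: inspect the leading bytes of a file for a BOM.
BomInfo detect_bom(const std::string& bytes) {
    auto starts_with = [&bytes](const char* sig, std::size_t n) {
        return bytes.size() >= n && bytes.compare(0, n, sig, n) == 0;
    };
    // UTF-32LE must be tested before UTF-16LE because FF FE 00 00 also
    // begins with FF FE. (A UTF-16LE file whose first character is U+0000
    // would be misread as UTF-32LE; that case is ignored here.)
    if (starts_with("\xFF\xFE\x00\x00", 4)) return {Bom::Utf32LE, 4};
    if (starts_with("\x00\x00\xFE\xFF", 4)) return {Bom::Utf32BE, 4};
    if (starts_with("\xEF\xBB\xBF", 3))     return {Bom::Utf8, 3};
    if (starts_with("\xFF\xFE", 2))         return {Bom::Utf16LE, 2};
    if (starts_with("\xFE\xFF", 2))         return {Bom::Utf16BE, 2};
    return {Bom::None, 0};  // no BOM: the encoding has to be known or guessed
}
```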
More sample code for reading UTF-8 files into UTF-16LE, with or without a BOM (all at once): http://www.codeguru.com/forum/showpo...18&postcount=5
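For readers who can't follow the link, here is a portable sketch of the same idea, not the code from that post: slurp the whole file as bytes, skip a UTF-8 BOM if there is one, and decode by hand into a std::u16string (UTF-16 code units in native byte order, which is little-endian on x86). The function names are made up, and overlong encodings are not rejected, so treat it as a starting point only.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into UTF-16 code units; a leading UTF-8 BOM is skipped.
std::u16string utf8_to_utf16(const std::string& in) {
    std::size_t i = 0;
    if (in.compare(0, 3, "\xEF\xBB\xBF") == 0) i = 3;  // optional BOM

    std::u16string out;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i++]);
        std::uint32_t cp = 0;
        int extra = 0;  // number of continuation bytes that must follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        for (; extra > 0; --extra) {
            if (i >= in.size()) throw std::runtime_error("truncated UTF-8 sequence");
            const unsigned char c = static_cast<unsigned char>(in[i++]);
            if ((c & 0xC0) != 0x80) throw std::runtime_error("invalid continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {  // outside the BMP: emit a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Read the whole file at once and convert it.
std::u16string read_utf8_file(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) throw std::runtime_error("cannot open " + path);
    const std::string bytes((std::istreambuf_iterator<char>(file)),
                            std::istreambuf_iterator<char>());
    return utf8_to_utf16(bytes);
}
```

On Windows the result can be copied straight into a std::wstring, since wchar_t is a 16-bit UTF-16 code unit there.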
gg