December 31st, 2008, 11:17 PM
#17
Re: how to get length of UTF8 encoded string
 Originally Posted by Codeplug
>> Most of post #11 was about your claim of wchar_t being "portable". There are NO portable guarantees for wchar_t as it relates to character sets and encodings - except that a wchar_t can represent any char. As an integer type, it is as portable as int. The sizeof each is implementation-defined.
>> 1) ... roundtrippable
And wchar_t does not provide for this "across all system boundaries", as stated earlier, since system A may use one encoding/character set while system B uses something entirely different to represent wchar_t values. The only Unicode encoding that does provide for this is UTF-8 using chars, since endianness comes into play for the other UTFs.
I think we are saying the same thing from two different points of view...
As soon as an application starts to look at the content, things change, just like Schrödinger's cat or Heisenberg's uncertainty principle. As soon as you start talking about the meaning of the encoded information, everything does become implementation-dependent.
Consider the following sequence.
a) A file exists with a character encoding of "X".
b) This file is read and processed by an application which supports encoding "X".
c) A new file is written with encoding "X".
d) A different application on a different platform with a different sizeof(wchar_t), that ALSO supports encoding "X" reads and processes the file.
The internal byte representations in the two applications may be totally different, but the use of wchar_t as the internal processing mechanism will not destroy the portability of the information.
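To make the sequence above concrete, here is a minimal sketch, assuming the file encoding "X" is UTF-8 and that the input is well-formed. It uses char32_t and char16_t as stand-ins for a 4-byte and a 2-byte wchar_t so that the two sizes are explicit; all function names are hypothetical and written only for this illustration, and no validation of malformed input is attempted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decode well-formed UTF-8 into one code point per element.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        assert(i + extra < in.size());  // assumption: input is well-formed
        char32_t cp = lead;
        if (extra > 0) cp = lead & (0x3F >> extra);  // strip the length bits
        for (int k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Encode code points back to UTF-8.
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Convert code points to 16-bit units, using surrogate pairs above U+FFFF.
std::u16string utf32_to_utf16(const std::u32string& in) {
    std::u16string out;
    for (char32_t cp : in) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Convert 16-bit units back to code points, joining surrogate pairs.
std::u32string utf16_to_utf32(const std::u16string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t unit = in[i];
        if (unit >= 0xD800 && unit < 0xDC00 && i + 1 < in.size()) {
            char32_t low = in[++i];
            unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
        }
        out.push_back(unit);
    }
    return out;
}

// "Application 1": holds the text in 4-byte units while processing.
std::string process_with_32bit_units(const std::string& file_bytes) {
    return utf32_to_utf8(utf8_to_utf32(file_bytes));
}

// "Application 2": holds the text in 2-byte units while processing.
std::string process_with_16bit_units(const std::string& file_bytes) {
    std::u16string internal = utf32_to_utf16(utf8_to_utf32(file_bytes));
    return utf32_to_utf8(utf16_to_utf32(internal));
}
```

Feeding the same UTF-8 bytes through both functions should give back identical bytes, even though one application stores a character outside the BMP in one unit and the other stores it in two. The in-memory layout differs; the file does not.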
Because of this, it is critical to make use of the proper encoding classes when manipulating the data, and not every application or platform will support every encoding.
But the act of using wchar_t, per se, does NOT mean that the application is non-portable. What you DO with the information that is stored in the wchar_t-based variables is a completely different story.
TheCPUWizard is a registered trademark, all rights reserved. (If this post was helpful, please RATE it!)
In theory, there is no difference between theory and practice; in practice there is.