January 29th, 2009, 02:54 PM
#1
ANSI extended to unicode (for HTML)
How can an extended ANSI character (a byte with code >= 128) be converted to its Unicode value using the current code page?
For example, I'm writing code to output HTML text in UTF-8 encoding. My simple ANSI string contains a code for character 228 (which is a "ä"). When I write the string as HTML UTF-8 the character code 228 must be written as "&#228;" where the number between the "#" and ";" is the Unicode code.
In this particular case I can simply replace the character with value 228 with the string "&#228;" and it will work. But that seems to be improper and relies on the coincidence that extended ANSI character 228 corresponds to Unicode code 228 in the Swedish code page.
I think what I need to do is take any extended ANSI code of 128 or above and look up its Unicode value in the current code page for the proper conversion, but I cannot figure out how to do that look-up.
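To make the look-up I'm describing concrete, here is a rough Python sketch of the idea, not a definitive implementation. It assumes the ANSI code page is Windows-1252 (passed as "cp1252"); the function name ansi_to_html_entities and its codepage parameter are made up for illustration. Decoding the bytes with the code page is what performs the look-up, and every character outside ASCII is then written as a numeric character reference.

```python
def ansi_to_html_entities(data: bytes, codepage: str = "cp1252") -> str:
    """Map single-byte ANSI text to ASCII text with numeric character references.

    The code page name is an assumption; pass whichever one the text was
    actually written in (for example "cp1251" for Cyrillic).
    """
    # Decoding is the code-page look-up: each byte becomes a Unicode character.
    text = data.decode(codepage)

    pieces = []
    for ch in text:
        code_point = ord(ch)  # Unicode code point of the decoded character
        if code_point < 128:
            pieces.append(ch)  # plain ASCII is copied through unchanged
        else:
            pieces.append(f"&#{code_point};")  # e.g. 228 -> "&#228;"
    return "".join(pieces)


if __name__ == "__main__":
    print(ansi_to_html_entities(b"K\xe4se"))            # byte 228 under cp1252
    print(ansi_to_html_entities(b"\xe4", "cp1251"))     # same byte, Cyrillic code page
```

With "cp1252", byte 228 maps to code point 228, but byte 128 maps to 8364 (the euro sign), and with "cp1251" byte 228 maps to 1076, which is why copying the byte value straight into the reference only works by coincidence. Bytes that the chosen code page leaves undefined raise UnicodeDecodeError here, and HTML-special characters such as "<" and "&" are not escaped by this sketch.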