November 20th, 2012, 01:29 PM
Yet another Unicode v Ansi question
Working in a Win32 console app (VS 2010), I have been trying to convert several Unicode (UTF-16) C++ functions to ANSI C (UTF-8). The test app includes two tokenizer classes, CTokA and CTokW (UTF-8 and UTF-16), each of which works perfectly well in its respective environment.
A problem arises when I attempt to run the UTF-8 functions with the Character Set property set to 'Use Unicode Character Set': std::string manipulations do not perform as expected, e.g.,
gets reproduced as
Attempting to null terminate the string where it is supposed to end simply results in a space in that position and the garbage end persists, e.g.,
If I attempt to change the Character Set property to 'Use Multibyte Character Set' or 'Not Set', the app will not compile and hundreds of errors occur. Of course, I can eliminate all of the UTF-16 code, but it strikes me that it should not be necessary. Perhaps if M$ made everything UTF-16 without all of the necessary decorations like 'L' and '_T(', life would be much simpler. Unfortunately, I have a very extensive UTF-8 app under 10 years of development that works quite well, but my UTF-16 (Unicode) conversion doesn't work as well because of the mixing of pointers (I think), so I have had to revert much of the code back to UTF-8. (All of which has nothing to do with my question but is simply psychotherapeutic for me to ventilate on.)
sline = 0x0000;
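On the null-termination symptom described above: the failing code is not shown, so this is only a guess at one possible cause. A std::string tracks its own length, so writing a '\0' into the middle of it does not shorten it. The zero becomes an ordinary character (a console may render it as a blank) and the tail stays. The two helper names below are hypothetical and exist only to contrast the two behaviours.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: write '\0' at position n, as one would with a C string.
// size() is unchanged, so everything after n is still part of the string.
inline std::string null_at(std::string s, std::size_t n) {
    assert(n < s.size());
    s[n] = '\0';
    return s;
}

// Hypothetical helper: actually shorten the string to its first n characters.
inline std::string truncate_at(std::string s, std::size_t n) {
    assert(n <= s.size());
    s.resize(n);
    return s;
}
```

With an 11-character input such as "hello world", null_at(..., 5) still has size() 11, while truncate_at(..., 5) yields "hello". If the real code is doing something else, this sketch does not apply.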
My question is this: Can UTF-8 and UTF-16 code coexist in a single Win32 console app?
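For what it is worth, narrow and wide strings are separate types in C++ and can live side by side in one program. The Character Set property defines UNICODE/_UNICODE, which controls how TCHAR, _T() and the unsuffixed Win32 names are mapped; it does not change how std::string itself behaves. The usual approach is to convert explicitly at the boundary between the two kinds of code. On Windows that is normally done with MultiByteToWideChar using CP_UTF8. The sketch below is a portable stand-in for that step so the idea can be seen without the Windows headers. The function name utf8_to_utf16 is hypothetical, it uses std::u16string rather than std::wstring, and it is not a full validator.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-8 bytes and re-encode them as UTF-16 code units.
// Malformed or truncated sequences are replaced with U+FFFD; overlong forms are
// not rejected, so treat this as an illustration only.
inline std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0xFFFD;  // default: replacement character
        std::size_t extra = 0;      // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        ++i;
        for (; extra > 0; --extra) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                cp = 0xFFFD;  // truncated or invalid continuation byte
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;  // out of range or a lone surrogate
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP need a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The point is that the UTF-8 side keeps using std::string and char throughout, the UTF-16 side keeps using wide strings, and data crosses between them only through a conversion like this one, never by casting pointers.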
Last edited by Mike Pliam; November 20th, 2012 at 01:42 PM.