-
September 16th, 2003, 04:11 AM
#1
CString
Hi all
CString aStr = "中文"; // receives a Chinese sentence
int aLength = aStr.GetLength(); // it returns 4, why not 2?
I need to work on strings character by character, so I assign each character to a "UINT" data type
for (int i = 0; i < aLength; i++)
{
UINT a = aStr.GetAt(i); // cannot assign a Chinese character to a UINT
...................
}
How can I get a Chinese character from a CString?
Last edited by wow9999; September 16th, 2003 at 04:15 AM.
-
September 16th, 2003, 04:15 AM
#2
Make your program Unicode.
CString characters are TCHARs, which are defined as char or wchar_t depending on whether your program is built as Unicode or not.
You need wchar_t strings because each of these Chinese characters fits in a single 16-bit wchar_t.
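A minimal sketch of what that looks like, using std::wstring as a portable stand-in for a Unicode-build CString. The helper name ToCodeUnits is made up for illustration. It copies each wchar_t into an unsigned int, which is roughly what the loop in the question was trying to do with UINT.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies every wchar_t of a wide string into an
// unsigned int, one element per character. With 16-bit wchar_t (Windows)
// this holds for characters in the Basic Multilingual Plane.
std::vector<unsigned int> ToCodeUnits(const std::wstring& text)
{
    std::vector<unsigned int> units;
    units.reserve(text.size());
    for (std::wstring::size_type i = 0; i < text.size(); ++i) {
        units.push_back(static_cast<unsigned int>(text[i]));
    }
    return units;
}
```

Called with the two-character wide literal L"\u4E2D\u6587", the returned vector has two elements, one per Chinese character. In an MFC Unicode build, GetLength() and GetAt() on the CString should behave the same way for such a string.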
-
September 16th, 2003, 04:18 AM
#3
Hi
Hi,
That's due to the double-byte situation with Chinese characters.
In MBCS (MultiByte Character Set), each Chinese character takes 2 bytes.
They are the "lead byte" and the "trail byte". A plain ASCII string
such as HELLO still takes one byte per letter in MBCS. The layout
H \0 E \0 L \0 L \0 O \0
with a zero byte after each letter is what the same string looks like
in a Unicode (UTF-16) build, not in MBCS.
You have to be very careful when using them. Search for
MBCS in MSDN or, as you are using Chinese, I'd suggest
www.csdn.net. Hope this helps.
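If the program stays non-Unicode, the string can still be walked one character at a time by checking for lead bytes. Here is a rough sketch, not a definitive implementation. The names IsLeadByteSketch and SplitDbcs are made up, and the lead-byte range 0x81..0xFE is an assumption that fits Big5-style code pages. On Windows, IsDBCSLeadByte() or the CRT's _mbsinc() would answer this for the real code page instead.

```cpp
#include <string>
#include <vector>

// Assumption: lead bytes fall in 0x81..0xFE (Big5-style code page).
// On Windows, IsDBCSLeadByte() checks the active code page instead.
bool IsLeadByteSketch(unsigned char byte)
{
    return byte >= 0x81 && byte <= 0xFE;
}

// Hypothetical helper: splits a double-byte string into characters.
// A two-byte character is packed as (lead << 8) | trail; a single-byte
// character is stored as its byte value.
std::vector<unsigned int> SplitDbcs(const std::string& bytes)
{
    std::vector<unsigned int> characters;
    std::string::size_type i = 0;
    while (i < bytes.size()) {
        unsigned char first = static_cast<unsigned char>(bytes[i]);
        if (IsLeadByteSketch(first) && i + 1 < bytes.size()) {
            unsigned char second = static_cast<unsigned char>(bytes[i + 1]);
            characters.push_back((static_cast<unsigned int>(first) << 8) | second);
            i += 2;  // skip the lead byte and its trail byte
        } else {
            characters.push_back(first);  // ASCII or a truncated lead byte
            i += 1;
        }
    }
    return characters;
}
```

For the four bytes A4 A4 A4 E5 this gives two values, so the size of the returned vector is the character count, while the byte count stays 4.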
-
September 16th, 2003, 04:19 AM
#4
Thanks for your reply.
My program is not Unicode.
Do I have to change my program to Unicode?
Thanks
-
September 16th, 2003, 04:22 AM
#5
I don't quite follow your problem. Your example code clearly shows 4 bytes. Perhaps your problem has to do with the fact that Chinese characters are not ASCII: in a non-Unicode build each one is stored as 2 bytes. In such a build, the GetLength() method of the CString class returns the number of bytes, not the number of characters in the string.
To get around this, define _UNICODE in your build settings; that way the CString class will count 16-bit characters (Unicode) rather than 8-bit ones.
-
September 16th, 2003, 04:30 AM
#6
Since your program is not Unicode, the return value is correct. Each double-byte MBCS character is counted as two characters; note that purely Chinese characters do not contain null bytes.
If you make your program Unicode, the return value of GetLength() will be 2. MSDN is confusing on this point.
Depending on what you are going to do with your Chinese strings, a good choice is to make your program Unicode. Think about whether you really need to work with CString. However, Unicode implies that your program 'will not' work under W9x systems.