-
February 2nd, 2009, 09:10 AM
#1
Unsigned char to wstring
Windows gives me an UCHAR array (UCHAR Chemistry[4]) which I want to convert to a wstring.
A while back I recall reading a post by Paul McKenzie on how to convert from string to wstring.
So I tried the following:
Code:
std::wstring CBattery::GetChemistry()
{
    std::string chemis(reinterpret_cast<char*>(_batInfo.Chemistry));
    _chemistry.assign(chemis.length(), L' ');
    std::copy(chemis.begin(), chemis.end(), _chemistry.begin());
    return _chemistry;
}
also
Code:
std::wstring CBattery::GetChemistry()
{
    std::stringstream ss;
    ss << _batInfo.Chemistry;
    std::string chemis = ss.str();
    _chemistry.assign(chemis.length(), L' ');
    std::copy(chemis.begin(), chemis.end(), _chemistry.begin());
    return _chemistry;
}
Neither of these works correctly. When I use the wstring it shows the 4 characters correctly, but it also includes some garbage after them.
Unfortunately I can't debug the program since it calls battery related API functions and I don't have debugging tools installed on the laptop available to me.
Can anyone help me in going from unsigned char to wstring?
Edit: I looked at the MSDN page for the BATTERY_INFORMATION structure again. With regard to the Chemistry member they say "This string is not necessarily zero-terminated". This is probably my problem. How do I handle this? Do I need to create a new array with an extra element for the null?
Last edited by links; February 2nd, 2009 at 09:18 AM.
-
February 2nd, 2009, 09:40 AM
#2
Re: Unsigned char to wstring
What you need is a function that transcodes from UTF-8 to UTF-16LE. (ASCII is a subset of UTF-8.)
That's something you certainly could write yourself if you wanted; it's not very hard. However, you might try this first:
http://msdn.microsoft.com/en-us/libr...cy(VS.80).aspx
Note that you won't be able to write to a wstring directly, but you could use a vector<wchar_t> if you wanted.
-
February 2nd, 2009, 09:43 AM
#3
Re: Unsigned char to wstring
Of course you can test. You create a simple console application with only that code. You provide the input hard-coded (instead of taking it from Windows) and you debug through it.
If you want to convert from ASCII to Unicode I suggest you read this FAQ: http://www.codeguru.com/forum/showthread.php?t=231165
-
February 2nd, 2009, 10:04 AM
#4
Re: Unsigned char to wstring
The problem is that BATTERY_INFORMATION::Chemistry is not a null-terminated C-string. You don't have to worry about encoding since it should only contain basic ASCII characters.
Code:
_chemistry.clear();
UCHAR *p = _batInfo.Chemistry,
      *p_end = p + 4;
for (; p < p_end && *p; ++p)
    _chemistry += (wchar_t)*p;
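The same idea can also be written without an explicit loop. This is only a minimal sketch, assuming a 4-byte field like Chemistry that holds plain ASCII; the helper name ChemistryToWide is made up for illustration and is not part of the CBattery class from the first post:

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper (not from the thread): converts a 4-byte field that is
// "not necessarily zero-terminated" into a std::wstring.
// Assumption: every byte is plain ASCII, so widening each byte is enough.
std::wstring ChemistryToWide(const unsigned char (&chemistry)[4])
{
    const unsigned char* first = std::begin(chemistry);
    // Logical end of the text: the first zero byte, or the end of the array
    // when all 4 bytes are used and there is no terminator.
    const unsigned char* last =
        std::find(first, std::end(chemistry), static_cast<unsigned char>(0));
    // The iterator-range constructor converts each byte to wchar_t.
    return std::wstring(first, last);
}
```

Since UCHAR is a typedef for unsigned char, this could presumably be called as ChemistryToWide(_batInfo.Chemistry). It never reads past the 4 bytes, so no temporary buffer or extra null is needed.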
gg
-
February 2nd, 2009, 01:39 PM
#5
Re: Unsigned char to wstring
Thanks for all the replies guys.
Before I saw Codeplug's reply I came up with the following and it seems to work.
Code:
std::wstring CBattery::GetChemistry()
{
    char temp[5] = {0};
    memcpy(&temp[0], &_batInfo.Chemistry[0], sizeof(_batInfo.Chemistry)/sizeof(UCHAR));
    std::string chemis = temp;
    _chemistry.assign(chemis.length(), L' ');
    std::copy(chemis.begin(), chemis.end(), _chemistry.begin());
    return _chemistry;
}
Later on I output _chemistry in a ListBox control using wstr.c_str().
Before getting to the whole ASCII, Unicode and code pages thing, I want to know if the above code is okay (e.g. safe) given that Chemistry[4] will always be ASCII? It works for me now, but I'm afraid that I could be missing something or doing something improper that can lead to undefined behaviour.
Now the whole deal with Unicode and code pages isn't very clear to me, so I want to ask a few questions / make a few statements to better my understanding.
Are characters stored in wchar_t (and by definition std::wstring as well) always seen as Unicode (UTF-16) on Windows?
If I understand correctly, the reason you cannot blindly convert from a char string to a wchar_t string is because of different code pages. For instance, a value such as 200 can map to one character in one code page and a completely different one in another. Is my understanding correct?
And the whole code page issue is the reason for functions such as MultiByteToWideChar(), right?
-
February 2nd, 2009, 02:15 PM
#6
Re: Unsigned char to wstring
>> I want to know if the above code is okay
Yes, you are providing the null-terminator now since temp[4] == 0 and you're copying 4 bytes into temp[0 to 3].
>> ... as Unicode (UTF-16) on Windows?
Yes. On Windows, a wchar_t holds a UTF16LE code unit. I say "code unit" instead of "character" because it may take two wchar_t's (a surrogate pair) to represent a single Unicode code point.
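To make the "two wchar_t's" case concrete, here is a rough sketch of how one code point is split into UTF-16 code units. The function name EncodeUtf16 is invented for this example, and it uses char16_t rather than wchar_t so that it behaves the same on platforms where wchar_t is not 16 bits wide:

```cpp
#include <string>

// Illustrative only: encodes one Unicode code point as UTF-16 code units.
// Assumption: cp is a valid scalar value (not a surrogate, <= 0x10FFFF).
std::u16string EncodeUtf16(char32_t cp)
{
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: one code unit is enough.
        out += static_cast<char16_t>(cp);
    } else {
        // Supplementary planes: split the 20-bit offset into a surrogate pair.
        cp -= 0x10000;
        out += static_cast<char16_t>(0xD800 + (cp >> 10));   // high surrogate
        out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF)); // low surrogate
    }
    return out;
}
```

With this, 'A' (U+0041) comes out as a single unit, while U+1F50B comes out as the pair 0xD83D 0xDD0B. None of this matters for the Chemistry field, which stays well inside ASCII.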
>> the reason you cannot blindly convert from a char string to a wchar_t string is because of different code pages.
Correct. But for ASCII characters up to 0x7F (127), all code pages and UTF encodings use the same value/character-glyph mapping. So you shouldn't have to worry about this for BATTERY_INFORMATION::Chemistry characters.
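If you would rather not rely on that assumption silently, a check along these lines makes it explicit. This is just a sketch; WidenAscii is a made-up name, and throwing is only one possible way to report a non-ASCII byte:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: widens a narrow string byte by byte, but only when
// every byte is in the ASCII range (0x00 to 0x7F), where code pages agree.
std::wstring WidenAscii(const std::string& narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char byte : narrow) {
        if (byte > 0x7F) {
            // Outside ASCII the meaning depends on the code page, so a
            // code-page-aware conversion would be needed instead.
            throw std::invalid_argument("non-ASCII byte in input");
        }
        wide += static_cast<wchar_t>(byte);
    }
    return wide;
}
```

Going through unsigned char also avoids a subtle trap: where char is signed, copying a byte above 0x7F straight from char to wchar_t would convert a negative value rather than the byte value you expect.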
>> And the whole code page issue is the reason for functions such as MultiByteToWideChar()
Yeah. It will take you from "code page" to UTF16LE. It can also be used to decode UTF8 to UTF16LE.
gg