-
June 23rd, 2013, 07:27 PM
#1
How to set low and high bytes of a wchar_t ?
I have a method for converting a wide char array to an unsigned char array. Now I need to convert back (from an unsigned char array to a wide char array) but cannot figure out how to do that.
Here's my wcstoucs method, which uses built-in macros (BTW, where do these come from? Are they unique to the Windows API or are they part of the C standard?):
Code:
#include <windows.h> // provides the LOBYTE and HIBYTE macros

// Copy nsz wide chars into uc as byte pairs; uc must hold 2 * nsz bytes.
int wcstoucs(wchar_t wcs[], int nsz, unsigned char uc[] )
{
    //uc = new unsigned char [ 2 * nsz + 1 ];
    //memset(uc, 0x00, 2 * nsz + 1);
    for (int wdx = 0; wdx < nsz; wdx++)
    {
        wchar_t wch = wcs[wdx];
        //uc[2 * wdx]     = LOBYTE(wch); // low byte first: little-endian (x86)
        //uc[2 * wdx + 1] = HIBYTE(wch);
        uc[2 * wdx]     = HIBYTE(wch); // high byte first: big-endian
        uc[2 * wdx + 1] = LOBYTE(wch);
    }
    return 2 * nsz;
}// wcstoucs(wchar_t wcs[], int nsz, unsigned char uc[] )
And here's a method that only depends upon what I am certain is native 'C':
Code:
// given a wchar_t get the low and the high order bytes
wchar_t wch;
unsigned char lobyte, hibyte;
wch = 0xABCD;
lobyte = wch & 0xff;        // keep only the low 8 bits
hibyte = (wch >> 8) & 0xff; // mask as well, since wchar_t may be wider than 16 bits
printf("wch = %04X\n", (unsigned)wch); // ABCD
printf("lobyte =: %02X\n", lobyte); // CD
printf("hibyte =: %02X\n", hibyte); // AB
But we run into an lvalue problem if we try to invert the operation:
Code:
wchar_t wch = 0x0000;
unsigned char ucb = 0x94;
// set the low byte
LOBYTE(wch) = ucb; // Error: Expression must be a modifiable lvalue
So how do I set the high and low bytes of a wchar_t?
Last edited by Mike Pliam; June 24th, 2013 at 01:28 AM.
mpliam
-
June 24th, 2013, 06:08 AM
#2
Re: How to set low and high bytes of a wchar_t ?
Code:
#include <wchar.h>
#include <stdio.h>

int main()
{
    wchar_t wct;
    unsigned char lb, ub;

    lb = 0x17; // low byte
    ub = 0x15; // high byte
    wct = (ub << 8) + lb; // shift the high byte up, then add in the low byte
    printf("0x%04x", (unsigned)wct);
    return 0;
}
This prints 0x1517. I hope this is what you wanted.
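For the narrower case of changing just one byte of an existing wchar_t (the LOBYTE(wch) = ucb attempt), and for the reverse array conversion, something along these lines should work. This is only a sketch: the names set_lobyte, set_hibyte and ucstowcs are made up for illustration, and it assumes the bytes were written high byte first, as wcstoucs in post #1 does.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Return wch with its low 8 bits replaced by lo; all other bits are kept. */
wchar_t set_lobyte(wchar_t wch, unsigned char lo)
{
    return (wchar_t)(((unsigned long)wch & ~0xFFUL) | lo);
}

/* Return wch with bits 8..15 replaced by hi; all other bits are kept. */
wchar_t set_hibyte(wchar_t wch, unsigned char hi)
{
    return (wchar_t)(((unsigned long)wch & ~0xFF00UL) | ((unsigned long)hi << 8));
}

/* Hypothetical inverse of wcstoucs(): rebuild nbytes / 2 wide characters
   from byte pairs stored high byte first. Returns the wide char count. */
size_t ucstowcs(const unsigned char uc[], size_t nbytes, wchar_t wcs[])
{
    assert(nbytes % 2 == 0); /* a trailing odd byte cannot form a pair */
    size_t n = nbytes / 2;
    for (size_t i = 0; i < n; i++) {
        wcs[i] = (wchar_t)(((unsigned)uc[2 * i] << 8) | uc[2 * i + 1]);
    }
    return n;
}
```

With wch = 0x0000, set_lobyte(wch, 0x94) gives 0x0094 and leaves the upper bits alone. The macros cannot be assigned to because they expand to a value, not to a storage location, so the byte has to be merged back in with a mask and an OR.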
-
June 24th, 2013, 12:12 PM
#3
Re: How to set low and high bytes of a wchar_t ?
>> This prints 0x1517. I hope this is what you wanted.
Exactly! Thanks very much.
mpliam
-
June 24th, 2013, 02:47 PM
#4
Re: How to set low and high bytes of a wchar_t ?
Originally Posted by Mike Pliam
Exactly! Thanks very much.
Are you trying to re-invent the MultiByteToWideChar function?
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
-
June 24th, 2013, 05:20 PM
#5
Re: How to set low and high bytes of a wchar_t ?
>> And here's a method that only depends upon what I am certain is native 'C'
The size of wchar_t is implementation-defined. On most *nix systems it is 4 bytes as UTF32 in native byte order. On all Windows platforms it's 2 bytes as UTF16LE.
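One way to avoid depending on the width of wchar_t is to do the byte packing in a fixed-width type and only narrow from wchar_t at the edges. A rough sketch, where wchar_bits, make_u16 and to_u16 are illustrative names and not part of any library:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Number of bits in wchar_t on the current implementation. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Pack two bytes into exactly one 16-bit unit, whatever wchar_t is. */
uint16_t make_u16(unsigned char hi, unsigned char lo)
{
    return (uint16_t)(((unsigned)hi << 8) | lo);
}

/* Narrow a wide character to a 16-bit unit; asserts that no bits are lost. */
uint16_t to_u16(wchar_t wch)
{
    assert((unsigned long)wch <= 0xFFFFUL);
    return (uint16_t)wch;
}
```

The assert in to_u16 fires on a 4-byte wchar_t that holds a value above 0xFFFF, which is exactly the case where "low byte" and "high byte" stop describing the whole character.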
>> wcstoucs
That name confused me because UCS is an encoding.
gg
-
June 25th, 2013, 06:53 AM
#6
Re: How to set low and high bytes of a wchar_t ?
Originally Posted by VladimirF
Doesn't look like it, since he's not converting anything codepage-wise...
It looks more like he's trying to reinvent a typecast from a wchar_t* to a char*, but doing it by copying rather than casting the buffer.
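To make the comparison concrete, here is roughly what the copy-the-buffer version looks like. It is a sketch with made-up names (wcs_raw_bytes, is_little_endian). Note that it yields the bytes in the machine's native order and sizeof(wchar_t) bytes per character, which is not necessarily the order or count an explicit HIBYTE/LOBYTE loop produces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Copy the in-memory bytes of n wide characters into out.
   out must have room for n * sizeof(wchar_t) bytes. Returns the byte count. */
size_t wcs_raw_bytes(const wchar_t *wcs, size_t n, unsigned char *out)
{
    assert(wcs != NULL && out != NULL);
    size_t nbytes = n * sizeof(wchar_t);
    memcpy(out, wcs, nbytes);
    return nbytes;
}

/* Returns 1 if the least significant byte is stored first (little-endian). */
int is_little_endian(void)
{
    unsigned int probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On x86 the first byte copied from 0x1517 is 0x17, the low byte. A loop that writes the high byte first gives the opposite order, so the two approaches only agree when the chosen byte order matches the machine's.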
-
June 25th, 2013, 07:00 AM
#7
Re: How to set low and high bytes of a wchar_t ?
Originally Posted by Codeplug
On all Windows platforms it's 2 bytes as UTF16LE.
This isn't correct.
On Windows NT it's UCS2.
On Windows XP it's technically UTF16, but none of the fonts support a codepoint above 0xFFFF.
Win95/98/ME have only partial support for the wide character APIs, which were then also UCS2.
When dealing with networks/file systems, UTF16 (even UCS2) has been flaky in the past; it wasn't until recently that a lot of the issues got cleaned up.
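The practical difference between UCS2 and UTF16 is what happens above 0xFFFF: UCS2 has no way to represent those codepoints, while UTF16 splits them into two 16-bit units (a surrogate pair). A small sketch of that split, with utf16_encode being an illustrative name and not a Windows API:

```c
#include <assert.h>
#include <stdint.h>

/* Encode one Unicode codepoint as UTF-16. Writes 1 or 2 units to out
   and returns how many were written. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* valid codepoints only: in range and not a lone surrogate */
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp; /* fits in one unit, same as UCS2 */
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

For example, U+1F600 comes out as the two units 0xD83D 0xDE00, so on a 2-byte wchar_t it occupies two array elements and four bytes.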
-
June 25th, 2013, 09:05 AM
#8
Re: How to set low and high bytes of a wchar_t ?
>> This isn't correct.
I'm sure readers can google if they are interested in the level-of-support for surrogate pairs in each historical platform - and even then support varies among the modules/apps in each platform:
http://www.i18nguy.com/surrogates.html
http://msdn.microsoft.com/en-us/goglobal/bb688099.aspx
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
In the end, it's just easier to say "Windows is UTF16LE".
gg