October 26th, 2006, 08:24 AM
Visual C++ General: How to use different character sets?
Q: I have this simple function call:
MessageBox(NULL, "Test message", "Title", MB_OK);
The compiler raises the following error and I don't understand why.
error C2664: 'MessageBoxW' : cannot convert parameter 2 from 'const char [13]' to 'LPCWSTR'
Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast
A: Simply answered, that happens because the project is built for UNICODE.
The Microsoft run-time library provides Microsoft-specific generic-text mappings for many data types, routines and other objects; these mappings are defined in TCHAR.h. There are three supported character sets:
- ASCII (single-byte character set – SBCS)
- MBCS (multi-byte character set)
- Unicode (wide characters)
Which character set is used is controlled by two preprocessor symbols:
- _UNICODE: if defined, Unicode is the character set used
- _MBCS: if defined, MBCS is used
- If neither of the above (mutually-exclusive) is defined, ASCII is the character set used
The Windows API provides different versions of each function for Unicode and ASCII.
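The selection rules above can be checked at compile time. The following is only a minimal sketch: the helper name active_character_set is hypothetical and not part of any Microsoft header; it simply tests the two symbols in the order the list describes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from TCHAR.h): reports which character set
// the preprocessor symbols select, following the three rules above.
inline std::string active_character_set() {
#if defined(_UNICODE)
    return "Unicode";   // _UNICODE is defined
#elif defined(_MBCS)
    return "MBCS";      // _MBCS is defined
#else
    return "ASCII";     // neither symbol is defined
#endif
}
```

Calling the helper from a build of each configuration is a quick way to confirm which symbols the project settings actually define.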
Q: How do I select the character set?
A: You have to go to Project Properties > Configuration Properties > General and change the value of the Character Set property. The three available options are:
- Not Set (neither _UNICODE nor _MBCS is defined)
- Use Multi-byte Character Set (_MBCS is defined)
- Use Unicode Character Set (_UNICODE is defined, together with UNICODE for the Windows headers)
Q: How exactly do the generic-text mapping directives affect the data types and functions that I'm using?
A: Names such as _itot from the C run-time library, or MessageBox from the Windows API, aren't functions at all; they are macros.
The C run-time library provides functions for all character sets, plus a macro that resolves to one of these functions depending on the character set in use. For instance, the macro _itot resolves to:
- _itoa, when _UNICODE is not defined
- _itow, when _UNICODE is defined
Similarly, TCHAR resolves to:
- char, when _UNICODE is not defined
- wchar_t, when _UNICODE is defined
You can read more about the mappings in MSDN.
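To make the mechanism concrete, here is a rough, portable sketch of the same idea. All demo_ names are hypothetical stand-ins (demo_tchar for TCHAR, demo_itoa/demo_itow/demo_itot for _itoa/_itow/_itot), and the conversions use std::to_string and std::to_wstring rather than the real run-time functions, so this shows the pattern only, not the contents of TCHAR.h.

```cpp
#include <cassert>
#include <string>

// Hypothetical narrow and wide versions of one routine.
inline std::string  demo_itoa(int value) { return std::to_string(value); }
inline std::wstring demo_itow(int value) { return std::to_wstring(value); }

// The generic names map to one version or the other at compile time.
#ifdef _UNICODE
typedef wchar_t demo_tchar;       // plays the role of TCHAR
#define demo_itot demo_itow       // plays the role of _itot
#else
typedef char demo_tchar;
#define demo_itot demo_itoa
#endif

// Code written against the generic names compiles in both builds.
inline std::basic_string<demo_tchar> demo_format(int value) {
    return demo_itot(value);
}
```

Because demo_format only uses the generic names, switching the character set changes its return type from std::string to std::wstring without touching its source.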
On the other hand, the Windows API comes in two versions: for Unicode and for ASCII/Multi-byte. If you read the MSDN page for MessageBox it says:
The MessageBox function creates, displays, and operates a message box. The message box contains an application-defined message and title, plus any combination of predefined icons and push buttons.
Actually, MessageBox and LPCTSTR are both macros. You can see how MessageBox is defined in WinUser.h:
WINUSERAPI int WINAPI MessageBoxA(
    __in_opt HWND hWnd,
    __in_opt LPCSTR lpText,
    __in_opt LPCSTR lpCaption,
    __in UINT uType);
WINUSERAPI int WINAPI MessageBoxW(
    __in_opt HWND hWnd,
    __in_opt LPCWSTR lpText,
    __in_opt LPCWSTR lpCaption,
    __in UINT uType);
#ifdef UNICODE
#define MessageBox MessageBoxW
#else
#define MessageBox MessageBoxA
#endif // !UNICODE
There are two versions of the function, actually: MessageBoxA for ASCII and MBCS, and MessageBoxW for Unicode. When UNICODE (the Windows-header counterpart of _UNICODE; the two should be defined together) is defined, MessageBox resolves to MessageBoxW and LPCTSTR to LPCWSTR (i.e. const wchar_t*); otherwise MessageBox resolves to MessageBoxA and LPCTSTR to LPCSTR (i.e. const char*).
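The same A/W pattern can be reproduced in a few lines without the Windows headers. This is only an illustrative sketch: DemoTextLengthA, DemoTextLengthW and DemoTextLength are hypothetical names invented for the example, standing in for a narrow entry point, a wide entry point, and the generic macro that selects between them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical narrow ("A") entry point: takes const char*.
inline std::size_t DemoTextLengthA(const char* text) {
    return std::char_traits<char>::length(text);
}

// Hypothetical wide ("W") entry point: takes const wchar_t*.
inline std::size_t DemoTextLengthW(const wchar_t* text) {
    return std::char_traits<wchar_t>::length(text);
}

// The generic name is a macro that picks one entry point at compile time.
#ifdef UNICODE
#define DemoTextLength DemoTextLengthW
#else
#define DemoTextLength DemoTextLengthA
#endif
```

In a build where UNICODE is defined, a call such as DemoTextLength("Title") fails to compile for the same reason as the MessageBox call in the first question: a narrow literal is passed to the wide version, and const char* does not convert to const wchar_t*.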
Q: How do I write my program so that it builds for any of these character sets without modifying the code when the character set changes?
A: In a single-byte or multi-byte character set, string and character literals are not prefixed by anything ("string", 'c'). Unicode string and character literals, however, require the prefix L, as in L"string" and L'c'. To write literals that build in either configuration, use the Microsoft-specific macros _T() or _TEXT(). The pre-processor strips these macros when _UNICODE is not defined, and replaces them with the L prefix when _UNICODE is defined:
- _UNICODE not defined: _T("string") becomes "string" and _T('c') becomes 'c'
- _UNICODE defined: _T("string") becomes L"string" and _T('c') becomes L'c'
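A macro of this kind can be written with the token-pasting operator. The sketch below uses the hypothetical names DEMO_T and DEMO_T_IMPL so that it does not clash with TCHAR.h; it is an approximation of the technique, not a copy of the Microsoft definition. The extra level of indirection lets a macro argument be expanded before the L prefix is pasted on.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _UNICODE
#define DEMO_T_IMPL(x) L##x   // paste the L prefix onto the literal
#else
#define DEMO_T_IMPL(x) x      // leave the literal untouched
#endif
#define DEMO_T(x) DEMO_T_IMPL(x)

// Size in bytes of one character of a DEMO_T string literal:
// sizeof(char) in a narrow build, sizeof(wchar_t) in a Unicode build.
inline std::size_t demo_char_size() {
    return sizeof(DEMO_T("string")[0]);
}
```

A character literal follows the same rule, so sizeof(DEMO_T('c')) matches demo_char_size() in both configurations.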
Q: How do I fix the line of code mentioned above?
A: It should be clear now:
MessageBox(NULL, _T("Test message"), _T("Title"), MB_OK);
Last edited by cilu; July 30th, 2007 at 04:12 AM.