Windows SDK String: How to convert between ANSI and UNICODE strings?


    Q: How to convert between ANSI and UNICODE strings?

    A:

    The quick and dirty way

    This approach is correct only when the ANSI code page is single-byte and the Unicode string is UCS-2, so that every character occupies exactly one 'char' or one 'WCHAR'. That covers most cases, but if your program must run correctly on Japanese, Chinese, Taiwanese and other systems that use DBCS code pages, use the "correct way" described further below.
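
    To illustrate why the single-byte assumption matters, here is a minimal sketch in portable C++ that counts the characters in a Shift-JIS (code page 932) string. The helper names 'isShiftJisLeadByte' and 'countShiftJisChars' are hypothetical and the lead-byte ranges are hard-coded for illustration only; on Windows you would normally ask the system, for example with 'IsDBCSLeadByteEx()', instead of hand-coding the ranges.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// In Shift-JIS, bytes 0x81-0x9F and 0xE0-0xFC are lead bytes:
// each one starts a two-byte character.
inline bool isShiftJisLeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: counts characters (not bytes) in a Shift-JIS string.
inline std::size_t countShiftJisChars(const std::string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        // A lead byte and the trail byte after it form one character,
        // so skip the trail byte when one is present.
        if (isShiftJisLeadByte(static_cast<unsigned char>(s[i])) && i + 1 < s.size())
            ++i;
        ++count;
    }
    return count;
}
```

    For the byte sequence 0x93 0xFA 0x96 0x7B (two Japanese characters) the string length is 4 bytes, but the helper counts 2 characters. A buffer sized as "one byte per character" is therefore too small when converting such text from Unicode to ANSI, and a wide string sized as "one character per byte" is too long when converting in the other direction.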

    ANSI to UNICODE:

    The conversion is done using the 'MultiByteToWideChar()' function:

    Code:
    const char *ansistr = "Hello";
    int a = lstrlenA(ansistr);
    // SysAllocStringLen allocates a + 1 characters and null-terminates the BSTR
    BSTR unicodestr = ::SysAllocStringLen(NULL, a);
    ::MultiByteToWideChar(CP_ACP, 0, ansistr, a, unicodestr, a);
    //... when done, free the BSTR
    ::SysFreeString(unicodestr);

    UNICODE to ANSI:

    The UNICODE string is usually returned by a COM function, such as this one:

    Code:
    HRESULT SomeCOMFunction(BSTR *bstr)
    {
       *bstr = ::SysAllocString(L"Hello");
       return S_OK;
    }

    The conversion is done using the 'WideCharToMultiByte()' function:

    Code:
    BSTR unicodestr = 0;
    SomeCOMFunction(&unicodestr);
    // one byte per character, plus one for the null terminator
    int a = ::SysStringLen(unicodestr) + 1;
    char *ansistr = new char[a];
    // -1 means the input is null-terminated, so the terminator is converted too
    ::WideCharToMultiByte(CP_ACP,
                            0,
                            unicodestr,
                            -1,
                            ansistr,
                            a,
                            NULL,
                            NULL);
    //...use the strings, then free their memory:
    delete[] ansistr;
    ::SysFreeString(unicodestr);

    The correct way

    If you need to handle DBCS code pages and UTF-16 Unicode strings, do things this way. The idea is to call 'MultiByteToWideChar()' or 'WideCharToMultiByte()' twice: the first call, made with a null output buffer and a buffer size of 0, returns the required length of the result; you then allocate the result string and call the function again to perform the conversion.
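
    The difference between UCS-2 and UTF-16 can be sketched in portable C++ as well. In UTF-16, a character outside the Basic Multilingual Plane is stored as a surrogate pair, that is, two 16-bit code units. The helper name 'countCodePoints' is hypothetical and the code is only an illustration; it assumes a C++11 compiler for 'std::u16string'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Counts Unicode code points in a UTF-16 string: a high surrogate
// (0xD800-0xDBFF) followed by a low surrogate (0xDC00-0xDFFF)
// counts as a single character.
inline std::size_t countCodePoints(const std::u16string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        const bool high = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        if (high && i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            ++i; // skip the low surrogate of the pair
        ++count;
    }
    return count;
}
```

    For the string u"A\U0001D11E" the length is 3 code units, but the helper counts 2 characters. The lengths passed to and returned by the conversion functions are counted in bytes and in wide characters (code units), not in user-visible characters, which is why asking the function for the required size is more reliable than deriving it from the input length.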

    ANSI to Unicode

    Code:
    const char *ansistr = "Hello";
    int lenA = lstrlenA(ansistr);
    int lenW;
    BSTR unicodestr = NULL; // so that SysFreeString is safe if the conversion fails

    // First call: get the required length in wide characters
    lenW = ::MultiByteToWideChar(CP_ACP, 0, ansistr, lenA, 0, 0);
    if (lenW > 0)
    {
      // Second call: allocate the BSTR and convert
      unicodestr = ::SysAllocStringLen(0, lenW);
      ::MultiByteToWideChar(CP_ACP, 0, ansistr, lenA, unicodestr, lenW);
    }
    else
    {
      // handle the error
    }

    // when done, free the BSTR
    ::SysFreeString(unicodestr);

    Unicode to ANSI

    Code:
    BSTR unicodestr = 0;
    char *ansistr = NULL; // so that delete[] is safe if the conversion fails
    SomeCOMFunction(&unicodestr);
    int lenW = ::SysStringLen(unicodestr);
    // First call: get the required length in bytes
    int lenA = ::WideCharToMultiByte(CP_ACP, 0, unicodestr, lenW, 0, 0, NULL, NULL);
    if (lenA > 0)
    {
      ansistr = new char[lenA + 1]; // allocate a final null terminator as well
      ::WideCharToMultiByte(CP_ACP, 0, unicodestr, lenW, ansistr, lenA, NULL, NULL);
      ansistr[lenA] = 0; // Set the null terminator yourself
    }
    else
    {
      // handle the error
    }

    //...use the strings, then free their memory:
    delete[] ansistr;
    ::SysFreeString(unicodestr);

    Last edited by Andreas Masur; July 24th, 2005 at 06:15 AM.
