  1. #1
    Join Date
    Feb 2009
    Location
    Portland, OR
    Posts
    1,488

    How to tell if a unicode TCHAR can be converted to ANSI char?

    At this point I'm calling WideCharToMultiByte(CP_ACP, WC_DEFAULTCHAR, ..., &bUsedDefaultChar) and then checking whether bUsedDefaultChar was set to TRUE, which would mean that the conversion is not possible. That seems like overkill for just one TCHAR, though. Can someone suggest a better way to do it?

  2. #2
    Join Date
    Mar 2003
    Location
    India {Mumbai};
    Posts
    3,871

    Re: How to tell if a unicode TCHAR can be converted to ANSI char?

    I may be wrong, but you can check whether the TCHAR value is below 128 (the 7-bit ASCII range), which means it converts to ANSI unchanged.
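
    7-bit ASCII covers 0x00 through 0x7F inclusive, so a range check of this kind needs `< 128` (or `<= 127`) as its bound. A minimal portable sketch follows. `is_ascii_unit` and `check_ascii_examples` are hypothetical names, and `char16_t` stands in for a 16-bit WCHAR as an assumption so the snippet compiles outside Windows. It only answers "is this ASCII?"; it says nothing about whether a non-ASCII value exists in a given code page.

```cpp
#include <cassert>

// Hypothetical helper: true when a UTF-16 code unit lies in 7-bit ASCII (0x00..0x7F).
// char16_t is used here as a portable stand-in for a 16-bit WCHAR (assumption).
constexpr bool is_ascii_unit(char16_t unit)
{
    return unit < 0x80;
}

// Boundary cases: 0x7F (DEL) is still ASCII, 0x80 is not.
inline void check_ascii_examples()
{
    assert(is_ascii_unit(0x0041));   // 'A'
    assert(is_ascii_unit(0x007F));   // DEL, the last ASCII value
    assert(!is_ascii_unit(0x0080));  // first value past ASCII
    assert(!is_ascii_unit(0x0107));  // LATIN SMALL LETTER C WITH ACUTE
}
```

    A value that fails this check may still be convertible, depending on the target code page, so this can only serve as a fast path in front of a real conversion test.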

  3. #3
    Join Date
    Feb 2009
    Location
    Portland, OR
    Posts
    1,488

    Re: How to tell if a unicode TCHAR can be converted to ANSI char?

    Thanks, but it's not that simple. Some non-English letters can also be converted if the system's default ANSI code page contains them. Unfortunately I can't test that myself on this PC.

    Well, if there's no other way than calling WideCharToMultiByte, is the following acceptable?

    Code:
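    // Returns TRUE if ch maps to the system ANSI code page without falling
    // back to the default character. Assumes a UNICODE build (TCHAR == WCHAR).
    BOOL CheckAcceptableChar(TCHAR ch)
    {
        BOOL bDefaultUsed = TRUE;
        // Size-only query: no output buffer, we only want bDefaultUsed.
        VERIFY(WideCharToMultiByte(CP_ACP, 0, &ch, 1, NULL, 0, NULL, &bDefaultUsed));
        return !bDefaultUsed;
    }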

  4. #4
    Join Date
    Feb 2009
    Location
    Portland, OR
    Posts
    1,488

    Re: How to tell if a unicode TCHAR can be converted to ANSI char?

    Folks, anyone?

  5. #5
    Join Date
    Nov 2003
    Posts
    1,902

    Re: How to tell if a unicode TCHAR can be converted to ANSI char?

    Why do you want to do this? And why one character at a time?

    Do you care whether the conversion is round-trippable (i.e., should it use WC_NO_BEST_FIT_CHARS)?

    Passing a WCHAR (wchar_t) instead of a TCHAR would make more sense. The problem here is that a single WCHAR isn't always a single "character". Two examples: 1) Unicode characters outside the BMP (the BMP being 0x0000 - 0xFFFF) require two WCHARs, a surrogate pair, to represent one character. 2) "Decomposed" Unicode characters. For example, "0x0041 + 0x0308" = "capital A + dieresis", or Ä. When "precomposed", Ä = 0x00C4.
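
    To make those two cases concrete, here is a rough portable sketch that flags code units which cannot stand alone as one character. All three function names are hypothetical, and `char16_t` is assumed as a stand-in for a 16-bit WCHAR. The combining-mark test covers only the Combining Diacritical Marks block (U+0300 - U+036F); other combining characters exist, so treat it as an illustration rather than a complete check.

```cpp
#include <cassert>
#include <cstddef>

// Surrogate code units (0xD800..0xDFFF) are halves of a pair that encodes
// one character outside the BMP.
constexpr bool is_surrogate_unit(char16_t unit)
{
    return unit >= 0xD800 && unit <= 0xDFFF;
}

// Only the Combining Diacritical Marks block is checked (assumption: this is
// enough for illustration; it is not the full set of combining characters).
constexpr bool is_basic_combining_mark(char16_t unit)
{
    return unit >= 0x0300 && unit <= 0x036F;
}

// True when every code unit in [p, p + len) can be treated as one
// self-contained character under the two checks above.
inline bool all_units_standalone(const char16_t* p, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
    {
        if (is_surrogate_unit(p[i]) || is_basic_combining_mark(p[i]))
            return false;
    }
    return true;
}
```

    With this, the precomposed 0x00C4 passes, while the decomposed sequence 0x0041 0x0308 and any surrogate pair are rejected, which is when a per-WCHAR conversion test stops being meaningful.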

    So if you *know* that all Unicode characters are "precomposed", and you will never have Unicode characters outside the BMP - then you could pass in a single WCHAR, representing a single Unicode character. But why one "character" at a time?
    Code:
    #include <windows.h>
    #include <iostream>
    #include <iomanip>
    using namespace std;
    
    BOOL CP_Convertable(const WCHAR *p, UINT len, UINT cp = CP_ACP, BOOL bNoBestFit = TRUE)
    {
        // Ajay Vijay optimization :)
        //  ASCII-compatible code pages and Unicode agree below 128
        if ((len == 1) && (*p < 128))
            return TRUE;
    
        BOOL bDefaultUsed = TRUE;
        DWORD flags = 0;
        
        if (bNoBestFit)
            flags = WC_NO_BEST_FIT_CHARS;
    
        int rc;
        for (;;)
        {
            rc = WideCharToMultiByte(cp, flags, p, len, 0, 0, 0, &bDefaultUsed);
            if (!rc && flags && (GetLastError() == ERROR_INVALID_FLAGS))
            {
                // flags may not be valid for given codepage
                flags = 0;
                continue;
            }//if
            
            break;
        }//for
    
        return rc && !bDefaultUsed;
    }//CP_Convertable
    
    void test_wchar(WCHAR c, UINT cp)
    {
        char old_fill = cout.fill('0');
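        // Cast to unsigned so the value prints as a number; streaming a
        // wchar_t to a narrow stream is a deleted overload in C++20.
        if (CP_Convertable(&c, 1, cp))
        {
            cout << "0x" << setw(4) << hex << static_cast<unsigned>(c) << dec
                 << " is convertible to CP " << cp << endl;
        }//if
        else
        {
            cout << "0x" << setw(4) << hex << static_cast<unsigned>(c) << dec
                 << " is NOT convertible to CP " << cp << endl;
        }//else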
        cout.fill(old_fill);
    }//test_wchar
    
    int main()
    {
        UINT cp = 1250; // test with 1250
        WCHAR w_good = 0x0107; // LATIN SMALL LETTER C WITH ACUTE, 0xE6 in cp1250
        WCHAR w_bad = 0xFF99; // HALFWIDTH KATAKANA LETTER RU, not in cp1250
    
        test_wchar(w_good, cp);
        test_wchar(w_bad, cp);
    
        return 0;
    }//main
