  1. #1
    Join Date
    Mar 2003
    Location
    NY-NJ
    Posts
    37

    Extended ASCII character set

    Folks,

    Does anybody know how to determine whether an extended ASCII character is a valid printable character in a given localized language?

    For the standard ASCII character set, I can do the following:

    setlocale (LC_CTYPE, "language_of_my_choice");

    int ac;

    // init stuff to do here ...

    if (__isascii(ac) && isprint(ac)) {
        // found an ASCII printable character !!!
    }


    The code works fine for the standard ASCII character set, but it doesn't seem to work with extended characters, for a simple reason: in the macro definitions of __isascii() and isprint(), the last possible character is 0x7F, and I need something that goes up to 0xFF.

    Do you know of a function that would tell me that, for example, the character 'ä' (or 'é', etc.) is a valid printable character for, say, the French language?

  2. #2
    Join Date
    Mar 2003
    Location
    NY-NJ
    Posts
    37
    Oops, for some reason the characters "a with a tilde" and "accented e" got converted to Slavic letters. I don't know why, though.

    Anyway, even for these Slavic letters, the standard ASCII routines will fail.

  3. #3
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588
    It really depends on the codepage you are using. There are no "standard" functions to tell you whether a character is printable or not in a given codepage. Depending on what you want to use the codepages for, it might be better to write the program to handle text in Unicode internally and then figure out the printable characters from there (which, unfortunately, is not trivial either, but there is a lot of information on unicode.org).

    The quick and dirty fix would be to get the information about the codepages you want to support (there are quite a few; check, for example, the list of codepages supported by MS Windows). In Windows you can use GetStringTypeEx(W) to determine what kinds of characters your string is made of.

    If you are looking for a more platform-independent solution, check out GNU's libiconv.

    To see the problems that arise, check for example this page. You will see that different codepages have "blanks" in different places, so you can never be really sure whether a character is printable in a given codepage unless your program knows, one way or another, something about that codepage.
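To make the point about "blanks" concrete, here is a minimal table-driven sketch in C. It is only an illustration under stated assumptions, not a complete solution: the names `codepage_id`, `CP_SKETCH_LATIN1`, `CP_SKETCH_WIN1252` and `is_printable_in_codepage` are made up for this example, and only two codepages are covered. It relies on two facts: Windows-1252 leaves the bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D unassigned, while ISO-8859-1 reserves the whole range 0x80-0x9F for control codes.

```c
/* Hypothetical identifiers for this sketch; not part of any library. */
typedef enum { CP_SKETCH_LATIN1, CP_SKETCH_WIN1252 } codepage_id;

/* Returns 1 if byte value c (0..255) is assigned to a printable
   character in the given codepage, 0 otherwise.
   Simplification: NBSP (0xA0) and soft hyphen (0xAD) count as printable. */
int is_printable_in_codepage(unsigned int c, codepage_id cp)
{
    if (c > 0xFF)
        return 0;                 /* not a single-byte value */
    if (c < 0x20 || c == 0x7F)
        return 0;                 /* C0 control codes and DEL */
    if (c < 0x80)
        return 1;                 /* printable ASCII */
    if (c >= 0xA0)
        return 1;                 /* assigned in both codepages */

    /* 0x80..0x9F is where the two codepages disagree. */
    switch (cp) {
    case CP_SKETCH_LATIN1:
        return 0;                 /* C1 control codes */
    case CP_SKETCH_WIN1252:
        /* The five unassigned "blanks" in Windows-1252. */
        return !(c == 0x81 || c == 0x8D || c == 0x8F ||
                 c == 0x90 || c == 0x9D);
    }
    return 0;
}
```

With this sketch, byte 0x80 is rejected for ISO-8859-1 but accepted for Windows-1252 (where it is the euro sign), which is exactly the kind of per-codepage difference described above. Supporting more codepages this way means adding one such rule or lookup table per codepage.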
    Last edited by Yves M; April 14th, 2003 at 05:53 PM.

  4. #4
    Join Date
    Mar 2003
    Location
    NY-NJ
    Posts
    37
    Yves,

    Thanks for your suggestions!

    I am using the MS Dev IDE, so libiconv will not help much, unfortunately. I am a little cautious about WinAPI, though, because I am coding SMTP traps, and they are a little lower-level than WinAPI. Still, I will try the WinAPI solution just to see how it works.

    It's a good starting point, in any case.

  5. #5
    Join Date
    May 2001
    Location
    Germany
    Posts
    1,158
    Instead of the C-style isascii, maybe the C++ standard library is of some help? See the std::ctype facet in <locale>.
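For a C-only code base, a rough counterpart of this suggestion is to drop the `__isascii()` test and let `isprint()` consult the current `LC_CTYPE` locale. The sketch below assumes a single-byte codepage; `is_printable_byte` is a hypothetical helper name, not a library function.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if the byte c is printable according
   to the current LC_CTYPE locale, 0 otherwise. */
int is_printable_byte(char c)
{
    /* The cast matters: where char is signed, bytes above 0x7F are
       negative, and passing a negative value other than EOF to
       isprint() is undefined behavior. */
    return isprint((unsigned char)c) != 0;
}
```

Call `setlocale(LC_CTYPE, ...)` first with a locale name that selects the codepage you care about. Locale names are platform-specific, and whether the runtime's tables match your codepage is something to verify rather than assume. In the default "C" locale, bytes above 0x7F are typically not reported as printable.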

  6. #6
    Join Date
    Mar 2003
    Location
    NY-NJ
    Posts
    37
    Richard,

    I would love to try out your suggestion, but our software is written in C, so there is no STL at my disposal, alas.

    Interesting suggestion, though!