Working with accented characters in C++ (Linux)

**granadajose** · June 5th, 2009, 01:45 AM

Hi,

I am working with strings in a program that I am developing in C++ using Geany (Linux/Ubuntu). The main issue is that I have to work with texts written in Spanish, so they contain characters such as "á", "é" or "ñ".

One of the functions that I need most is to find the ASCII code of each character in a string. I have found a way to do this:

//
char caracteres[128];
string segmento;
int asciicode;

segmento="á, é, ó are characters that can be found in this sentence";
strcpy(caracteres, segmento.c_str());

asciicode=int(caracteres[1]);
//

That is, I copy the string into a char array and then read the ASCII codes from it. The problem is that this does not seem to work correctly with accented characters. For instance, the character "í" seems to be split into two "chars" with two ASCII codes, -61 and -83, when it should be just one code: 237. I think that this is because I get the strings by reading UTF-8 files, but this is something that I cannot change.

Could someone please help me find a way to get the right ASCII code for the characters of a string, even the accented letters?

Many thanks!!!

**_Superman_** · June 5th, 2009, 02:32 AM

You could try _mbscpy instead of strcpy.

**Codeplug** · June 5th, 2009, 08:07 AM

Why do you want the "ASCII code"? What do you want to do with this information?

gg

**Lindley** · June 5th, 2009, 08:36 AM

It would make more sense to talk about the Unicode code point. That's a far less restrictive system, and it corresponds to ASCII between 0 and 127.

The accented characters may be represented by Extended ASCII (128-255), but this range does *not* correspond directly to the corresponding Unicode code point---this is likely your problem.

Either you need to figure out the mapping from those code points to Extended ASCII, or you need to disregard ASCII and just stick to code points all around. I suggest the latter. Of course, as has been asked, what you need it for is important.
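
To illustrate the "stick to code points" option, here is a minimal sketch of decoding a UTF-8 `std::string` into Unicode code points. The helper name `utf8_to_code_points` is hypothetical (it is not from any library or from this thread), and the sketch assumes the input really is UTF-8. It replaces obviously malformed bytes with U+FFFD but does not reject every invalid sequence (overlong encodings, for example), so treat it as a starting point rather than a complete decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-8 bytes into Unicode code points.
// Malformed or truncated sequences become U+FFFD; overlong encodings
// and surrogate code points are NOT rejected in this sketch.
std::vector<std::uint32_t> utf8_to_code_points(const std::string& text) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < text.size()) {
        // Cast to unsigned char first: plain char is often signed, which
        // is why the byte 0xC3 shows up as -61 when converted to int.
        unsigned char lead = static_cast<unsigned char>(text[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte

        // The sequence runs past the end of the string: stop decoding.
        if (i + extra >= text.size()) { out.push_back(0xFFFD); break; }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cont = static_cast<unsigned char>(text[i + k]);
            if ((cont & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (cont & 0x3F);  // append 6 payload bits
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

With this, the two bytes 0xC3 0xAD (the -61 and -83 from the original post) decode to the single value 237, the code point of "í". For the accented Spanish letters, the code point also happens to equal the ISO-8859-1 (Latin-1) byte value.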
