I am working with strings in a program that I am developing in C++ using Geany (Linux/Ubuntu). The main issue is that I have to work with texts written in Spanish, so they contain characters such as "á", "é" or "ñ".
One of the functions that I need most is one that finds the ASCII code of each character in a string. I have found a way to do this:
segmento="á, é, ó are characters that can be found in this sentence";
That is, I convert the string to a char variable and then I get the ASCII codes. The problem is that this does not seem to work correctly with accented characters. For instance, the character "í" seems to be split into two "chars" with two ASCII codes, -61 and -83, when it should be just one code: 237. I think that this is because I read the strings from UTF-8 files, but this is something that I cannot change.
Could someone please help me find a way to get the right ASCII code for the characters of a string, even the accented letters?
Re: Working with accented characters in C++ (Linux)
It would make more sense to talk about the Unicode code point. That's a far less restrictive system, and it corresponds to ASCII between 0 and 127.
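The accented characters may be represented in an "Extended ASCII" encoding (128-255), but there are several such encodings, and in general this range does *not* correspond directly to the Unicode code point (ISO-8859-1 is the exception: there 237 is "í" in both). More importantly, UTF-8 stores every code point above 127 as two or more bytes, so reading the string one char at a time gives you the individual bytes (-61 and -83 are the signed values of 0xC3 and 0xAD), not the code point 237. This is likely your problem.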
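To see where the two negative numbers come from, here is a minimal sketch that prints nothing and only collects the raw byte values of a string as unsigned numbers. The name byte_values is made up for illustration, and the sketch assumes the string already holds UTF-8 bytes (as it would when read from your files).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: return each byte of the string as a value in 0..255.
// Plain char is signed on most Linux compilers, so bytes >= 128 show up as
// negative numbers unless they are converted through unsigned char first.
std::vector<int> byte_values(const std::string& text) {
    std::vector<int> values;
    values.reserve(text.size());
    for (char c : text) {
        values.push_back(static_cast<unsigned char>(c));
    }
    return values;
}
```

For "í" stored as UTF-8 (the bytes 0xC3 0xAD) this gives 195 and 173, which are the same two bytes you saw as -61 and -83. Neither of them is 237; that value only appears once the two bytes are decoded together.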
Either you need to figure out the mapping from those code points to Extended ASCII, or you need to disregard ASCII and just stick to code points all around. I suggest the latter. Of course, as has been asked, what you need it for is important.
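If you go the code point route, a hand-written decoder could look roughly like the sketch below. The name decode_utf8 is my own, not a standard function, and the code assumes the input is UTF-8. It rejects truncated sequences and bad continuation bytes, but it does not check for overlong encodings or surrogates, so treat it as a starting point and not a complete validator.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode a UTF-8 encoded string into Unicode code points.
std::vector<std::uint32_t> decode_utf8(const std::string& text) {
    std::vector<std::uint32_t> code_points;
    std::size_t i = 0;
    while (i < text.size()) {
        // Convert through unsigned char so bytes >= 128 are not negative.
        const unsigned char lead = static_cast<unsigned char>(text[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= text.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char next = static_cast<unsigned char>(text[i + k]);
            if ((next & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (next & 0x3F);
        }
        code_points.push_back(cp);
        i += extra + 1;
    }
    return code_points;
}
```

With this, the two bytes of "í" (0xC3 0xAD) decode to the single value 237, and plain ASCII letters keep their usual codes. Code points up to 255 also happen to match ISO-8859-1, so for Spanish text the results can be used as Latin-1 codes if that is what you need.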