December 26th, 2009, 08:59 PM
How to identify the charset of an MBCS document
In our project we are building an editor. Users can add the charsets the editor supports (e.g. English, S-JIS, EUC, GB2312, BIG5...), and we keep those charsets in a list. Documents saved by the editor are stored as multi-byte strings. What we want is for the editor to identify a document's charset automatically the next time it opens that document.
Our current plan is this: for each charset in the list, call setlocale() with it, then use _mbbtype() to check whether every byte in the document is a SINGLE, LEAD, or TRAIL byte. If any byte is ILLEGAL, we try the next charset. The problem is that _mbbtype() does not behave as we expected: it cannot tell the charsets apart. Whatever parameter we pass to setlocale(), _mbbtype() just reports every byte as OK.
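To make the "try each charset and check lead/trail bytes" idea concrete, here is a minimal sketch in portable C that does the same walk without setlocale() or _mbbtype(). It is a swapped-in technique, not what our editor does today: the byte ranges are hard-coded tables. Everything named here (ByteRange, DbcsSpec, dbcs_specs, dbcs_is_valid, dbcs_first_match) is hypothetical, and the ranges are the commonly documented ones (Big5 uses the extended 0x81-0xFE lead range), so please treat them as assumptions to check against the code pages you really target. EUC is left out because it also has three-byte sequences.

```c
#include <stddef.h>

/* Inclusive byte range; lo > hi marks an unused slot. */
typedef struct { unsigned char lo, hi; } ByteRange;

/* Hypothetical description of a double-byte charset. */
typedef struct {
    const char *name;
    ByteRange single_hi;  /* valid single bytes >= 0x80, if any */
    ByteRange lead[2];    /* bytes that start a two-byte character */
    ByteRange trail[2];   /* bytes allowed as the second byte */
} DbcsSpec;

#define UNUSED {1, 0}

/* Assumed ranges; verify against the real code pages before relying on them. */
const DbcsSpec dbcs_specs[] = {
    {"S-JIS",  {0xA1, 0xDF}, {{0x81, 0x9F}, {0xE0, 0xFC}}, {{0x40, 0x7E}, {0x80, 0xFC}}},
    {"GB2312", UNUSED,       {{0xA1, 0xF7}, UNUSED},       {{0xA1, 0xFE}, UNUSED}},
    {"BIG5",   UNUSED,       {{0x81, 0xFE}, UNUSED},       {{0x40, 0x7E}, {0xA1, 0xFE}}},
};
const size_t dbcs_spec_count = sizeof dbcs_specs / sizeof dbcs_specs[0];

static int in_range(ByteRange r, unsigned char b)
{
    return r.lo <= b && b <= r.hi;
}

static int in_any(const ByteRange r[2], unsigned char b)
{
    return in_range(r[0], b) || in_range(r[1], b);
}

/* Returns 1 if every byte of buf is a legal single, lead, or trail byte
   for the given charset, 0 as soon as an illegal byte is found. */
int dbcs_is_valid(const DbcsSpec *cs, const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        if (b < 0x80 || in_range(cs->single_hi, b)) {
            i += 1;                      /* single-byte character */
        } else if (in_any(cs->lead, b)) {
            /* a lead byte must be followed by a legal trail byte */
            if (i + 1 >= len || !in_any(cs->trail, buf[i + 1]))
                return 0;
            i += 2;
        } else {
            return 0;                    /* illegal byte for this charset */
        }
    }
    return 1;
}

/* Returns the index of the first charset in dbcs_specs that accepts buf,
   or -1 if none does. */
int dbcs_first_match(const unsigned char *buf, size_t len)
{
    size_t k;
    for (k = 0; k < dbcs_spec_count; k++)
        if (dbcs_is_valid(&dbcs_specs[k], buf, len))
            return (int)k;
    return -1;
}
```

One thing this sketch makes visible: the ranges overlap heavily, so the same bytes can be legal in more than one charset. For example, the GB2312 bytes D6 D0 CE C4 are also four legal single-byte characters under the S-JIS table. Validation alone can only rule charsets out, so the order of the list decides ties unless we add some extra heuristic on top.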
Any good ideas?