Unicode CString to HTML characters
I use three languages in my MFC application. The application is compiled as UNICODE. I need to generate HTML with some text from the controls, and I want to create a regular HTML file that contains HTML character references instead of raw Unicode characters. For example, the Russian letter 'А' (U+0410) would be written as &#1040;
Something like this (IDC_EDIT_TXT contains Unicode text):
Code:
CString str;
CStdioFile fileU(_T("c:\\temp\\unicode.html"),CFile::modeCreate|CFile::modeWrite);
GetDlgItemText(IDC_EDIT_TXT,str);
fileU.WriteString(str);
I tried converting the string with this function:
Code:
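char szString[1000];
// With a source length of -1 the returned count includes the terminating NUL, so it is dropped when writing.
int len = WideCharToMultiByte(CP_UTF8, 0, str, -1, szString, sizeof(szString), NULL, NULL);
if (len > 1) fileU.Write(szString, len - 1);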
but the result was a raw UTF-8 byte string, not HTML character references.
Do you know how to convert it to HTML characters?
Re: Unicode CString to HTML characters
Sorry, but what exactly do you understand by "html characters"?
Re: Unicode CString to HTML characters
I mean numeric character references: the Russian 'А' becomes &#1040;
Re: Unicode CString to HTML characters
Not sure if you are looking for something like this:
http://www.codeguru.com/cpp/cpp/cpp_...e.php/c4029__1
Re: Unicode CString to HTML characters
No, I need to convert Unicode characters in HTML content. Your link talks about unsafe characters in a URL.
Re: Unicode CString to HTML characters
If you are trying to convert Unicode characters to ANSI characters, most of them cannot be mapped without data loss. You can instead convert the text to UTF-8 or UTF-16, both of which can be displayed in HTML. Is this what you are looking for?
Re: Unicode CString to HTML characters
Your example will work if you identify your output page as UTF-8. See http://www.cl.cam.ac.uk/~mgk25/unicode.html#web for advice on how to do that.
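For illustration, here is a minimal sketch of that approach: encode the text as UTF-8 and declare the charset in the page itself. The helper names WideToUtf8 and MakeUtf8Page are hypothetical, not MFC or Windows APIs. On Windows, WideCharToMultiByte with CP_UTF8 performs the same conversion; the hand-written encoder is used here only so the example is self-contained. The body text is not HTML-escaped in this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encodes wide text as UTF-8 bytes.
std::string WideToUtf8(const std::wstring& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(text[i]);
        // Where wchar_t holds UTF-16 code units (as on Windows), characters
        // above U+FFFF arrive as a surrogate pair; combine it into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < text.size()) {
            unsigned long low = static_cast<unsigned long>(text[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: wraps the UTF-8 body in a page that declares its charset.
std::string MakeUtf8Page(const std::wstring& bodyText)
{
    return std::string("<html><head>"
           "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"
           "</head><body>") + WideToUtf8(bodyText) + "</body></html>";
}
```

The resulting bytes should be written in binary mode (for example with CFile::Write) so that no further text-mode translation is applied.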
Re: Unicode CString to HTML characters
Quote:
Originally Posted by Andrew Hain
Of course, I added the UTF-8 identifier to the HTML:
Code:
<META http-equiv=Content-Type content="text/html; charset=UTF-8">
But how can I tell MFC to write the character sequence "&#1040;" instead of the Russian 'А'?
Re: Unicode CString to HTML characters
You need to encode it. As far as I know, Microsoft does not provide a built-in C++ library for this, the way JavaScript has its Encode() function or .NET has HttpUtility.HtmlEncode.
For more home-grown solutions, search Google for "C++ encode UTF-8".
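As one possible home-grown approach, here is a minimal sketch that turns every non-ASCII character into a decimal numeric character reference, so the Russian 'А' comes out as "&#1040;". The function name ToHtmlNumericRefs is hypothetical (it is not an MFC or Windows API). Escaping &, <, > and " is an extra assumption on my part, on the guess that the control text is meant to appear literally in the page.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts wide text to ASCII-only HTML text.
// Non-ASCII characters become decimal numeric character references;
// HTML-special ASCII characters are escaped (an assumption, see above).
std::string ToHtmlNumericRefs(const std::wstring& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(text[i]);

        // Where wchar_t holds UTF-16 code units (as on Windows), characters
        // above U+FFFF arrive as a surrogate pair; combine it into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < text.size()) {
            unsigned long low = static_cast<unsigned long>(text[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i; // the low surrogate has been consumed
            }
        }

        switch (cp) {
        case '&': out += "&amp;";  break;
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        default:
            if (cp < 0x80) {
                out += static_cast<char>(cp);   // plain ASCII passes through
            } else {
                out += "&#";
                out += std::to_string(cp);      // decimal code point
                out += ';';
            }
        }
    }
    return out;
}
```

In a Unicode MFC build, the CString from GetDlgItemText could be passed in as std::wstring(str.GetString()), and the returned std::string written with CFile::Write. Because the output is pure ASCII, it displays the same regardless of the charset declared for the page.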