-
June 25th, 2006, 07:38 AM
#1
Unicode CString to HTML characters
I use three languages in my MFC application, which is compiled as UNICODE. I need to generate HTML containing text from the controls, and I want to create a regular HTML file that contains HTML character references instead of raw Unicode characters. For example, the Russian letter 'А' should be written as &#1040;
Something like this (IDC_EDIT_TXT contains Unicode text):
Code:
CString str;
CStdioFile fileU(_T("c:\\temp\\unicode.html"),CFile::modeCreate|CFile::modeWrite);
GetDlgItemText(IDC_EDIT_TXT,str);
fileU.WriteString(str);
I tried to convert the string with this function:
Code:
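char szString[1000];
// With cchWideChar = -1 the returned length includes the terminating NUL.
int len = WideCharToMultiByte(CP_UTF8, 0, str, -1, szString, sizeof(szString), NULL, NULL);
if (len > 1)
    fileU.Write(szString, len - 1); // write the UTF-8 bytes without the NUL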
but the result was a binary Unicode string (raw UTF-8 bytes), not character references.
Do you know how to convert it to HTML characters?
Last edited by hershs; June 26th, 2006 at 02:55 AM.
Reason: Unicode not seen
-
June 25th, 2006, 12:59 PM
#2
Re: Unicode CString to HTML characters
Sorry, but what exactly do you mean by "html characters"?
-
June 25th, 2006, 03:23 PM
#3
Re: Unicode CString to HTML characters
I mean that the Russian 'А' is written as &#1040;
-
June 25th, 2006, 03:33 PM
#4
Re: Unicode CString to HTML characters
Not sure if you are looking for something like this:
http://www.codeguru.com/cpp/cpp/cpp_...e.php/c4029__1
-
June 25th, 2006, 03:39 PM
#5
Re: Unicode CString to HTML characters
No, I need to convert Unicode characters in HTML content. Your link talks about unsafe characters in a URL.
-
June 26th, 2006, 12:43 AM
#6
Re: Unicode CString to HTML characters
If you are trying to convert Unicode characters to ANSI characters, most of them cannot be mapped without data loss. You can instead encode the text as UTF-8 or UTF-16, both of which browsers can display in HTML. Is this what you are looking for?
-
June 26th, 2006, 03:43 AM
#7
Re: Unicode CString to HTML characters
Your example will work if you identify your output page as UTF-8. See http://www.cl.cam.ac.uk/~mgk25/unicode.html#web for advice on how to do that.
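To illustrate why the output looks "binary": UTF-8 stores the Russian 'А' (U+0410, decimal 1040) as the two bytes 0xD0 0x90, and a browser only shows them as 'А' once the page declares its charset. Below is a minimal sketch in portable C++, not MFC code. EncodeUtf8 and BuildUtf8Page are hypothetical helper names made up for this illustration; in the original program WideCharToMultiByte(CP_UTF8, ...) already performs the encoding step.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a Win32/MFC API): encodes one Unicode code
// point as a UTF-8 byte sequence of one to four bytes.
std::string EncodeUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);                          // 1 byte (ASCII)
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));            // 2 bytes
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));           // 3 bytes
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));           // 4 bytes
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: wraps an already UTF-8 encoded body in a page
// that tells the browser which charset to use when decoding the bytes.
std::string BuildUtf8Page(const std::string& utf8Body)
{
    return "<html><head>"
           "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"
           "</head><body>" + utf8Body + "</body></html>";
}
```

For example, BuildUtf8Page(EncodeUtf8(0x410)) yields a page whose body is the two bytes 0xD0 0x90. Without the charset declaration the browser may guess a different encoding and show two unrelated characters.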
-
June 26th, 2006, 05:58 AM
#8
Re: Unicode CString to HTML characters
 Originally Posted by Andrew Hain
Regards,
Ramkrishna Pawar
-
June 26th, 2006, 06:31 AM
#9
Re: Unicode CString to HTML characters
 Originally Posted by Andrew Hain
Of course I added the UTF-8 identifier to the HTML:
Code:
<META http-equiv=Content-Type content="text/html; charset=UTF-8">
But how can I tell MFC to write the characters "&#1040;" instead of the Russian 'А'?
-
June 26th, 2006, 12:06 PM
#10
Re: Unicode CString to HTML characters
You need to encode it yourself. As far as I know, Microsoft does not provide a built-in C++ library for this, comparable to JavaScript's encoding functions or .NET's HttpUtility.HtmlEncode.
For more home-grown solutions, search Google for "C++ encode UTF-8".
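A home-grown encoder of this kind could look roughly like the sketch below. It is plain standard C++ rather than MFC, and ToNumericCharRefs is a hypothetical name, not an existing library function. It assumes the input is UTF-16 (which is what a CString holds in a UNICODE build), replaces every non-ASCII character with a decimal numeric character reference, and also escapes the markup characters &, <, > and ", which is a design choice that only makes sense when the input is plain text rather than ready-made HTML.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not an MFC or Win32 API): converts UTF-16 text to
// ASCII-only HTML text using decimal numeric character references.
std::string ToNumericCharRefs(const std::u16string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        std::uint32_t cp = text[i];

        // Combine a high/low surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < text.size()) {
            std::uint32_t low = text[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        // An unpaired surrogate is not a valid character; substitute U+FFFD.
        if (cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD;

        switch (cp) {
        case '&': out += "&amp;";  break;
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        default:
            if (cp < 0x80)
                out += static_cast<char>(cp);           // plain ASCII stays as-is
            else
                out += "&#" + std::to_string(cp) + ";"; // e.g. U+0410 becomes &#1040;
        }
    }
    return out;
}
```

In the MFC program the input could presumably be built from the CString with something like LPCWSTR p = str; std::u16string text(p, p + str.GetLength()); and the returned std::string written with fileU.Write(out.data(), out.size()). Because the result is pure ASCII, it displays the same regardless of the charset the page declares.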