-
May 16th, 2008, 05:11 AM
#1
XML encoding issue
Hello everyone,
My code always writes UTF-16 in the XML header, even though I set the XML declaration to UTF-8.
The code and its output are below.
My questions:
1. How can I get UTF-8 in the header instead of UTF-16?
2. Is the XML string really UTF-16 encoded or UTF-8 encoded? I think a string in C# is always UTF-16 encoded, so why do we need UTF-8 in the header?
Output:
<?xml version="1.0" encoding="utf-16"?>
<CategoryList a="12345" b="1d5458cd-a070-40cc-a3f4-cf3c394013cc" c="true" />
Code:
using System;
using System.Text;
using System.IO;
using System.Xml;

class Test
{
    public static void Main()
    {
        XmlDocument xmlDoc = new XmlDocument();
        // Write down the XML declaration
        XmlDeclaration xmlDeclaration = xmlDoc.CreateXmlDeclaration("1.0", "utf-8", null);
        // Create the root element
        XmlElement rootNode = xmlDoc.CreateElement("CategoryList");
        xmlDoc.InsertBefore(xmlDeclaration, xmlDoc.DocumentElement);
        // Set attribute name and value!
        rootNode.SetAttribute("a", "12345");
        rootNode.SetAttribute("b", Guid.NewGuid().ToString());
        rootNode.SetAttribute("c", "true");
        xmlDoc.AppendChild(rootNode);
        // Save to the XML file
        StringWriter stream = new StringWriter();
        xmlDoc.Save(stream);
        string content = stream.ToString();
        Console.Write(content);
        return;
    }
}
thanks in advance,
George
-
May 16th, 2008, 07:34 AM
#2
Re: XML encoding issue
1. The documentation of the StringWriter.Encoding property explains why the encoding in the header changed.
Originally Posted by MSDN
This property is necessary for some XML scenarios where a header must be written containing the encoding used by the StringWriter. This allows the XML code to consume an arbitrary StringWriter and generate the correct XML header.
2. You can solve it by using an XmlWriter.
Code:
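// Save the document into an in-memory stream as UTF-8 bytes
MemoryStream stream = new MemoryStream();
// UTF8Encoding(false) writes no byte order mark; with Encoding.UTF8 the
// decoded string below would start with an invisible U+FEFF character
XmlTextWriter writer = new XmlTextWriter(stream, new UTF8Encoding(false));
xmlDoc.Save(writer);
writer.Close();
// Decode the bytes back into a string; the header now says utf-8
string content = Encoding.UTF8.GetString(stream.ToArray());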
3. Whether you need to change the encoding of an XML file depends on the application that will read the file. A common scenario: you write the file in a .NET application and send it to a server, where a Perl script tries to parse it.
Useful or not? Rate my posting. Thanks.
-
May 16th, 2008, 07:56 AM
#3
Re: XML encoding issue
Thanks torrud!
I understand and agree with 2 and 3. For 1, I am confused.
"This property is necessary for some XML scenarios where a header must be written containing the encoding used by the StringWriter".
StringWriter deals with strings in memory, and C# always uses UTF-16 as its internal string/character encoding.
From your reply, the XML encoding matters when we write the XML string to a file or load it from a file. But StringWriter deals purely with in-memory strings and not with files, so why does the encoding setting matter for StringWriter?
regards,
George
-
May 16th, 2008, 08:46 AM
#4
Re: XML encoding issue
Because the StringWriter class is derived from the TextWriter class, StringWriter has to implement the Encoding property. And you created a full XML document, so the header should contain the right encoding. I can't see any problem there. But I do see a problem for the reader if you write UTF-8 in the header and the document is encoded in UTF-16.
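To illustrate the mismatch described above, here is a minimal sketch (not code from this thread; the class and method names are made up for the example). It encodes a string as UTF-16 bytes and then decodes those bytes as UTF-8, which is roughly what a reader does if it trusts a "utf-8" header on a UTF-16 document.

```csharp
using System;
using System.Text;

// Hypothetical demo class, used only for illustration.
class EncodingMismatchDemo
{
    // Decode UTF-16 bytes the way a reader would if it trusted a "utf-8" header.
    public static string DecodeUtf16BytesAsUtf8(string text)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text); // UTF-16 little-endian, no BOM
        return Encoding.UTF8.GetString(utf16Bytes);
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><CategoryList />";

        // The same in-memory string produces different bytes per encoding.
        // For ASCII-only text, UTF-16 needs twice as many bytes as UTF-8.
        Console.WriteLine(Encoding.UTF8.GetBytes(xml).Length);
        Console.WriteLine(Encoding.Unicode.GetBytes(xml).Length);

        // Misreading the UTF-16 bytes as UTF-8 interleaves NUL characters,
        // so the decoded text no longer equals the original string.
        Console.WriteLine(DecodeUtf16BytesAsUtf8(xml) == xml);
    }
}
```

The string in memory has no byte encoding of its own until it is serialized, so the header only has to match the encoding chosen at that point.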
-
May 17th, 2008, 02:04 AM
#5
Re: XML encoding issue
Thanks torrud,
After studying your reply and the related MSDN link, I think the solution should be to set the Encoding property of the StringWriter.
I tried the following code, but it seems the Encoding property cannot be modified. Any ideas?
http://msdn.microsoft.com/en-us/libr...on(VS.80).aspx
Code:
// Save to the XML file
StringWriter stream = new StringWriter();
stream.Encoding = Encoding.UTF8; // error CS0200: Property or indexer 'System.IO.TextWriter.Encoding' cannot be assigned to -- it is read only
xmlDoc.Save(stream);
string content = stream.ToString();
Console.Write(content);
regards,
George
-
May 19th, 2008, 05:55 AM
#6
Re: XML encoding issue
Yes, the Encoding property is read-only. That's why I am using the XmlWriter.
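Since the property is read-only but virtual, another possible workaround is to subclass StringWriter and override Encoding. The sketch below is an assumption about one way to do it, not something posted in this thread; Utf8StringWriter and SaveWithUtf8Header are hypothetical names. It relies on XmlDocument.Save taking the header encoding from the TextWriter it is given, which is the behaviour the MSDN quote above describes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16.
sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

class Utf8HeaderDemo
{
    // Save the document to a string whose XML header declares utf-8.
    public static string SaveWithUtf8Header(XmlDocument doc)
    {
        using (StringWriter writer = new Utf8StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", "utf-8", null));
        doc.AppendChild(doc.CreateElement("CategoryList"));
        Console.WriteLine(SaveWithUtf8Header(doc));
    }
}
```

Note that the resulting string is still UTF-16 in memory; only the header text changes. The header becomes accurate once the string is written out as UTF-8 bytes.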
-
May 20th, 2008, 09:03 AM
#7
Re: XML encoding issue
Thanks torrud,
I found a solution today by using a MemoryStream. It lets us choose the encoding. :-)
regards,
George
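The MemoryStream code itself was not posted, so the following is only a sketch of what such a solution might look like, assuming XmlWriter.Create with an XmlWriterSettings.Encoding of UTF-8. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class, used only for illustration.
class MemoryStreamDemo
{
    // Serialize the document to UTF-8 bytes with a matching utf-8 header.
    public static byte[] SaveAsUtf8Bytes(XmlDocument doc)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false); // UTF-8 without a byte order mark
        settings.Indent = true;

        using (MemoryStream stream = new MemoryStream())
        {
            // Disposing the writer flushes it; by default it leaves the stream open.
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", "utf-8", null));
        XmlElement root = doc.CreateElement("CategoryList");
        root.SetAttribute("a", "12345");
        doc.AppendChild(root);

        byte[] bytes = SaveAsUtf8Bytes(doc);
        // Decode only for display; the bytes are what would go to a file or server.
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```

Returning the bytes rather than a string keeps the declared encoding and the actual encoding in step, which is the point torrud raised earlier in the thread.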