  1. #1
    George2 is offline Elite Member Power Poster
    Join Date
    Oct 2002
    Posts
    4,468

    XML encoding issue

    Hello everyone,


    The code below always outputs UTF-16 in the XML header, even though I set the XML declaration to UTF-8.

    Here is my code and output.

    My questions:

    1. How can I get UTF-8 in the header instead of UTF-16?
    2. Is the XML string really UTF-16 encoded or UTF-8 encoded? I think a string in C# is always UTF-16 encoded, so why do we need UTF-8 in the header?

    Output:
    <?xml version="1.0" encoding="utf-16"?>
    <CategoryList a="12345" b="1d5458cd-a070-40cc-a3f4-cf3c394013cc" c="true" />

    Code:
    using System;
    using System.Text;
    using System.IO;
    using System.Xml;
    
    class Test
    {
        public static void Main()
        {
            XmlDocument xmlDoc = new XmlDocument();
    
            // Create the XML declaration, requesting utf-8
            XmlDeclaration xmlDeclaration = xmlDoc.CreateXmlDeclaration("1.0", "utf-8", null);
            // DocumentElement is still null here, so the declaration is appended to the empty document
            xmlDoc.InsertBefore(xmlDeclaration, xmlDoc.DocumentElement);
    
            // Create the root element and set its attributes
            XmlElement rootNode = xmlDoc.CreateElement("CategoryList");
            rootNode.SetAttribute("a", "12345");
            rootNode.SetAttribute("b", Guid.NewGuid().ToString());
            rootNode.SetAttribute("c", "true");
            xmlDoc.AppendChild(rootNode);
    
            // Save to an in-memory string (not a file); the header comes out as utf-16
            StringWriter stream = new StringWriter();
            xmlDoc.Save(stream);
            string content = stream.ToString();
            Console.Write(content);
    
            return;
        }
    }

    thanks in advance,
    George

  2. #2
    Join Date
    May 2003
    Location
    Germany
    Posts
    936

    Re: XML encoding issue

    1. The documentation for StringWriter.Encoding explains why the encoding in the header changes.
    Quote Originally Posted by MSDN
    This property is necessary for some XML scenarios where a header must be written containing the encoding used by the StringWriter. This allows the XML code to consume an arbitrary StringWriter and generate the correct XML header.
    2. You can solve it by using an XmlWriter.
    Code:
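                // Save to an in-memory stream instead of a StringWriter
                MemoryStream stream = new MemoryStream();
    
                // UTF8Encoding(false) omits the byte order mark; Encoding.UTF8 would
                // leave a U+FEFF character at the start of the decoded string
                XmlTextWriter writer = new XmlTextWriter(stream, new UTF8Encoding(false));
                xmlDoc.Save(writer);
                writer.Close();
                
                // Decode the UTF-8 bytes back into a .NET string
                string content = Encoding.UTF8.GetString(stream.ToArray());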
    3. Whether you need to change the encoding of an XML file depends on the application that will read it. A common scenario: you write the file in a .NET application and send it to a server, where a Perl script tries to parse it.
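
    As a variation on point 2, the same idea can be sketched with XmlWriter.Create and XmlWriterSettings instead of XmlTextWriter. This is only a sketch: the class name XmlUtf8Sketch and the helper SaveAsUtf8 are made up for illustration, and the sample document is a shortened stand-in for the one in the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class XmlUtf8Sketch
{
    // Hypothetical helper: serializes a document to UTF-8 bytes without a byte order mark.
    public static byte[] SaveAsUtf8(XmlDocument doc)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false); // false = do not emit a BOM
        settings.Indent = true;

        using (MemoryStream stream = new MemoryStream())
        {
            // The writer is disposed (and flushed) before the bytes are copied out.
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // Shortened stand-in for the document built in the question.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<CategoryList a=\"12345\" />");

        byte[] bytes = SaveAsUtf8(doc);

        // Decode only for display; the bytes are what would go to a file or socket.
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```

    The byte array is the thing to send or store. Decoding it back into a string, as done here for display, returns it to the in-memory UTF-16 representation.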
    Useful or not? Rate my posting. Thanks.

  3. #3
    George2 is offline Elite Member Power Poster
    Join Date
    Oct 2002
    Posts
    4,468

    Re: XML encoding issue

    Thanks torrud!


    I understand and agree with 2 and 3. For 1, I am confused.

    "This property is necessary for some XML scenarios where a header must be written containing the encoding used by the StringWriter".

    StringWriter deals with an in-memory string, and C# always uses UTF-16 as its internal string/character encoding.

    From your reply, the XML encoding matters when we write the XML string to a file or load it from a file. But StringWriter deals purely with in-memory strings, not files, so why does the encoding setting matter for StringWriter?


    regards,
    George

  4. #4
    Join Date
    May 2003
    Location
    Germany
    Posts
    936

    Re: XML encoding issue

    Because the StringWriter class is derived from the TextWriter class, StringWriter has to implement the Encoding property. You created a full XML document, so the header should contain the right encoding. I don't see any problem there. I do see a problem for the reader, though, if you write UTF-8 in the header and the document is encoded in UTF-16.
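
    To make the distinction concrete, here is a small sketch (the class name and sample text are illustrative only). The same in-memory string turns into different byte sequences depending on the encoding chosen at the moment it is written out, and that is the moment the header has to agree with.

```csharp
using System;
using System.Text;

static class EncodingSizeSketch
{
    public static void Main()
    {
        // Illustrative XML text; in memory, a .NET string is a sequence of UTF-16 code units.
        string xml = "<CategoryList a=\"12345\" />";

        // The encoding only becomes a concrete choice when the string is turned into bytes.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(xml);     // one byte per ASCII character
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(xml); // two bytes per character here

        Console.WriteLine("chars:  " + xml.Length);
        Console.WriteLine("UTF-8:  " + utf8Bytes.Length + " bytes");
        Console.WriteLine("UTF-16: " + utf16Bytes.Length + " bytes");
    }
}
```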

  5. #5
    George2 is offline Elite Member Power Poster
    Join Date
    Oct 2002
    Posts
    4,468

    Re: XML encoding issue

    Thanks torrud,


    After studying your reply and the related MSDN link, I think the solution should be to set the Encoding property of StringWriter.

    I tried the following code, but it seems the Encoding property cannot be modified. Any ideas?

    http://msdn.microsoft.com/en-us/libr...on(VS.80).aspx

    Code:
            // Save to an in-memory string
            StringWriter stream = new StringWriter();
            stream.Encoding  = Encoding.UTF8; // error CS0200: Property or indexer 'System.IO.TextWriter.Encoding' cannot be assigned to -- it is read only
            xmlDoc.Save(stream);
            string content = stream.ToString();
            Console.Write(content);

    regards,
    George

  6. #6
    Join Date
    May 2003
    Location
    Germany
    Posts
    936

    Re: XML encoding issue

    Yes, the Encoding property is read-only. That is why I used the XmlWriter.
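
    Although the property cannot be assigned, it is virtual, so another option is to derive from StringWriter and override it. A minimal sketch, assuming a made-up class name Utf8StringWriter (it is not part of the .NET Framework) and a shortened sample document:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class: a StringWriter that reports UTF-8 as its encoding.
sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

static class Utf8StringWriterSketch
{
    public static void Main()
    {
        // Shortened stand-in for the document built in the question.
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml("<CategoryList a=\"12345\" />");

        StringWriter writer = new Utf8StringWriter();
        xmlDoc.Save(writer);

        // The declaration is generated from writer.Encoding, so it should name utf-8.
        Console.WriteLine(writer.ToString());
    }
}
```

    This only changes what the header says. The resulting string is still UTF-16 in memory, so it must actually be encoded as UTF-8 when it is written to a file or sent over the network, otherwise the header and the bytes disagree.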

  7. #7
    George2 is offline Elite Member Power Poster
    Join Date
    Oct 2002
    Posts
    4,468

    Re: XML encoding issue

    Thanks torrud,


    I found a solution today using MemoryStream, which lets us choose the encoding. :-)

    regards,
    George
