XmlWriter outputs unexpected encoding when printing to console
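While learning the C# XmlWriter class I ran into some odd behavior. When writing to a file it uses UTF-8 as expected, but when writing to the console it uses my system code page (862) instead of UTF-8. I know the console does not support UTF-8, so I am wondering: does XmlWriter default to the system code page when the stream is the console?
Code to reproduce the behavior: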
using System;
using System.Xml;

namespace xmlwriterTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var settings = new XmlWriterSettings
            {
                Indent = true
            };
            using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("users");
                writer.WriteStartElement("user");
                writer.WriteAttributeString("age", "42");
                writer.WriteString("John Doe");
                writer.WriteEndElement();
                writer.WriteStartElement("user");
                writer.WriteAttributeString("age", "39");
                writer.WriteString("Jane Doe");
                // WriteEndDocument closes every element that is still open.
                writer.WriteEndDocument();
                // Close flushes the XML before the blank lines are printed.
                writer.Close();
                Console.WriteLine("\n\n");
            }
        }
    }
}
Expected output:
<?xml version="1.0" encoding="utf-8"?>
Actual output:
<?xml version="1.0" encoding="Codepage - 862"?>
Edit
My question is not how to get the expected output on the console. It is why this happens in the first place.
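Update based on the comments
Why this happens in the first place: the default Encoding of Console.Out is not UTF-8, and XmlWriter takes the encoding for the XML declaration from the TextWriter it is given. I ran a test like this:
1.1 - Default Encoding: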
TextWriter textWriter = Console.Out;
Console.WriteLine(textWriter.Encoding.BodyName);
1.2 - Result (on my machine; the value depends on the console code page):
ibm850
2.1 - Custom Encoding for Console.Out:
Console.OutputEncoding = Encoding.UTF8;
TextWriter textWriter = Console.Out;
Console.WriteLine(textWriter.Encoding.BodyName);
2.2 - Result:
utf-8
Full code:
using System;
using System.Text;
using System.Xml;

XmlWriterSettings settings = new XmlWriterSettings
{
    Indent = true
};
// Must be set before Console.Out is handed to XmlWriter.Create.
Console.OutputEncoding = Encoding.UTF8;
using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
{
    writer.WriteStartDocument();
    writer.WriteStartElement("users");
    writer.WriteStartElement("user");
    writer.WriteAttributeString("age", "42");
    writer.WriteString("John Doe");
    writer.WriteEndElement();
    writer.WriteStartElement("user");
    writer.WriteAttributeString("age", "39");
    writer.WriteString("Jane Doe");
    writer.WriteEndDocument();
    writer.Close();
    Console.WriteLine();
}
Result:
<?xml version="1.0" encoding="utf-8"?>
<users>
  <user age="42">John Doe</user>
  <user age="39">Jane Doe</user>
</users>
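The same mechanism can be seen without the console at all. The following is only a minimal sketch, assuming the declaration is derived from the Encoding property of whatever TextWriter is passed to XmlWriter.Create: a plain StringWriter reports UTF-16, so the declaration names utf-16 even though nothing about the console is involved. The names DeclarationDemo and WriteDeclaration are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DeclarationDemo
{
    // Writes a minimal <users> document to the given StringWriter
    // and returns the text that ended up in it.
    public static string WriteDeclaration(StringWriter target)
    {
        using (XmlWriter writer = XmlWriter.Create(target))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("users");
            writer.WriteEndDocument();
        }
        return target.ToString();
    }

    public static void Main()
    {
        var plain = new StringWriter();
        // StringWriter always reports UTF-16, the in-memory string encoding.
        Console.WriteLine(plain.Encoding.WebName);
        // The declaration written by XmlWriter names that same encoding.
        Console.WriteLine(WriteDeclaration(plain));
    }
}
```

This is also why the older workaround below overrides Encoding on a StringWriter: changing what the writer reports changes what the declaration says.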
Old code
You can create a custom StringWriter that wraps a StringBuilder and overrides the Encoding property, for example:
1 - Custom class:
using System.IO;
using System.Text;

public class EncodedStringWriter : StringWriter
{
    private readonly Encoding _encoding;

    public EncodedStringWriter(StringBuilder sb, Encoding encoding)
        : base(sb)
    {
        _encoding = encoding;
    }

    // XmlWriter reads this property to fill in the XML declaration.
    public override Encoding Encoding => _encoding;
}
2 - Build the XML with the custom StringWriter:
StringBuilder sb = new StringBuilder();
EncodedStringWriter stringWriter = new EncodedStringWriter(sb, Encoding.UTF8);
XmlWriterSettings settings = new XmlWriterSettings
{
    Indent = true
};
using (XmlWriter writer = XmlWriter.Create(stringWriter, settings))
{
    writer.WriteStartDocument();
    writer.WriteStartElement("users");
    writer.WriteStartElement("user");
    writer.WriteAttributeString("age", "42");
    writer.WriteString("John Doe");
    writer.WriteEndElement();
    writer.WriteStartElement("user");
    writer.WriteAttributeString("age", "39");
    writer.WriteString("Jane Doe");
    writer.WriteEndDocument();
    // Close flushes the XML into sb before it is printed.
    writer.Close();
    Console.WriteLine(sb.ToString());
}
Result:
<?xml version="1.0" encoding="utf-8"?>
<users>
  <user age="42">John Doe</user>
  <user age="39">Jane Doe</user>
</users>
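Another option, shown here only as a hedged sketch and not as part of the original answer, is to give XmlWriter a Stream instead of a TextWriter. With a Stream the encoding comes from XmlWriterSettings.Encoding, so the declaration does not depend on Console.Out. The names StreamOutputDemo and WriteUsers are hypothetical. Note that the bytes go to standard output as-is, so non-ASCII characters may still display incorrectly if the console itself is not using UTF-8.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class StreamOutputDemo
{
    // Writes a small <users> document to any Stream as UTF-8.
    public static void WriteUsers(Stream output)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            // UTF8Encoding(false) is UTF-8 without a byte order mark.
            Encoding = new UTF8Encoding(false)
        };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("users");
            writer.WriteStartElement("user");
            writer.WriteAttributeString("age", "42");
            writer.WriteString("John Doe");
            // Closes <user> and <users>.
            writer.WriteEndDocument();
        }
    }

    public static void Main()
    {
        // The raw standard output stream bypasses Console.Out and its Encoding.
        using (Stream stdout = Console.OpenStandardOutput())
        {
            WriteUsers(stdout);
        }
    }
}
```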
Hope this helps.