Conversion of special characters when adding them to XML InnerText in C#
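When writing the inner text I need the hexadecimal character references for special characters, but I cannot get them added. I tried a few encoding conversions, but they did not work.
I need output like
&#x2013;CO&#x2013;OR
instead of "–CO–OR", and
&#x2B;
instead of "+".
The code I tried for the conversion is below.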
else
{
    //convertedStr = System.Net.WebUtility.HtmlDecode(runText);
    Encoding iso = Encoding.Default;
    Encoding utf8 = Encoding.Unicode;
    byte[] utfBytes = utf8.GetBytes(runText);
    byte[] isoBytes = Encoding.Convert(iso, utf8, utfBytes);
    string msg = iso.GetString(isoBytes);
    eqnPartElm = clsGlobal.XMLDoc.CreateElement("inf");
    eqnPartElm.InnerText = msg;
    eqnElm.AppendChild(eqnPartElm);
}
Escaping of Unicode characters is not controlled by XmlDocument. Instead, XmlWriter will escape characters not supported by the current encoding, as specified by XmlWriterSettings.Encoding, at the time the document is written to a stream. If you want all "special characters" such as the en dash to be escaped, choose a very restrictive encoding such as Encoding.ASCII.
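To see that the escaping is decided at write time rather than inside the document, here is a minimal sketch that serializes the same node with two different encodings. The class and method names (EncodingEscapeDemo, Serialize, BuildSample) are made up for this illustration and do not come from the question.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingEscapeDemo
{
    // Hypothetical helper: writes the node through an XmlWriter that targets
    // the given encoding, then reads the bytes back as text.
    public static string Serialize(XmlNode node, Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding, OmitXmlDeclaration = true };
        using var stream = new MemoryStream();
        using (var writer = XmlWriter.Create(stream, settings))
        {
            node.WriteTo(writer);
        }
        stream.Position = 0;
        // StreamReader skips a byte order mark if the encoding wrote one.
        using var reader = new StreamReader(stream, encoding);
        return reader.ReadToEnd();
    }

    // Builds <inf> containing two en dashes (U+2013).
    public static XmlDocument BuildSample()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<inf/>");
        doc.DocumentElement.InnerText = "\u2013CO\u2013OR";
        return doc;
    }
}
```

With the sample document, Serialize(doc, Encoding.UTF8) should leave the en dashes as literal characters, while Serialize(doc, Encoding.ASCII) should produce <inf>&#x2013;CO&#x2013;OR</inf>. The document itself is identical in both calls; only the writer's encoding changes.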
To do this easily, create the following extension methods:
using System.IO;
using System.Text;
using System.Xml;

public static class XmlSerializationHelper
{
    // Serializes the node to a string, escaping characters the chosen encoding cannot represent.
    public static string GetOuterXml(this XmlNode node, bool indent = false, Encoding encoding = null, bool omitXmlDeclaration = false)
    {
        if (node == null)
            return null;
        using var stream = new MemoryStream();
        node.Save(stream, indent : indent, encoding : encoding, omitXmlDeclaration : omitXmlDeclaration, closeOutput : false);
        stream.Position = 0;
        using var reader = new StreamReader(stream);
        return reader.ReadToEnd();
    }

    // Convenience overload that builds the XmlWriterSettings from individual options.
    public static void Save(this XmlNode node, Stream stream, bool indent = false, Encoding encoding = null, bool omitXmlDeclaration = false, bool closeOutput = true) =>
        node.Save(stream, new XmlWriterSettings
        {
            Indent = indent,
            Encoding = encoding,
            OmitXmlDeclaration = omitXmlDeclaration,
            CloseOutput = closeOutput,
        });

    // Writes the node to the stream through an XmlWriter created with the given settings.
    public static void Save(this XmlNode node, Stream stream, XmlWriterSettings settings)
    {
        using (var xmlWriter = XmlWriter.Create(stream, settings))
        {
            node.WriteTo(xmlWriter);
        }
    }
}
Now you can do the following to serialize an XmlDocument to a string with non-ASCII characters escaped:
// Construct your XmlDocument (not shown in the question)
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml("<Root></Root>");
var eqnPartElm = xmlDoc.CreateElement("inf");
xmlDoc.DocumentElement.AppendChild(eqnPartElm);
// Add some non-ASCII text (here – is an En Dash character).
eqnPartElm.InnerText = "–CO–OR";
// Output to XML and escape all non-ASCII characters.
var xml = xmlDoc.GetOuterXml(indent : true, encoding : Encoding.ASCII, omitXmlDeclaration : true);
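One pitfall when producing a string: XmlWriterSettings.Encoding only takes effect when the XmlWriter is created over a Stream or a file name. When it is created over a TextWriter such as a StringWriter, the TextWriter's own encoding (UTF-16 for StringWriter) is used instead, so nothing needs escaping and the en dash is written literally. The sketch below illustrates this; StringWriterPitfall and SerializeViaStringWriter are hypothetical names used only here.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class StringWriterPitfall
{
    // Hypothetical helper for illustration: serializes via a StringWriter instead of a Stream.
    public static string SerializeViaStringWriter(XmlNode node)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,   // has no effect here: the StringWriter's encoding wins
            OmitXmlDeclaration = true,
        };
        using var textWriter = new StringWriter();
        using (var xmlWriter = XmlWriter.Create(textWriter, settings))
        {
            node.WriteTo(xmlWriter);
        }
        return textWriter.ToString();
    }
}
```

For an <inf> element containing "–CO–OR", this returns the text with the en dashes unescaped, which is why going through a MemoryStream is the safer route when the escaped form is required.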
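To serialize to a Stream, do:
// FileMode.Create truncates an existing file; OpenOrCreate would leave stale bytes after shorter output.
using (var stream = new FileStream(fileName, FileMode.Create))
{
    xmlDoc.Save(stream, indent : true, encoding : Encoding.ASCII, omitXmlDeclaration : true);
}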
And the following XML will be created:
<Root>
  <inf>&#x2013;CO&#x2013;OR</inf>
</Root>
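Note that you must use the newer XmlWriter rather than the older XmlTextWriter, because the latter does not support replacing unsupported characters with escaped fallbacks.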
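The question also asks for "+" to be written as a character reference. Because "+" is an ASCII character, the encoding fallback described above will not touch it. One possible approach, sketched here on the assumption that you write the element yourself with an XmlWriter instead of setting InnerText, is XmlWriter.WriteCharEntity, which forces a hexadecimal character reference for a single character. The names CharEntityDemo, WriteElementEscaping and forceEscape are invented for this sketch.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class CharEntityDemo
{
    // Hypothetical helper: writes <name>text</name> as ASCII, forcing a character
    // reference for every character listed in forceEscape (for example "+").
    // forceEscape must not contain surrogate characters; WriteCharEntity rejects them.
    public static string WriteElementEscaping(string name, string text, string forceEscape)
    {
        var settings = new XmlWriterSettings { Encoding = Encoding.ASCII, OmitXmlDeclaration = true };
        using var stream = new MemoryStream();
        using (var writer = XmlWriter.Create(stream, settings))
        {
            writer.WriteStartElement(name);
            var run = new StringBuilder();   // buffers ordinary text between forced escapes
            foreach (char c in text)
            {
                if (forceEscape.IndexOf(c) >= 0)
                {
                    if (run.Length > 0)
                    {
                        writer.WriteString(run.ToString());
                        run.Clear();
                    }
                    writer.WriteCharEntity(c);   // e.g. '+' becomes &#x2B;
                }
                else
                {
                    run.Append(c);
                }
            }
            if (run.Length > 0)
                writer.WriteString(run.ToString());   // non-ASCII here is escaped by the encoding fallback
            writer.WriteEndElement();
        }
        return Encoding.ASCII.GetString(stream.ToArray());
    }
}
```

For example, WriteElementEscaping("inf", "A+B\u2013C", "+") should return <inf>A&#x2B;B&#x2013;C</inf>. Keep in mind that any XML parser treats &#x2B; and a literal + as the same character, so this only matters to consumers that compare the raw text, and the reference will not survive a round trip through XmlDocument.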
Demo fiddle here.