Pass a char[] buffer to an XmlSerializer
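I have XML stored in a char array (char[]), and I have the content length of the data in an int variable. I need to deserialize the data using XmlSerializer.
For performance reasons I need to avoid allocating a string object, because the data is usually >85kb and would generate Gen2 objects.
Is there any way to pass a char[] to XmlSerializer without converting it to a string? It accepts a Stream or a TextReader, but I can't find a way to construct one from a char[].
I'm imagining something like this (except that C# doesn't have a CharArrayStream or CharArrayReader):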
public MyEntity DeserializeXmlDocument(char[] buffer, int contentLength)
{
    using (var stream = new CharArrayStream(buffer, contentLength))
    {
        return _xmlSerializer.Deserialize(stream) as MyEntity;
    }
}
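As further information: we are at the stage of profiling existing code and identifying pain points, so this is not a case of "premature optimization" or an "XY problem".
I adapted the code linked by @György Kőszeg into a CharArrayStream class. So far this works in my tests: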
using System;
using System.IO;

// Read-only stream over the first n chars of a char[].
// Note: each char is narrowed to a single byte when read.
public class CharArrayStream : Stream
{
    private readonly char[] str;
    private readonly int n;

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => n;
    public override long Position { get; set; } // TODO: bounds check

    public CharArrayStream(char[] str, int n)
    {
        this.str = str;
        this.n = n;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        switch (origin)
        {
            case SeekOrigin.Begin:
                Position = offset;
                break;
            case SeekOrigin.Current:
                Position += offset;
                break;
            case SeekOrigin.End:
                // The offset is relative to the end, so it is added (it is normally <= 0).
                Position = Length + offset;
                break;
        }
        return Position;
    }

    // Keeps only the low 8 bits of the char.
    private byte this[int i] => (byte)str[i];

    public override int Read(byte[] buffer, int offset, int count)
    {
        // TODO: bounds check
        var len = Math.Min(count, Length - Position);
        for (int i = 0; i < len; i++)
        {
            buffer[offset++] = this[(int)(Position++)];
        }
        return (int)len;
    }

    public override int ReadByte() => Position >= Length ? -1 : this[(int)Position++];
    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override string ToString() => throw new NotSupportedException();
}
I can use it like this:
public MyEntity DeserializeXmlDocument(char[] buffer, int contentLength)
{
    using (var stream = new CharArrayStream(buffer, contentLength))
    {
        return _xmlSerializer.Deserialize(stream) as MyEntity;
    }
}
Thanks @György Kőszeg!
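A note on Seek in a stream like the one above: the built-in streams add the offset to the length for SeekOrigin.End, so a custom stream would normally follow the same convention. The following is a minimal sketch that checks this against MemoryStream; the class and method names (SeekDemo, PositionFromEnd) are made up for illustration.

```csharp
using System;
using System.IO;

public static class SeekDemo
{
    // Returns the position reached by seeking `offset` bytes relative to the
    // end of a MemoryStream that holds `length` bytes.
    public static long PositionFromEnd(int length, long offset)
    {
        using (var ms = new MemoryStream(new byte[length]))
        {
            // For SeekOrigin.End the new position is Length + offset,
            // so a negative offset moves backwards from the end.
            return ms.Seek(offset, SeekOrigin.End);
        }
    }
}
```

For example, SeekDemo.PositionFromEnd(10, -3) lands on position 7, three bytes before the end.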
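Subclassing TextReader to read from a character array or equivalent is fairly straightforward. Here is a version that takes a ReadOnlyMemory<char>, which can represent a slice of either a string or a char[] character array: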
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public sealed class CharMemoryReader : TextReader
{
    private ReadOnlyMemory<char> chars;
    private int position; // Set to -1 once the reader is closed.

    public CharMemoryReader(ReadOnlyMemory<char> chars)
    {
        this.chars = chars;
        this.position = 0;
    }

    void CheckClosed()
    {
        if (position < 0)
            throw new ObjectDisposedException(null, string.Format("{0} is closed.", ToString()));
    }

    public override void Close() => Dispose(true);

    protected override void Dispose(bool disposing)
    {
        chars = ReadOnlyMemory<char>.Empty;
        position = -1;
        base.Dispose(disposing);
    }

    public override int Peek()
    {
        CheckClosed();
        return position >= chars.Length ? -1 : chars.Span[position];
    }

    public override int Read()
    {
        CheckClosed();
        return position >= chars.Length ? -1 : chars.Span[position++];
    }

    public override int Read(char[] buffer, int index, int count)
    {
        CheckClosed();
        if (buffer == null)
            throw new ArgumentNullException(nameof(buffer));
        if (index < 0)
            throw new ArgumentOutOfRangeException(nameof(index));
        if (count < 0)
            throw new ArgumentOutOfRangeException(nameof(count));
        if (buffer.Length - index < count)
            throw new ArgumentException("buffer.Length - index < count");
        return Read(buffer.AsSpan().Slice(index, count));
    }

    public override int Read(Span<char> buffer)
    {
        CheckClosed();
        // Copy as many of the remaining chars as fit into the destination.
        var nRead = chars.Length - position;
        if (nRead > 0)
        {
            if (nRead > buffer.Length)
                nRead = buffer.Length;
            chars.Span.Slice(position, nRead).CopyTo(buffer);
            position += nRead;
        }
        return nRead;
    }

    public override string ReadToEnd()
    {
        CheckClosed();
        var s = position == 0 ? chars.ToString() : chars.Slice(position, chars.Length - position).ToString();
        position = chars.Length;
        return s;
    }

    public override string ReadLine()
    {
        CheckClosed();
        var span = chars.Span;
        var i = position;
        for ( ; i < span.Length; i++)
        {
            var ch = span[i];
            if (ch == '\r' || ch == '\n')
            {
                var result = span.Slice(position, i - position).ToString();
                position = i + 1;
                // Treat "\r\n" as a single line terminator.
                if (ch == '\r' && position < span.Length && span[position] == '\n')
                    position++;
                return result;
            }
        }
        if (i > position)
        {
            var result = span.Slice(position, i - position).ToString();
            position = i;
            return result;
        }
        return null;
    }

    public override int ReadBlock(char[] buffer, int index, int count) => Read(buffer, index, count);
    public override int ReadBlock(Span<char> buffer) => Read(buffer);

    // The data is already in memory, so the async methods complete synchronously.
    public override Task<String> ReadLineAsync() => Task.FromResult(ReadLine());
    public override Task<String> ReadToEndAsync() => Task.FromResult(ReadToEnd());
    public override Task<int> ReadBlockAsync(char[] buffer, int index, int count) => Task.FromResult(ReadBlock(buffer, index, count));
    public override Task<int> ReadAsync(char[] buffer, int index, int count) => Task.FromResult(Read(buffer, index, count));
    public override ValueTask<int> ReadBlockAsync(Memory<char> buffer, CancellationToken cancellationToken = default) =>
        cancellationToken.IsCancellationRequested ? new ValueTask<int>(Task.FromCanceled<int>(cancellationToken)) : new ValueTask<int>(ReadBlock(buffer.Span));
    public override ValueTask<int> ReadAsync(Memory<char> buffer, CancellationToken cancellationToken = default) =>
        cancellationToken.IsCancellationRequested ? new ValueTask<int>(Task.FromCanceled<int>(cancellationToken)) : new ValueTask<int>(Read(buffer.Span));
}
Then use it with one of the following extension methods:
using System;
using System.Xml.Serialization;

public static partial class XmlSerializationHelper
{
    public static T LoadFromXml<T>(this char[] xml, int contentLength, XmlSerializer serial = null) =>
        new ReadOnlyMemory<char>(xml, 0, contentLength).LoadFromXml<T>(serial);

    public static T LoadFromXml<T>(this ReadOnlyMemory<char> xml, XmlSerializer serial = null)
    {
        serial = serial ?? new XmlSerializer(typeof(T));
        using (var reader = new CharMemoryReader(xml))
            return (T)serial.Deserialize(reader);
    }
}
For example:
var result = buffer.LoadFromXml<MyEntity>(contentLength, _xmlSerializer);
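The reason ReadOnlyMemory<char> suits this problem is that it is a view over existing characters, not a copy, so wrapping the buffer should not add a large allocation. Here is a small sketch of that property; MemoryViewDemo and its method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class MemoryViewDemo
{
    // Wraps the first contentLength chars of the buffer; no chars are copied.
    public static ReadOnlyMemory<char> FromBuffer(char[] buffer, int contentLength) =>
        new ReadOnlyMemory<char>(buffer, 0, contentLength);

    // Wraps an existing string; again no copy is made.
    public static ReadOnlyMemory<char> FromString(string text) => text.AsMemory();

    // Returns true when a later write to the array is visible through the view,
    // which shows that the view shares storage with the array.
    public static bool SharesStorage(char[] buffer, int contentLength)
    {
        var view = FromBuffer(buffer, contentLength);
        var original = buffer[0];
        buffer[0] = (char)(original + 1);
        var shared = view.Span[0] == buffer[0];
        buffer[0] = original; // Restore the caller's data.
        return shared;
    }
}
```

Because the view shares storage with the array, the buffer should not be reused or modified until deserialization has finished.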
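Notes:
A char[] character array has essentially the same content as a UTF-16 encoded memory stream without a BOM, so one could create a custom Stream implementation resembling MemoryStream that represents each char as two bytes, as is done in this answer to How do I generate a stream from a string? by György Kőszeg. However, doing this entirely correctly looks a bit tricky, because getting all the async methods right seems nontrivial.
Having done so, XmlReader would still need to wrap the custom stream with a StreamReader that "decodes" the stream into a sequence of characters, correctly inferring the encoding in the process (which I have observed can sometimes be done incorrectly, e.g. when the encoding stated in the XML declaration does not match the actual encoding).
I chose to create a custom TextReader rather than a custom Stream to avoid the unnecessary decoding step, and because the async implementation appeared less burdensome.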
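Representing each char as a single byte via truncation (e.g. (byte)str[i]) will corrupt XML containing any multibyte characters.
I have not done any performance tuning of the above implementation.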
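To make the difference between the two byte representations concrete, the sketch below contrasts truncating each char to one byte with encoding each char as two UTF-16 bytes. The class and method names (CharEncodingDemo, Truncate, ToUtf16) are invented for this example and are not part of either answer.

```csharp
using System;
using System.Text;

public static class CharEncodingDemo
{
    // Keeps only the low byte of each char, which is what (byte)str[i] does.
    public static byte[] Truncate(char[] chars, int count)
    {
        var bytes = new byte[count];
        for (int i = 0; i < count; i++)
            bytes[i] = unchecked((byte)chars[i]);
        return bytes;
    }

    // Encodes each char as two bytes (UTF-16 little-endian, no BOM).
    public static byte[] ToUtf16(char[] chars, int count) =>
        Encoding.Unicode.GetBytes(chars, 0, count);
}
```

With the input <a>\u4E2D</a>, truncation keeps only the low byte 0x2D of U+4E2D, so the element content silently becomes a hyphen, while the two-byte form round-trips unchanged through Encoding.Unicode.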
Demo fiddle here.