Custom delimiter doesn't work in CsvHelper
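I am using CsvHelper v26.1.0 to read the following text file, which is delimited by ~:
123~John
234~Joe "Public"
However, the double quotes in the file cause CsvHelper to treat the record as bad data. I tested this by removing the double quotes, and parsing then works fine. My question is: since I have already set a custom delimiter, why do the double quotes still cause a problem?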
public class AccountDtoMap : ClassMap<AccountDto>
{
    public AccountDtoMap()
    {
        Map(m => m.Number).Index(0);
        Map(m => m.Name).Index(1);
    }
}
var cfg = new CsvHelper.Configuration.CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = "~",
    HasHeaderRecord = false,
    MissingFieldFound = (context) => { errs.Add($"{typeof(T)} missing field: {context.Context.Parser.RawRecord}"); },
    BadDataFound = (context) => { errs.Add($"{typeof(T)} bad data: {context.RawRecord}"); },
};
using (var csv = new CsvReader(new StreamReader(file), cfg))
{
    csv.Context.RegisterClassMap<AccountDtoMap>();
    return csv.GetRecords<T>().ToList();
}
Runnable demo here.
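To parse the CSV shown in the question (in version 26.1.0), you need to properly configure all of the following CsvConfiguration settings, not just the delimiter:
Delimiter. The character used to separate fields within a single CSV line (usually a comma; here it is ~).
Escape, default ". The character used to precede some other character that needs to be escaped.
Quote, default ". The character used to wrap a field that needs quotes at the beginning and end, as per RFC 4180.
The roles of these three character settings are explained in the comments on the CsvMode enum: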
public enum CsvMode
{
    /// Uses RFC 4180 format (default).
    /// If a field contains a CsvConfiguration.Delimiter or CsvConfiguration.NewLine,
    /// it is wrapped in CsvConfiguration.Quote's.
    /// If quoted field contains a CsvConfiguration.Quote, it is preceded by CsvConfiguration.Escape.
    RFC4180 = 0,

    /// Uses escapes.
    /// If a field contains a CsvConfiguration.Delimiter, CsvConfiguration.NewLine,
    /// or CsvConfiguration.Escape, it is preceded by CsvConfiguration.Escape.
    /// Newline defaults to \n.
    Escape,

    /// Doesn't use quotes or escapes.
    /// This will ignore quoting and escape characters. This means a field cannot contain a
    /// CsvConfiguration.Delimiter, CsvConfiguration.Quote, or
    /// CsvConfiguration.NewLine, as they cannot be escaped.
    NoEscape
}
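To make the interaction concrete, here is a minimal sketch of what the question's configuration effectively amounts to once the defaults described above are written out explicitly. Only Delimiter was set in the question; the other three values are the library defaults as stated above, spelled out here for illustration.

```csharp
using System.Globalization;
using CsvHelper;
using CsvHelper.Configuration;

// The question's configuration with the relevant defaults made explicit.
var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = "~",         // set in the question
    Quote = '"',             // default: " still wraps quoted fields
    Escape = '"',            // default: " also acts as the escape character
    Mode = CsvMode.RFC4180,  // default: quoting and escaping rules are active
};
```

Changing the delimiter leaves Quote, Escape, and Mode untouched, which is why a bare " inside a field is still treated as special.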
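The field Joe "Public" contains an embedded escape character that is not itself escaped, which causes CsvHelper to report an error. To avoid the error, you have several options, including: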
Set CsvMode.NoEscape to disable escaping and quoting entirely:
var cfg = new CsvHelper.Configuration.CsvConfiguration(CultureInfo.InvariantCulture)
{
    Mode = CsvMode.NoEscape,
    // Remainder unchanged.
Of course, if you do this, your CSV file cannot contain delimiters or newlines embedded in fields.
Demo fiddle #1 here.
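Set Mode = CsvMode.Escape to disable wrapping of fields in quotes, and set Escape to some other character, such as \ or \t, that you do not expect to encounter in the file in practice:
var cfg = new CsvHelper.Configuration.CsvConfiguration(CultureInfo.InvariantCulture)
{
    Mode = CsvMode.Escape,
    Escape = '\\',
    // Remainder unchanged.
Even if you do this, delimiters, escape characters, and newlines inside CSV fields must still be properly escaped with the chosen escape character.
Demo fiddle #2 here.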
Set Mode = CsvMode.Escape and fix your file so that the escape character is itself properly escaped:
234~Joe ""Public""
Demo fiddle #3 here.
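For comparison, the three options above can be exercised side by side with a small console program. This is only a sketch under stated assumptions: AccountDto is not shown in the question, so a hypothetical version with two string properties is used, and Demo.Read is a helper name invented for this example. The data is read from an in-memory string rather than a file.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

// Hypothetical DTO: the question does not show AccountDto, so string properties are assumed.
public class AccountDto
{
    public string Number { get; set; }
    public string Name { get; set; }
}

public sealed class AccountDtoMap : ClassMap<AccountDto>
{
    public AccountDtoMap()
    {
        Map(m => m.Number).Index(0);
        Map(m => m.Name).Index(1);
    }
}

public static class Demo
{
    // Hypothetical helper: parses '~'-delimited text using the given mode and escape character.
    public static List<AccountDto> Read(string text, CsvMode mode, char escape = '"')
    {
        var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            Delimiter = "~",
            HasHeaderRecord = false,
            Mode = mode,
            Escape = escape,
        };
        using (var csv = new CsvReader(new StringReader(text), cfg))
        {
            csv.Context.RegisterClassMap<AccountDtoMap>();
            return csv.GetRecords<AccountDto>().ToList();
        }
    }

    public static void Main()
    {
        var original = "123~John\n234~Joe \"Public\"\n";
        var repaired = "123~John\n234~Joe \"\"Public\"\"\n";

        // Option 1: quotes and escapes are ignored entirely.
        var option1 = Read(original, CsvMode.NoEscape);
        // Option 2: escape mode with a backslash, so " is an ordinary character.
        var option2 = Read(original, CsvMode.Escape, '\\');
        // Option 3: escape mode with the default escape, on the repaired file.
        var option3 = Read(repaired, CsvMode.Escape);

        foreach (var rec in option1.Concat(option2).Concat(option3))
            Console.WriteLine($"{rec.Number} | {rec.Name}");
    }
}
```

Which option fits best depends on whether you control the file format: options 1 and 2 adapt the reader to the file as it is, while option 3 requires changing the file.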