Compare CSV Header to Map Class
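I have a process for which we have written a class that imports a large(ish) CSV file into our app using CsvHelper (https://joshclose.github.io/CsvHelper).
I would like to compare the header to the map to ensure the header's integrity. We receive the CSV file from a third party and I want to make sure it doesn't change over time; I figured the best way to do this is to compare it against the map.
We have a class set up like this (trimmed):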
public class VisitExport
{
    public int? Count { get; set; }
    public string CustomerName { get; set; }
    public string CustomerAddress { get; set; }
}
And its corresponding map (also trimmed):
public class VisitMap : ClassMap<VisitExport>
{
    public VisitMap()
    {
        Map(m => m.Count).Name("Count");
        Map(m => m.CustomerName).Name("Customer Name");
        Map(m => m.CustomerAddress).Name("Customer Address");
    }
}
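This is the code I use to read the CSV file, and it works well. I have a try/catch in place for errors, but ideally, if it fails specifically because the header does not match, I would like to handle that case separately.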
private void fileLoadedLink_LinkClicked(object sender, LinkLabelLinkClickedEventArgs e)
{
    try
    {
        var filePath = string.Empty;
        data = new List<VisitExport>();
        using (OpenFileDialog openFileDialog = new OpenFileDialog())
        {
            openFileDialog.InitialDirectory = new KnownFolder(KnownFolderType.Downloads).Path;
            openFileDialog.Filter = "csv files (*.csv)|*.csv";
            openFileDialog.FilterIndex = 2;
            openFileDialog.RestoreDirectory = true;
            if (openFileDialog.ShowDialog() == DialogResult.OK)
            {
                filePath = openFileDialog.FileName;
                var fileStream = openFileDialog.OpenFile();
                var culture = CultureInfo.GetCultureInfo("en-GB");
                using (StreamReader reader = new StreamReader(fileStream))
                using (var readCsv = new CsvReader(reader, culture))
                {
                    var map = new VisitMap();
                    readCsv.Context.RegisterClassMap(map);
                    var fileContent = readCsv.GetRecords<VisitExport>();
                    data = fileContent.ToList();
                    fileLoadedLink.Text = filePath;
                    viewModel.IsFileLoaded = true;
                }
            }
        }
    }
    catch (CsvHelperException ex)
    {
        Console.WriteLine(ex.InnerException != null ? ex.InnerException.Message : ex.Message);
        fileLoadedLink.Text = "Error loading file.";
        viewModel.IsFileLoaded = false;
    }
}
Is there a way to compare the CSV header against my map?
How about catching HeaderValidationException first, and then CsvHelperException?
catch (HeaderValidationException ex)
{
    var message = ex.Message.Split('\n')[0];
    var currentHeader = ex.Context.Reader.HeaderRecord;
    message += $"{Environment.NewLine}Header: \"{string.Join(",", currentHeader)}\"";
    Console.WriteLine(message);
    fileLoadedLink.Text = "Error loading file.";
    viewModel.IsFileLoaded = false;
}
catch (CsvHelperException ex)
{
    Console.WriteLine(ex.InnerException != null ? ex.InnerException.Message : ex.Message);
    fileLoadedLink.Text = "Error loading file.";
    viewModel.IsFileLoaded = false;
}
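The order of the two catch clauses matters. C# tries catch clauses from top to bottom, and the compiler rejects a clause for a derived exception type that appears after a clause for its base type (error CS0160). As far as I know, HeaderValidationException derives from CsvHelperException, so it has to come first. The sketch below illustrates this with stand-in exception types; ParseException, HeaderMismatchException and CatchOrderDemo are hypothetical names used only for this example and are not part of CsvHelper.

```csharp
using System;

// Stand-in types for illustration only. They mimic a base/derived exception
// pair and are NOT CsvHelper's real classes.
public class ParseException : Exception
{
    public ParseException(string message) : base(message) { }
}

public class HeaderMismatchException : ParseException
{
    public HeaderMismatchException(string message) : base(message) { }
}

public static class CatchOrderDemo
{
    // Runs the supplied action and reports which catch clause handled it.
    public static string Classify(Action load)
    {
        try
        {
            load();
            return "ok";
        }
        catch (HeaderMismatchException)
        {
            // The more specific (derived) type must be listed first.
            return "header";
        }
        catch (ParseException)
        {
            // Every other exception derived from the base type ends up here.
            return "other";
        }
    }
}
```

Swapping the two clauses would not compile, because the base-type clause would already catch every HeaderMismatchException.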
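For a CSV file with headers there are two basic cases: missing CSV columns and extra CSV columns. The first is already detected by CsvHelper, while detection of the second is not implemented out of the box and requires subclassing CsvReader.
(Because CsvHelper maps CSV columns to model properties by name, permuting the order of the columns in the CSV file is not considered a breaking change.)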
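To make the two cases concrete, here is a minimal sketch in plain C# that does not depend on CsvHelper. It compares a list of expected column names with the names actually found in a header row. HeaderDiff and Compare are hypothetical names introduced for this example, and the sketch assumes that names must match exactly (ordinal, case-sensitive).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of CsvHelper: compares header names as plain strings.
public static class HeaderDiff
{
    // Returns the expected names that are absent from the header (missing columns)
    // and the header names that are not expected (extra columns).
    public static (List<string> Missing, List<string> Extra) Compare(
        IEnumerable<string> expected, IEnumerable<string> actual)
    {
        var expectedList = expected.ToList();
        var actualList = actual.ToList();

        // Assumption: names must match exactly (ordinal, case-sensitive).
        var expectedSet = new HashSet<string>(expectedList, StringComparer.Ordinal);
        var actualSet = new HashSet<string>(actualList, StringComparer.Ordinal);

        // Filter the original lists so the results keep their input order.
        var missing = expectedList.Where(name => !actualSet.Contains(name)).Distinct().ToList();
        var extra = actualList.Where(name => !expectedSet.Contains(name)).Distinct().ToList();
        return (missing, extra);
    }
}
```

For the expected names "Count", "Customer Name", "Customer Address" and an actual header of "Customer Address", "Count", "Notes", Compare reports "Customer Name" as missing and "Notes" as extra. The different column order is not reported, which matches the point about permuted columns.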
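Note that this only applies to CSV files that actually contain headers. Since you are not setting CsvConfiguration.HasHeaderRecord = false, I assume this applies to your use case.
Details about the two cases follow.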
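Missing CSV columns.
CsvHelper already throws an exception by default in this situation. When unmapped data model properties are found, CsvConfiguration.HeaderValidated is invoked. By default this is set to ConfigurationFunctions.HeaderValidated, whose current behavior is to throw a HeaderValidationException if there are any unmapped model properties. You can replace or extend HeaderValidated with logic of your own if you prefer: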
var culture = CultureInfo.GetCultureInfo("en-GB");
var config = new CsvConfiguration(culture)
{
    HeaderValidated = (args) =>
    {
        // Add additional logic as required here
        ConfigurationFunctions.HeaderValidated(args);
    },
};
using (var readCsv = new CsvReader(reader, config))
{
    // Remainder unchanged
}
Demo fiddle #1 here.
Extra CSV columns.
Currently CsvHelper does not inform the application when this happens. See "Throw if csv contains unexpected columns #1032", which confirms that this is not implemented out of the box.
In a GitHub comment, user leopignataro suggests a workaround, which is to subclass CsvReader and add the necessary validation logic oneself. However, the version shown in the comment doesn't seem to handle duplicated column names or embedded references. The following subclass of CsvReader should do this correctly. It is based on the logic in CsvReader.ValidateHeader(ClassMap map, List<InvalidHeader> invalidHeaders). It recursively walks the incoming ClassMap, attempts to find the CSV header corresponding to each member or constructor parameter, and flags the index of each header that is mapped. Afterwards, if any headers remain unmapped, the supplied Action<CsvContext, List<string>> OnUnmappedCsvHeaders is invoked to notify the application of the problem and to throw an exception if desired:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

public class ValidatingCsvReader : CsvReader
{
    public ValidatingCsvReader(TextReader reader, CultureInfo culture, bool leaveOpen = false) : this(new CsvParser(reader, culture, leaveOpen)) { }
    public ValidatingCsvReader(TextReader reader, CsvConfiguration configuration) : this(new CsvParser(reader, configuration)) { }
    public ValidatingCsvReader(IParser parser) : base(parser) { }

    // Invoked with the CSV headers that no member or parameter map consumed.
    public Action<CsvContext, List<string>> OnUnmappedCsvHeaders { get; set; }

    public override void ValidateHeader(Type type)
    {
        // Run the built-in validation first (detects missing CSV columns).
        base.ValidateHeader(type);

        var headerRecord = HeaderRecord;
        var mapped = new BitArray(headerRecord.Length);
        var map = Context.Maps[type];
        FlagMappedHeaders(map, mapped);

        // Any header index that was never flagged is an extra CSV column.
        var unmappedHeaders = Enumerable.Range(0, headerRecord.Length).Where(i => !mapped[i]).Select(i => headerRecord[i]).ToList();
        if (unmappedHeaders.Count > 0)
        {
            OnUnmappedCsvHeaders?.Invoke(Context, unmappedHeaders);
        }
    }

    protected virtual void FlagMappedHeaders(ClassMap map, BitArray mapped)
    {
        // Logic adapted from https://github.com/JoshClose/CsvHelper/blob/0d753ff09294b425e4bc5ab346145702eeeb1b6f/src/CsvHelper/CsvReader.cs#L157
        // By https://github.com/JoshClose
        foreach (var parameter in map.ParameterMaps)
        {
            if (parameter.Data.Ignore)
                continue;
            if (parameter.Data.IsConstantSet)
                // Constant doesn't require a header.
                continue;
            if (parameter.Data.IsIndexSet && !parameter.Data.IsNameSet)
                // If there is only an index set, we don't want to validate the header name.
                continue;

            if (parameter.ConstructorTypeMap != null)
            {
                FlagMappedHeaders(parameter.ConstructorTypeMap, mapped);
            }
            else if (parameter.ReferenceMap != null)
            {
                FlagMappedHeaders(parameter.ReferenceMap.Data.Mapping, mapped);
            }
            else
            {
                var index = GetFieldIndex(parameter.Data.Names.ToArray(), parameter.Data.NameIndex, true);
                if (index >= 0)
                    mapped.Set(index, true);
            }
        }

        foreach (var memberMap in map.MemberMaps)
        {
            if (memberMap.Data.Ignore || !CanRead(memberMap))
                continue;
            if (memberMap.Data.ReadingConvertExpression != null || memberMap.Data.IsConstantSet)
                // ConvertUsing and Constant don't require a header.
                continue;
            if (memberMap.Data.IsIndexSet && !memberMap.Data.IsNameSet)
                // If there is only an index set, we don't want to validate the header name.
                continue;

            var index = GetFieldIndex(memberMap.Data.Names.ToArray(), memberMap.Data.NameIndex, true);
            if (index >= 0)
                mapped.Set(index, true);
        }

        foreach (var referenceMap in map.ReferenceMaps)
        {
            if (!CanRead(referenceMap))
                continue;
            FlagMappedHeaders(referenceMap.Data.Mapping, mapped);
        }
    }
}
Then, in your code, handle the OnUnmappedCsvHeaders callback however you need, for example by throwing a CsvHelperException or some other custom exception:
using (var readCsv = new ValidatingCsvReader(reader, culture)
{
    OnUnmappedCsvHeaders = (context, headers) => throw new CsvHelperException(context, string.Format("Unmapped CSV headers: \"{0}\"", string.Join(",", headers))),
})
{
    // Remainder unchanged
}
Demo fiddles:
- #2 (your model).
- #3 (with external references).
- #4 (duplicate names).
- #5 (using the auto-generated map).
This could use additional testing, for example with data models that have parameterized constructors and additional mutable properties.