Parsing CSV with optional columns using FileHelpers

I am using FileHelpers 3.1.5 to parse CSV files. My problem is that the CSV file should support many optional columns, and I have not found a way to configure FileHelpers for this.

Here is an example:

[DelimitedRecord(";")]
[IgnoreEmptyLines]
public class TestRecord
{
    //Mandatory
    [FieldNotEmpty]
    public string A;

    [FieldOptional]
    public string B;

    [FieldOptional]
    public string C;
}

I would like to be able to process data like this:

A;C
TestA1;TestC1
TestA2;TestC1

But when I parse it, I get "TestC1" as the value of records[1].B:

var engine = new FileHelperEngine<TestRecord>();
var records = engine.ReadFile("TestAC.csv");

string column = records[1].C;
Assert.IsTrue(column.Equals("TestC1"));  //Fails, returns ""

column = records[1].B;
Assert.IsTrue(column.Equals("TestC1"));  //True, but that was not what I wanted

Thanks for any suggestions!

I think you should decorate your columns with titles, for example:

[DelimitedRecord(";")]
[IgnoreEmptyLines]
public class TestRecord
{
    //Mandatory
    [FieldNotEmpty, FieldOrder(0), FieldTitle("A")]
    public string A;

    [FieldOptional, FieldOrder(1), FieldTitle("B")]
    public string B;

    [FieldOptional, FieldOrder(2), FieldTitle("C")]
    public string C;
}

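This way, the runtime knows what the column names are and will parse them accordingly. Otherwise, it only knows that the file has two columns, and it needs the extra semicolons to tell which column is missing. So the following would work with your original setup: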

A;;C
TestA1;;TestC1
TestA2;;TestC1
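
To illustrate why the extra semicolon matters, here is a minimal, library-free sketch of purely positional mapping. It is not FileHelpers' implementation; `PositionalDemo` and `Parse` are hypothetical names used only for illustration, and the sketch assumes values are assigned to A, B, C strictly by index.

```csharp
using System;

public static class PositionalDemo
{
    // Hypothetical stand-in for positional mapping: the n-th value on the
    // line goes to the n-th field, regardless of what the header says.
    public static (string A, string B, string C) Parse(string line)
    {
        string[] parts = line.Split(';');
        // Missing trailing values are reported as null.
        string At(int i) => i < parts.Length ? parts[i] : null;
        return (At(0), At(1), At(2));
    }

    public static void Main()
    {
        var shifted = Parse("TestA1;TestC1");   // data line under header A;C
        var aligned = Parse("TestA1;;TestC1");  // data line under header A;;C

        Console.WriteLine($"B={shifted.B}, C={shifted.C}"); // prints "B=TestC1, C="
        Console.WriteLine($"B={aligned.B}, C={aligned.C}"); // prints "B=, C=TestC1"
    }
}
```

With only two values on the line, the second one lands in B; the empty placeholder column is what pushes it into C.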

Note that the FieldTitle approach only works with FileHelpers v2, because v3 no longer has FieldTitle.

Tested with FileHelpers version 3.2.5.

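For the FileHelperEngine to recognise your columns correctly, you have to dynamically remove the fields that are not used in the file. The following is based on your code, with a few bits added, and was run from a console program: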

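        // Assumes these usings at the top of the file:
        // using System; using System.Diagnostics; using System.Linq; using FileHelpers;
        string tempFile = System.IO.Path.GetTempFileName();
        // Regular (non-verbatim) string so that \r\n is written as real line breaks
        System.IO.File.WriteAllText(tempFile, "A;C\r\nTestA1;TestC1\r\nTestA2;TestC1");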
        var engine = new FileHelperEngine<TestRecord>();
        var records = engine.ReadFile(tempFile, 1);

        // Get the header text from the file
        var headerFile = engine.HeaderText.Replace("\r", "").Replace("\n", "");

        // Get the header from the engine record layout
        var headerFields = engine.GetFileHeader();

        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        string column = records[0].C;
        Debug.Assert("TestC1".Equals(column), "Test 1 - Column C does not equal 'TestC1'");  //Fails, returns ""

        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        column = records[0].B;
        Debug.Assert(!"TestC1".Equals(column), "Test 1 - Column B does equal 'TestC1'");  // Fails: B holds "TestC1", which is the problem being demonstrated

        // Create a new engine otherwise we get some random error from Dynamic.Assign once we start removing fields
        // which is presumably because we have called ReadFile() before hand.
        engine = new FileHelperEngine<TestRecord>();

        if (headerFile != headerFields)
        {
            var fieldHeaders = engine.Options.FieldsNames;
            var fileHeaders = headerFile.Split(';').ToList();

            // Loop through all the record layout fields and remove those not found in the file header
            for (int index = fieldHeaders.Length - 1; index >= 0; index--)
                if (!fileHeaders.Contains(fieldHeaders[index]))
                    engine.Options.RemoveField(fieldHeaders[index]);
        }

        headerFields = engine.GetFileHeader();
        Debug.Assert(headerFile == headerFields);

        var records2 = engine.ReadFile(tempFile);

        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        column = records2[0].C;
        Debug.Assert("TestC1".Equals(column), "Test 2 - Column C does not equal 'TestC1'");  // Expected to pass now that field B has been removed

        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        column = records2[0].B;
        Debug.Assert(!"TestC1".Equals(column), "Test 2 - Column B does equal 'TestC1'");  // Expected to pass: B is no longer populated

        Console.WriteLine("Seems to be OK now!");
        Console.ReadLine();

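Note: one important thing I found is that, in the current 3.2.5 version, removing a field after the first line of the file has already been read causes the engine to blow a fuse!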

I also added an IgnoreFirst() attribute to your class so that it skips the header line and stores the ignored text in engine.HeaderText. This results in the following class:

    [DelimitedRecord(";")]
    [IgnoreEmptyLines]
    [IgnoreFirst()]
    public class TestRecord
    {
        //Mandatory
        [FieldNotEmpty]
        public string A;

        [FieldOptional]
        public string B;

        [FieldOptional]
        public string C;
    }
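
The "which fields should be removed" decision from the loop above can be isolated as a small pure function, which makes it easy to check without touching the engine. This is only a sketch: `HeaderDiff` and `FieldsToRemove` are hypothetical names, and it assumes header names match the record's field names exactly (case-sensitive, `;`-delimited by default).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeaderDiff
{
    // Hypothetical helper: returns the record fields that do NOT appear
    // in the header line of the file, in their original order.
    public static List<string> FieldsToRemove(
        IEnumerable<string> recordFields, string headerLine, char delimiter = ';')
    {
        var present = new HashSet<string>(
            headerLine.Split(delimiter).Select(h => h.Trim()),
            StringComparer.Ordinal);
        return recordFields.Where(f => !present.Contains(f)).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var missing = HeaderDiff.FieldsToRemove(new[] { "A", "B", "C" }, "A;C");
        Console.WriteLine(string.Join(",", missing)); // prints "B"
    }
}
```

Each returned name would then be passed to engine.Options.RemoveField on a fresh engine, before any ReadFile call, as in the code above.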