Parse Space Delimited Files With Spaces In Fields

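I have a CSV file that uses a space as the delimiter. Some fields contain spaces, and those fields are wrapped in double quotes. Any field with a null/empty value is represented as "-". Fields that are not null/empty and do not contain spaces are not wrapped in double quotes. Here is an example of one row from the CSV file.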

foobar "foo bar" "-" "-" "-" fizzbuzz "fizz buzz" fizz buzz

The CSV file also has no headers. I was going to use a simple solution such as this one, but using strings.Split(csvInput, " ") wouldn't handle the spaces inside the fields. I've also looked into this library, https://github.com/gocarina/gocsv, but I'm curious whether there is a solution that doesn't use a third-party library.

This is "plain" CSV format where the separator is the space character instead of a comma or semicolon. The encoding/csv package can handle this.

As for your null/empty fields: simply use a loop as a post-processing step and replace them with the empty string.

Using the input:

const input = `foobar "foo bar" "-" "-" "-" fizzbuzz "fizz buzz" fizz buzz
f2 "fo ba" "-" "-" "-" fd "f b" f b`

Parsing and post-processing it:

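// Needs the encoding/csv, fmt, and strings packages.
r := csv.NewReader(strings.NewReader(input))
r.Comma = ' ' // fields are separated by a space instead of a comma
records, err := r.ReadAll()
if err != nil {
    panic(err)
}
fmt.Printf("%#v\n", records)

// Post-processing: replace the "-" placeholder with an empty string.
for _, rec := range records {
    for i, v := range rec {
        if v == "-" {
            rec[i] = ""
        }
    }
}
fmt.Printf("%#v\n", records)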

Output (try it on the Go Playground):

[][]string{[]string{"foobar", "foo bar", "-", "-", "-", "fizzbuzz", "fizz buzz", "fizz", "buzz"}, []string{"f2", "fo ba", "-", "-", "-", "fd", "f b", "f", "b"}}
[][]string{[]string{"foobar", "foo bar", "", "", "", "fizzbuzz", "fizz buzz", "fizz", "buzz"}, []string{"f2", "fo ba", "", "", "", "fd", "f b", "f", "b"}}