NimbleCSV : Elixir

I'm trying to use the NimbleCSV library in a personal project, but I've run into a problem.

NimbleCSV.define(MyParser, separator: ",", escape: "\"")

defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")
    File.stream!("name.csv")
    |> MyParser.parse_stream()
    |> Stream.map(fn [name, team, position, height, weight, age] ->
      %{name: name, team: team, position: position, height: String.to_integer(height), weight: String.to_integer(weight), age: String.to_integer(age)}
    end)
    |> Enum.map(&IO.puts(&1))
  end
end

As you can see above, I'm using a Stream, but it crashes when I run the Mix task:

➜  siren mix siren
Compiling 1 file (.ex)
Let's parse CSV file!
** (NimbleCSV.ParseError) unexpected escape character " in " \"Team\", \"Position\", \"Height(inches)\", \"Weight(lbs)\", \"Age\"\n"
    deps/nimble_csv/lib/nimble_csv.ex:427: MyParser.separator/5
    deps/nimble_csv/lib/nimble_csv.ex:360: anonymous fn/4 in MyParser.parse_stream/2
    (elixir 1.10.3) lib/stream.ex:902: Stream.do_transform_user/6
    (elixir 1.10.3) lib/stream.ex:1609: Enumerable.Stream.do_each/4
    (elixir 1.10.3) lib/enum.ex:3383: Enum.map/2
    (mix 1.10.3) lib/mix/task.ex:330: Mix.Task.run_task/3
    (mix 1.10.3) lib/mix/cli.ex:82: Mix.CLI.run_task/2

Here is my CSV file:

"Name", "Team", "Position", "Height(inches)", "Weight(lbs)", "Age"
"Adam Donachie", "BAL", "Catcher", 74, 180, 22.99
"Paul Bako", "BAL", "Catcher", 74, 215, 34.69
"Ramon Hernandez", "BAL", "Catcher", 72, 210, 30.78
"Kevin Millar", "BAL", "First Baseman", 72, 210, 35.43
"Chris Gomez", "BAL", "First Baseman", 73, 188, 35.71
"Brian Roberts", "BAL", "Second Baseman", 69, 176, 29.39
"Miguel Tejada", "BAL", "Shortstop", 69, 209, 30.77
"Melvin Mora", "BAL", "Third Baseman", 71, 200, 35.07
"Aubrey Huff", "BAL", "Third Baseman", 76, 231, 30.19
"Adam Stern", "BAL", "Outfielder", 71, 180, 27.05
"Jeff Fiorentino", "BAL", "Outfielder", 73, 188, 23.88
"Freddie Bynum", "BAL", "Outfielder", 73, 180, 26.96
"Nick Markakis", "BAL", "Outfielder", 74, 185, 23.29
"Brandon Fahey", "BAL", "Outfielder", 74, 160, 26.11
"Corey Patterson", "BAL", "Outfielder", 69, 180, 27.55

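The problem must be the escape character I defined earlier, but I don't understand why. What is the escape character here? As far as I can tell, it is the double quote around each string in the CSV rows.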

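CSV stands for comma-separated values, and the format has its own specification, RFC 4180. You cannot put spaces wherever you like. The problem is the space after each comma; in other words, the escape character does not come immediately after the separator, so the parser finds a quote in the middle of an unquoted field. Change the input to the following and everything works.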

"Name","Team","Position","Height(inches)","Weight(lbs)","Age"
"Adam Donachie","BAL","Catcher",74,180,22.99
"Paul Bako","BAL","Catcher",74,215,34.69
"Ramon Hernandez","BAL","Catcher",72,210,30.78
"Kevin Millar","BAL","First Baseman",72,210,35.43
"Chris Gomez","BAL","First Baseman",73,188,35.71
"Brian Roberts","BAL","Second Baseman",69,176,29.39
"Miguel Tejada","BAL","Shortstop",69,209,30.77
"Melvin Mora","BAL","Third Baseman",71,200,35.07
"Aubrey Huff","BAL","Third Baseman",76,231,30.19
"Adam Stern","BAL","Outfielder",71,180,27.05
"Jeff Fiorentino","BAL","Outfielder",73,188,23.88
"Freddie Bynum","BAL","Outfielder",73,180,26.96
"Nick Markakis","BAL","Outfielder",74,185,23.29
"Brandon Fahey","BAL","Outfielder",74,160,26.11
"Corey Patterson","BAL","Outfielder",69,180,27.55

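To see the difference in isolation, here is a minimal sketch that feeds both spellings to the parser defined in the question. It assumes Elixir 1.12 or later so that `Mix.install/1` is available (the trace above is from 1.10.3, where you would add `:nimble_csv` to `mix.exs` instead) and a 1.x release of nimble_csv; the two sample strings are shortened from the file above. The spaced input is expected to fail with the same `NimbleCSV.ParseError` as in the trace, so the sketch catches it rather than crashing.

```elixir
# Assumption: Elixir >= 1.12, so Mix.install/1 can fetch the dependency.
Mix.install([{:nimble_csv, "~> 1.0"}])

# Same parser definition as in the question.
NimbleCSV.define(MyParser, separator: ",", escape: "\"")

# No space after the separator: each quote opens right at the start of a field.
clean = "\"Name\",\"Team\"\n\"Adam Donachie\",\"BAL\"\n"

# Space after the separator: the second field starts with " ", so the
# quote that follows sits in the middle of an unquoted field.
spaced = "\"Name\", \"Team\"\n\"Adam Donachie\", \"BAL\"\n"

# parse_string/1 skips the header row by default.
rows = MyParser.parse_string(clean)
IO.inspect(rows)

# Wrap the second call so a parse error is reported instead of raised.
result =
  try do
    {:ok, MyParser.parse_string(spaced)}
  rescue
    e in NimbleCSV.ParseError -> {:error, Exception.message(e)}
  end

IO.inspect(result)
```
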
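NimbleCSV ships with a default implementation, NimbleCSV.RFC4180, which uses exactly the settings you defined (comma separator, double-quote escape), so you don't need to define your own parser. Just use the default one.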

defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")

    File.stream!("name.csv")
    |> NimbleCSV.RFC4180.parse_stream()
    |> Stream.map(fn [name, team, position, height, weight, age] ->
      %{name: name, team: team, position: position,
        height: String.to_integer(height),
        weight: String.to_integer(weight),
        age: String.to_float(age) # NOTE float here!
      }
    end)
    |> Enum.to_list()
    |> IO.inspect()
  end
end
#⇒ [
#  %{
#    age: 22.99,
#    height: 74,
#    name: "Adam Donachie",
#    position: "Catcher",
#    team: "BAL",
#    weight: 180
#  },
#  ...
# ]
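
The float note matters because the `Age` column holds values such as `22.99`. A small sketch of how the standard-library conversions behave on inputs like the ones in this file (the literal strings are illustrative, taken from the first data row):

```elixir
# String.to_float/1 needs a decimal point in its input.
age = String.to_float("22.99")
IO.inspect(age)

# String.to_integer/1 rejects anything that is not a plain integer,
# including a fractional part or a leading space.
for input <- ["22.99", " 74"] do
  try do
    String.to_integer(input)
  rescue
    ArgumentError -> IO.puts("cannot convert #{inspect(input)} to an integer")
  end
end

# Float.parse/1 is more lenient: it accepts "74" as well as "22.99"
# and returns the parsed float together with the unparsed remainder.
{height, ""} = Float.parse("74")
IO.inspect(height)
```

So even once the quoting is fixed, a leading space left in a numeric field would still make `String.to_integer/1` raise, which is one more reason to remove the spaces from the file rather than work around them.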