Lua - Split CSV Column at Hyphen
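I want to use Lua CSV (http://lua-users.org/wiki/LuaCsv) and need to split one of its columns at a hyphen "-", e.g. First-Surname, within the table it creates. I can do this manually in Excel using Data > Text to Columns, which splits the Full-Name cell and adds First and Surname columns at the end. I need Lua to do this in a script: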
Before
Age,Name,Start,End,Length,Score
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056
After
Age,Name,Start,End,Length,Score,First,Surname
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056,Bill,Smith
This is the Lua CSV parser to use:
function ParseCSVLine (line,sep)
  local res = {}
  local pos = 1
  sep = sep or ','
  while true do
    local c = string.sub(line,pos,pos)
    if (c == "") then break end
    if (c == '"') then
      -- quoted value (ignore separator within)
      local txt = ""
      repeat
        local startp,endp = string.find(line,'^%b""',pos)
        txt = txt..string.sub(line,startp+1,endp-1)
        pos = endp + 1
        c = string.sub(line,pos,pos)
        if (c == '"') then txt = txt..'"' end
        -- check first char AFTER quoted string, if it is another
        -- quoted string without separator, then append it
        -- this is the way to "escape" the quote char in a quote. example:
        -- value1,"blub""blip""boing",value3 will result in blub"blip"boing for the middle
      until (c ~= '"')
      table.insert(res,txt)
      assert(c == sep or c == "")
      pos = pos + 1
    else
      -- no quotes used, just look for the first separator
      local startp,endp = string.find(line,sep,pos)
      if (startp) then
        table.insert(res,string.sub(line,pos,startp-1))
        pos = endp + 1
      else
        -- no separator found -> use rest of string and terminate
        table.insert(res,string.sub(line,pos))
        break
      end
    end
  end
  return res
end
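Since the goal is to split the column inside the table the parser creates, one option is to work on the parsed fields directly. The following is a minimal sketch rather than part of LuaCsv: `append_name_parts` is a hypothetical helper name, and the literal `row` table stands in for what `ParseCSVLine` would be expected to return for the sample line.

```lua
-- Hypothetical helper: appends the two halves of a hyphenated field to a row.
-- `fields` is an array of strings, such as the table ParseCSVLine returns.
-- `index` is the position of the hyphenated column (2 for Name in the sample).
local function append_name_parts(fields, index)
  local name = fields[index] or ''
  -- Split at the first hyphen; '%-' escapes the hyphen in a Lua pattern.
  local first, surname = string.match(name, '^(.-)%-(.+)$')
  -- No usable hyphen: leave First empty and keep the whole value as Surname.
  table.insert(fields, first or '')
  table.insert(fields, surname or name)
  return fields
end

-- Stand-in for ParseCSVLine('35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056')
local row = { '35', 'Bill-Smith', '2.2.2017', '2.4.2017', '0.2.00', '2056' }
append_name_parts(row, 2)
print(table.concat(row, ','))
```

Note that `table.concat` does not re-quote values, so a field that itself contains a comma or a quote would need to be escaped again before the row is written back out.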
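Assuming the second value on every line is always the name, and that the first-last separator is always the first or only hyphen in the name (not necessarily a safe assumption), you can read each line from your input (I used a string, but this could be an IO reader or whatever you are using), parse the second value into first and last names with a pattern, append them to the line you read, and then write it back out (here I insert it into a table for demonstration):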
local input = [[
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056
31,Ben-Smith,2.4.2015,2.6.2012,0.2.01,2058
32,Bob-Smith,2.3.2016,2.7.2011,0.2.02,2057
]]
local output = {}
for line in string.gmatch (input, '[^\r\n]+') do -- get next group of characters up to the next CR or LF character
  local name = string.match (line, '^.-,(.-),') -- assuming that the name field is always the second value here
  local first, last = string.match (name or '', '^(.-)%-(.+)$') -- split at the first hyphen
  line = line .. ',' .. (first or '') .. ',' .. (last or name or '')
  table.insert (output, line)
end
print (table.concat (output, '\r\n'))
Output:
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056,Bill,Smith
31,Ben-Smith,2.4.2015,2.6.2012,0.2.01,2058,Ben,Smith
32,Bob-Smith,2.3.2016,2.7.2011,0.2.02,2057,Bob,Smith
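The sample input above has no header row, while the Before/After example in the question does. A possible variant that also extends the header is sketched below. `add_name_columns` is a hypothetical name, and the sketch assumes the first line is always the header, the name is the second field, and no field is quoted or contains a comma.

```lua
-- Hypothetical helper: returns csv_text with First and Surname columns appended.
-- Assumes the first line is a header, the name is the second field,
-- and no field is quoted or contains a comma.
local function add_name_columns(csv_text)
  local output = {}
  for line in string.gmatch(csv_text, '[^\r\n]+') do
    if #output == 0 then
      -- Header row: extend it with the two new column names.
      table.insert(output, line .. ',First,Surname')
    else
      local name = string.match(line, '^.-,(.-),') or ''
      local first, surname = string.match(name, '^(.-)%-(.+)$')
      table.insert(output, line .. ',' .. (first or '') .. ',' .. (surname or name))
    end
  end
  return table.concat(output, '\n')
end

local input = [[
Age,Name,Start,End,Length,Score
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056
]]
local result = add_name_columns(input)
print(result)
```

If the real file contains quoted fields, it would probably be safer to parse each line with ParseCSVLine and append to the resulting table instead of matching on the raw line.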