Lua - Split CSV Column at Hyphen

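I want to use Lua CSV http://lua-users.org/wiki/LuaCsv and need to split one of the columns, which uses a hyphen "-" (e.g. First-Surname), inside the table the parser creates. I can do this manually in Excel with Data > Text to Columns, which splits the Full-Name cell and adds First and Surname columns at the end. I need Lua to do the same thing in a script: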

Before
Age,Name,Start,End,Length,Score
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056

After
Age,Name,Start,End,Length,Score,First,Surname
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056,Bill,Smith

This is the Lua CSV parser to use:

function ParseCSVLine (line,sep) 
    local res = {}
    local pos = 1
    sep = sep or ','
    while true do 
        local c = string.sub(line,pos,pos)
        if (c == "") then break end
        if (c == '"') then
            -- quoted value (ignore separator within)
            local txt = ""
            repeat
                local startp,endp = string.find(line,'^%b""',pos)
                txt = txt..string.sub(line,startp+1,endp-1)
                pos = endp + 1
                c = string.sub(line,pos,pos) 
                if (c == '"') then txt = txt..'"' end 
                -- check first char AFTER quoted string, if it is another
                -- quoted string without separator, then append it
                -- this is the way to "escape" the quote char in a quote. example:
                --   value1,"blub""blip""boing",value3  will result in blub"blip"boing  for the middle
            until (c ~= '"')
            table.insert(res,txt)
            assert(c == sep or c == "")
            pos = pos + 1
        else    
            -- no quotes used, just look for the first separator
            local startp,endp = string.find(line,sep,pos)
            if (startp) then 
                table.insert(res,string.sub(line,pos,startp-1))
                pos = endp + 1
            else
                -- no separator found -> use rest of string and terminate
                table.insert(res,string.sub(line,pos))
                break
            end 
        end
    end
    return res
end

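Assuming that the second value on each line is always the name, and that the separator between first and last name is always the first or only hyphen in the name (not necessarily a safe assumption), you can read each line from your input (I used a string, but this could be an IO reader or whatever you are using), use a pattern to parse the second value into first and last names, append them to the line you read, and write it back out (here I insert it into a table for demonstration):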

local input = [[
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056
31,Ben-Smith,2.4.2015,2.6.2012,0.2.01,2058
32,Bob-Smith,2.3.2016,2.7.2011,0.2.02,2057
]]

local output = {}

for line in string.gmatch (input, '[^\r\n]+') do -- get the next run of characters up to the next CR or LF character
    local name = string.match (line, '^.-,(.-),') -- assuming that the name field is always the second value here
    local first, last = string.match (name or '', '^(.-)%-(.+)$')
    line = line .. ',' .. (first or '') .. ',' .. (last or name or '')
    table.insert (output, line)
end

print (table.concat (output, '\r\n'))

>>>>>>>
35,Bill-Smith,2.2.2017,2.4.2017,0.2.00,2056,Bill,Smith
31,Ben-Smith,2.4.2015,2.6.2012,0.2.01,2058,Ben,Smith
32,Bob-Smith,2.3.2016,2.7.2011,0.2.02,2057,Bob,Smith
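
If you would rather work on the table of fields that ParseCSVLine returns than on the raw line, the same pattern can be applied to one field. The following is only a minimal sketch: append_split_name is a hypothetical helper name, not part of LuaCsv, and the demo passes literal tables in place of parser output so that it runs on its own. It also appends the First and Surname headings when told that the row is the header.

local function append_split_name (fields, index, is_header)
    -- hypothetical helper: 'fields' is a table of strings such as the one returned by ParseCSVLine
    if is_header then
        table.insert (fields, 'First')
        table.insert (fields, 'Surname')
        return fields
    end
    local name = fields[index] or ''
    -- split at the first hyphen; '%-' escapes the hyphen, which is otherwise a pattern quantifier
    local first, last = string.match (name, '^(.-)%-(.+)$')
    table.insert (fields, first or '')
    table.insert (fields, last or name) -- no hyphen found: keep the whole name as the surname
    return fields
end

local header = append_split_name ({'Age', 'Name', 'Start', 'End', 'Length', 'Score'}, 2, true)
local row = append_split_name ({'35', 'Bill-Smith', '2.2.2017', '2.4.2017', '0.2.00', '2056'}, 2)

print (table.concat (header, ','))
print (table.concat (row, ','))

Note that table.concat does not re-quote fields, so a value that itself contains a comma or a double quote would need to be quoted again before the row is written back out as CSV.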