Lua: split a string into a table

Question

I am looking for the most efficient way to split a Lua string into a table.

I found two possible approaches, one using gmatch and one using gsub, and tried to make each as fast as possible.

function string:split1(sep)
    local sep = sep or ","
    local result = {}
    local i = 1
    for c in (self..sep):gmatch("(.-)"..sep) do
        result[i] = c
        i = i + 1
    end
    return result
end

function string:split2(sep)
   local sep = sep or ","
   local result = {}
   local pattern = string.format("([^%s]+)", sep)
   local i = 1
   self:gsub(pattern, function (c) result[i] = c i = i + 1 end)
   return result
end

The second option takes about 50% more time than the first.

Which is the right way to do it, and why?

Added: I added a third function that uses the same pattern as split2, but with gmatch. It gives the best result.

function string:split3(sep)
    local sep = sep or ","
    local result = {}
    local i = 1
    for c in self:gmatch(string.format("([^%s]+)", sep)) do
        result[i] = c
        i = i + 1
    end
    return result
end

"(.-)"..sep works with a sequence, i.e. a multi-character separator.

"([^" .. sep .. "]+)" works for a single character. More precisely, it splits on every individual character of the sequence.
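To make the difference between the two patterns concrete, here is a minimal sketch. The helper names split_sequence and split_class are hypothetical and exist only for this illustration; the separator ", " is chosen because it contains no pattern magic characters.

```lua
-- Hypothetical helpers, for illustration only.

-- Treats sep as one multi-character sequence (pattern from split1).
local function split_sequence(s, sep)
    local result = {}
    for field in (s .. sep):gmatch("(.-)" .. sep) do
        result[#result + 1] = field
    end
    return result
end

-- Treats every character of sep as a separator (pattern from split2/split3).
local function split_class(s, sep)
    local result = {}
    for field in s:gmatch("([^" .. sep .. "]+)") do
        result[#result + 1] = field
    end
    return result
end

local input = "a b, c d"
print(table.concat(split_sequence(input, ", "), "|")) -- a b|c d
print(table.concat(split_class(input, ", "), "|"))    -- a|b|c|d
```

With the sequence pattern only the full ", " splits the string; with the character class both the comma and the space act as separators.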

string.format("([^%s]+)", sep) is faster than "([^" .. sep .. "]+)".

string.format("(.-)%s", sep) takes almost the same time as "(.-)"..sep.

result[i]=c i=i+1 is faster than both result[#result+1]=c and table.insert(result,c).
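The three ways of appending can be compared in isolation with a sketch like the one below. The function names and the item count N are assumptions made for this example, and the timings will vary with the Lua version and machine, so no numbers are shown.

```lua
-- Hypothetical micro-benchmark; names and N are illustrative only.
local N = 200000

-- Manual index counter, as in result[i] = c; i = i + 1.
local function append_counter(n)
    local t, i = {}, 1
    for v = 1, n do
        t[i] = v
        i = i + 1
    end
    return t
end

-- Length operator, as in result[#result + 1] = c.
local function append_length(n)
    local t = {}
    for v = 1, n do
        t[#t + 1] = v
    end
    return t
end

-- Library call, as in table.insert(result, c).
local function append_insert(n)
    local t = {}
    for v = 1, n do
        table.insert(t, v)
    end
    return t
end

-- Runs f(N) once and prints the elapsed CPU time.
local function time_it(label, f)
    local start = os.clock()
    local t = f(N)
    print(string.format("%-16s %.4f s (%d items)", label, os.clock() - start, #t))
end

time_it("counter", append_counter)
time_it("#t + 1", append_length)
time_it("table.insert", append_insert)
```

All three build the same table; only the cost of finding the next free index differs.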

Test code:

local init = os.clock()
local initialString = [[1,2,3,"afasdaca",4,"acaac"]]
local temTable = {}
for i = 1, 1000 do
    table.insert(temTable, initialString)
end
local dataString = table.concat(temTable,",")
print("Creating data: ".. (os.clock() - init))
    
init = os.clock()
local data1 = {}
for i = 1, 1000 do
    data1 = dataString:split1(",")
end
print("split1: ".. (os.clock() - init))

init = os.clock()
local data2 = {}
for i = 1, 1000 do
    data2 = dataString:split2(",")
end
print("split2: ".. (os.clock() - init))

init = os.clock()
local data3 = {}
for i = 1, 1000 do
    data3 = dataString:split3(",")
end
print("split3: ".. (os.clock() - init))

Timings (in seconds):

Creating data: 0.000229
split1: 1.189397
split2: 1.647402
split3: 1.011056

Answer 1

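Prefer the gmatch version. gsub is meant for "global substitution" (string replacement), not for iterating over matches, so it probably has to do more work.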

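The comparison is not quite fair, because your patterns differ: for gmatch you use "(.-)"..sep, and for gsub you use "([^" .. sep .. "]+)". Why not use the same pattern for both? In newer Lua versions you could even use the frontier pattern (%f).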

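The different patterns also lead to different behaviour: the function based on "(.-)"..sep returns empty matches, whereas the others do not. Note that with the "([^" .. sep .. "]+)" pattern you can omit the parentheses, because a pattern without captures yields the whole match.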
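A small sketch of that behavioural difference, assuming a plain "," separator. The helper names split_keep_empty and split_skip_empty are hypothetical; the second one also drops the parentheses from the pattern.

```lua
-- Hypothetical helpers, for illustration only.

-- Pattern from split1: empty fields are kept.
local function split_keep_empty(s, sep)
    local result = {}
    for field in (s .. sep):gmatch("(.-)" .. sep) do
        result[#result + 1] = field
    end
    return result
end

-- Character-class pattern without parentheses: with no captures,
-- gmatch yields the whole match, and empty fields are skipped.
local function split_skip_empty(s, sep)
    local result = {}
    for field in s:gmatch("[^" .. sep .. "]+") do
        result[#result + 1] = field
    end
    return result
end

local input = "a,,b,"
local kept = split_keep_empty(input, ",")
local skipped = split_skip_empty(input, ",")

print(#kept, table.concat(kept, "|"))       -- 4	a||b|
print(#skipped, table.concat(skipped, "|")) -- 2	a|b
```

Which behaviour is correct depends on the data: for CSV-like input where an empty field is meaningful, the first variant preserves column positions, while the second silently collapses them.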
