What's the fastest way to get a random pattern match from a very long string in Lua?
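I have a string with over 2 million characters, and I feel that my current method of finding a random match for a pattern is not as fast as it could be.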
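    local function getRandomMatch(str, pattern)
        -- First pass: count the matches (gsub's second return value).
        local occurrenceCount = select(2, str:gsub(pattern, ""))
        -- math.random(1, 0) raises an error, so bail out when nothing matches.
        if occurrenceCount == 0 then return nil end
        local index, randomIndex = 0, math.random(1, occurrenceCount)
        -- Second pass: walk the matches until the randomly chosen one.
        for match in str:gmatch(pattern) do
            index = index + 1
            if index == randomIndex then
                return match
            end
        end
    end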
Is there any way to make this faster?
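    local find, random, match = string.find, math.random, string.match

    -- Single pass: reservoir sampling over match start positions.
    local function getRandomMatch(str, pattern)
        local pos, random_pos = 0, 0
        for cnt = 1, math.huge do
            pos = find(str, pattern, pos + 1)
            if not pos then
                -- No more matches: return the match at the chosen position
                -- (nil if the pattern never matched).
                return match(str, pattern, random_pos)
            elseif random(cnt) == 1 then
                -- Keep the cnt-th match with probability 1/cnt.
                random_pos = pos
            end
        end
    end

    for j = 1, 20 do
        print(getRandomMatch("1234", "%d%d"))
    end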
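Note that the loop above restarts the search one character after each match start (pos + 1), so it samples overlapping matches: for "1234" and "%d%d" it can return "12", "23" or "34", whereas gmatch in the question only visits "12" and "34". If the non-overlapping behaviour is wanted, the same reservoir idea can be sketched roughly as below. This is only an illustration under a few assumptions: the name getRandomMatchNonOverlapping is made up for this example, the pattern is assumed not to start with a "^" anchor, and the function returns the whole matched substring rather than captures.

```lua
-- Sketch: reservoir sampling over non-overlapping matches,
-- i.e. the same matches gmatch would visit, in a single pass.
-- (Hypothetical helper name; not part of the original answer.)
local function getRandomMatchNonOverlapping(str, pattern)
    local chosen_start, chosen_end
    local init, count = 1, 0
    -- Stop once init moves past the end of the string; this also keeps
    -- patterns that can match the empty string from looping forever.
    while init <= #str + 1 do
        local s, e = string.find(str, pattern, init)
        if not s then break end
        count = count + 1
        -- Keep the count-th match with probability 1/count.
        if math.random(count) == 1 then
            chosen_start, chosen_end = s, e
        end
        -- Continue after this match, advancing by at least one character
        -- (for an empty match, e == s - 1).
        init = math.max(e, s) + 1
    end
    if chosen_start then
        return string.sub(str, chosen_start, chosen_end)
    end
    return nil
end

for j = 1, 5 do
    print(getRandomMatchNonOverlapping("1234", "%d%d"))  -- "12" or "34"
end
```

Like the answer's version, this scans the string once instead of twice, which is where the saving over the gsub + gmatch approach would come from.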
Update:
A quick and dirty solution:
("Dirty" means "matches are random but chosen with non-equal probabilities")
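    local random, match = math.random, string.match

    local function getRandomMatchFastAndDirty(str, pattern)
        -- Search from a random offset; if nothing matches from there on,
        -- fall back to the first match in the string.
        return match(str, pattern, random(#str)) or match(str, pattern)
    end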
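To get a feel for how uneven those probabilities can be, the following sketch enumerates every possible start offset and tallies which match the search would land on. The helper name selectionWeights is hypothetical and exists only for this illustration; it mirrors the fallback to the first match but is not part of the answer's code.

```lua
-- Sketch: for each possible random offset, record the start position of the
-- match that a search from that offset would find.
-- (Hypothetical helper for illustration only.)
local function selectionWeights(str, pattern)
    local weights = {}
    local first = string.find(str, pattern)  -- fallback when nothing follows the offset
    for init = 1, #str do
        local pos = string.find(str, pattern, init) or first
        if pos then
            weights[pos] = (weights[pos] or 0) + 1
        end
    end
    return weights
end

-- Two matches of "a", at positions 1 and 10, in a 10-character string.
local w = selectionWeights("a--------a", "a")
print(w[1], w[10])  -- w[1] is 1, w[10] is 9
```

Here the match at position 10 is picked for 9 of the 10 offsets, because a match is favoured in proportion to the gap that precedes it. When matches are spread fairly evenly through the string the bias is much smaller, which is the trade-off this shortcut makes for speed.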