Slow FFI.cast in luajit

Can you explain the poor performance of ffi.cast in the following code snippet?

prof = require 'profile'

local ffi = require("ffi")

ffi.cdef[[
struct message {
    int field_a;
};

]]

function cast_test1()
   bytes = ffi.new("char[100000000]")

   sum = 0
   t1 = prof.rdtsc()
   for i=1,1000000 do
      sum = sum + i
   end
   t2 = prof.rdtsc()

   print("test1", tonumber(t2-t1))
end

function cast_test2()
   bytes = ffi.new("char[100000000]")

   sum = 0
   t1 = prof.rdtsc()
   for i=1,1000000 do
      sum = sum + i
      msg = ffi.cast("struct message *", bytes+ i * 16)
--      msg.field_a = i
   end
   t2 = prof.rdtsc()

   print("test2", tonumber(t2-t1))
end

cast_test1()
cast_test2()

The loop with the cast appears to run about 30 times slower than the loop without it. Is there any way to work around this?

% luajit -v  cast_tests.lua
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall. http://luajit.org/
test1   3227528
test2   94474000

It looks like the global msg variable is the culprit. Replacing it with a local gives a 20x speedup :)

This applies to both luajit-2.0.3 and luajit-2.1.
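
For readers who do not have the `profile` module, here is a minimal sketch of the same comparison timed with `os.clock` instead of `rdtsc`. The names `cast_global`, `cast_local`, `msg_global` and `N` are made up for illustration and are not part of the original code. A plausible explanation of the difference, stated as an assumption rather than a confirmed fact: a pointer stored in a global escapes the loop, so LuaJIT has to allocate a new cdata object and do a table store on every iteration, while a local pointer that never leaves the loop can be optimized away by the JIT compiler.

```lua
local ffi = require("ffi")

ffi.cdef[[
struct message {
    int field_a;
};
]]

local N = 1000000
-- One extra slot so that the write at offset N * 4 stays inside the buffer.
local bytes = ffi.new("char[?]", (N + 1) * 4)

-- Variant 1: the pointer is stored in a global (msg_global is a
-- hypothetical name), so it has to exist as a real object each iteration.
local function cast_global()
   local t0 = os.clock()
   for i = 1, N do
      msg_global = ffi.cast("struct message *", bytes + i * 4)
   end
   return os.clock() - t0
end

-- Variant 2: the pointer is a local that never leaves the loop body.
local function cast_local()
   local t0 = os.clock()
   local sum = 0
   for i = 1, N do
      local msg = ffi.cast("struct message *", bytes + i * 4)
      msg.field_a = i
      sum = sum + msg.field_a
   end
   return os.clock() - t0, sum
end

print("global", cast_global())
print("local ", cast_local())
```

The absolute numbers depend on the machine and the LuaJIT build, so only the ratio between the two lines is meaningful.
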

function cast_test3()
   local bytes = ffi.new("char[100000000]")
   local sum = 0
   local t1 = prof.rdtsc()
   for i=1,1000000 do
      sum = sum + i
      local msg = ffi.cast("struct message *", bytes+ i * 4)
      msg.field_a = i
   end
   local t2 = prof.rdtsc()
   local sum2 = 0
   for i=1,1000000 do
      local msg = ffi.cast("struct message *", bytes+ i * 4)
      sum2 = sum2 + msg.field_a
   end

   local t3 = prof.rdtsc()
   print(sum, sum2)
   print("test3", tonumber(t2-t1), tonumber(t3-t2))
end

cast_test3()

Results:

% /usr/bin/luajit -v cast_tests.lua
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall. http://luajit.org/
500000500000    500000500000
test3   4502508 4850884
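
When the stride between records equals `sizeof(struct message)` (4 bytes here, as in cast_test3), another option is to cast the buffer once outside the loop and index the resulting pointer. The sketch below assumes that layout. `msgs`, `fill_and_sum` and `N` are illustrative names, not part of the original code, and the sketch would not apply unchanged to the 16-byte stride used in cast_test2.

```lua
local ffi = require("ffi")

ffi.cdef[[
struct message {
    int field_a;
};
]]

local N = 1000000
-- Room for indices 0..N, one struct message per slot.
local bytes = ffi.new("char[?]", (N + 1) * ffi.sizeof("struct message"))

-- Cast once, outside the loop. msgs[i] addresses the same bytes as
-- bytes + i * sizeof(struct message). The cast pointer does not keep
-- the buffer alive, so `bytes` must stay referenced while msgs is used.
local msgs = ffi.cast("struct message *", bytes)

local function fill_and_sum()
   for i = 1, N do
      msgs[i].field_a = i
   end
   local sum = 0
   for i = 1, N do
      sum = sum + msgs[i].field_a
   end
   return sum
end

print(fill_and_sum())
```

This removes the per-iteration ffi.cast call from the source entirely, so the result does not depend on whether the JIT compiler manages to eliminate the temporary pointer.
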