Ruby splat operator in method definition takes more memory
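While optimizing our codebase, we tried using bang methods to reduce object allocations where that made sense. In our benchmarks, however, the number of allocated objects went down while the overall memsize went up.
Reproduction script: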
# frozen_string_literal: true
require 'bundler/inline'
gemfile(true) do
  source "https://rubygems.org"
  git_source(:github) { |repo| "https://github.com/#{repo}.git" }
  gem 'benchmark-memory', '0.1.2'
end
require 'benchmark/memory'
def with_bang(*methods)
  methods.tap(&:flatten!)
end
def without_bang(*methods)
  methods.flatten
end
Benchmark.memory do |x|
  x.report("with_bang") { with_bang(:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o) }
  x.report("without_bang") { without_bang(:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o) }
  x.compare!
end
# Output
# Ruby version: ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-darwin19]
# INPUT: (:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o)
# Calculating -------------------------------------
# with_bang 160.000 memsize ( 0.000 retained)
# 1.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# without_bang 80.000 memsize ( 0.000 retained)
# 2.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# Comparison:
# without_bang: 80 allocated
# with_bang: 160 allocated - 2.00x more
# INPUT: (:a, :b, :c, :d, :e, [:f, :g], :h, :i, :j, :k, :l, :m, :n, :o)
# Calculating -------------------------------------
# with_bang 240.000 memsize ( 0.000 retained)
# 3.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# without_bang 480.000 memsize ( 0.000 retained)
# 3.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# Comparison:
# with_bang: 240 allocated
# without_bang: 480 allocated - 2.00x more
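Based on my experiments, I believe this is caused by the splat operator converting the arguments into an array. Below is the script that led me to this conclusion.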
# frozen_string_literal: true
require 'bundler/inline'
gemfile(true) do
  source "https://rubygems.org"
  git_source(:github) { |repo| "https://github.com/#{repo}.git" }
  gem 'benchmark-memory', '0.1.2'
end
require 'benchmark/memory'
def with_splat(*methods)
  methods.flatten!
end
def without_splat
  methods = [:a, :b, :c, :d, :e, [:f, :g], :h, :i, :j, :k, :l, :m, :n, :o]
  methods.flatten!
end
Benchmark.memory do |x|
  x.report("with_splat") { with_splat(:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o) }
  x.report("without_splat") { without_splat }
  x.compare!
end
# Output
# Ruby version: ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-darwin19]
# INPUT: (:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o)
# Calculating -------------------------------------
# with_splat 160.000 memsize ( 0.000 retained)
# 1.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# without_splat 40.000 memsize ( 0.000 retained)
# 1.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# Comparison:
# without_splat: 40 allocated
# with_splat: 160 allocated - 4.00x more
# INPUT: (:a, :b, :c, :d, :e, [:f, :g], :h, :i, :j, :k, :l, :m, :n, :o)
# Calculating -------------------------------------
# with_splat 240.000 memsize ( 0.000 retained)
# 3.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# without_splat 240.000 memsize ( 0.000 retained)
# 3.000 objects ( 0.000 retained)
# 0.000 strings ( 0.000 retained)
# Comparison:
# with_splat: 240 allocated
# without_splat: 240 allocated - same
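What am I missing to understand this behavior? Why does it behave this way?
Thanks!
EDIT:
I have added a new input containing a nested array to the benchmark comparison. With the new input we see results that differ from the previous benchmark, and I am even more confused!
Let's examine the two arrays more closely: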
require 'objspace'
def with_splat(*methods)
  ObjectSpace.dump(methods, output: open('with_splat.json', 'w'))
end
def without_splat(methods)
  ObjectSpace.dump(methods, output: open('without_splat.json', 'w'))
end
with_splat(:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o)
without_splat([:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o])
ObjectSpace.dump_all(output: open('all_objects.json', 'w'))
The script generates 3 files:
with_splat.json: contains data about the splatted array
without_splat.json: contains data about the non-splatted array
all_objects.json: contains data about all objects (a lot!)
with_splat.json (formatted):
{
"address": "0x7feb941289a0",
"type": "ARRAY",
"class": "0x7feb940972c0",
"length": 15,
"memsize": 160,
"flags": {
"wb_protected": true
}
}
without_splat.json (formatted):
{
"address": "0x7feb941287e8",
"type": "ARRAY",
"class": "0x7feb940972c0",
"length": 15,
"shared": true,
"references": [
"0x7feb941328d8"
],
"memsize": 40,
"flags": {
"wb_protected": true
}
}
As you can see, the latter array does consume less memory (40 vs. 160), but it also has "shared": true set, and it references another object at memory address 0x7feb941328d8.
Let's find that object in all_objects.json via jq:
$ jq 'select(.address == "0x7feb941328d8")' all_objects.json
{
"address": "0x7feb941328d8",
"type": "ARRAY",
"frozen": true,
"length": 15,
"memsize": 160,
"flags": {
"wb_protected": true
}
}
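This is the actual array, with exactly the same memsize as the first array above.
Note that this array has "frozen": true set. I assume Ruby creates these frozen arrays when it encounters an array literal. It can then create cheap(er) shared arrays at evaluation time.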
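The same comparison can be made without writing files, which may be handy when checking other Ruby versions. The following is only a sketch: array_info is a hypothetical helper introduced here, not part of the scripts above, and it relies on ObjectSpace.dump returning a JSON string when no output: is given. The reported memsize and whether "shared" appears are implementation details, so the numbers may differ from the 2.7.2 output shown earlier.

```ruby
require 'objspace'
require 'json'

# Hypothetical helper (not from the original scripts): dump a single object
# to a JSON string and keep only the fields discussed above.
def array_info(array)
  info = JSON.parse(ObjectSpace.dump(array))
  {
    length: info['length'],
    memsize: info['memsize'],
    # "shared" is only present in the dump when the array shares its buffer.
    shared: info.fetch('shared', false)
  }
end

# The splat builds a fresh array from the individual arguments.
def with_splat(*methods)
  array_info(methods)
end

# The caller passes an array created from an array literal.
def without_splat(methods)
  array_info(methods)
end

splat_info = with_splat(:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o)
literal_info = without_splat([:a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o])

# Exact values depend on the Ruby version and platform.
puts "with_splat:    #{splat_info}"
puts "without_splat: #{literal_info}"
```

If the second line reports shared: true with a smaller memsize, that matches the dumps above: the literal-backed array is a cheap view onto a frozen array that is counted separately.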