Expanding Vector List Each by List of Instances - Julia

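I want to expand a vector of values using a second vector that holds the number of instances of each value. I came up with the following code, which does the job, but this seems like a common use case, so I am probably missing something.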

valuelist = ["a","b","d","z"]
numberofinstance = [3,5,1,11]

valuevector = String[]
for i in 1:length(numberofinstance) 
  append!(valuevector , repeat([valuelist[i]], numberofinstance[i])) 
end

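If you are willing to use a package (StatsBase.jl is not part of Julia's standard library, but it is used widely enough to feel like one), the function you are looking for is called inverse_rle in StatsBase.jl:
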
julia> using StatsBase

julia> inverse_rle(valuelist, numberofinstance)
20-element Array{String,1}:
 "a"
 "a"
 "a"
 "b"
 "b"
 "b"
 "b"
 "b"
 "d"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"

julia> @btime inverse_rle($valuelist, $numberofinstance);
  76.799 ns (1 allocation: 240 bytes)

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)
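
The benchmarks in this answer call yoursolution, which is never defined in the post. Presumably it is the loop from the question wrapped in a function. A minimal sketch under that assumption (the function body is copied from the question; only the wrapper is added):

```julia
# Assumed definition: the loop from the question wrapped in a function,
# so that it can be passed to @btime.
function yoursolution(valuelist, numberofinstance)
    valuevector = String[]
    for i in 1:length(numberofinstance)
        # repeat([x], n) builds a temporary n-element vector on every iteration
        append!(valuevector, repeat([valuelist[i]], numberofinstance[i]))
    end
    return valuevector
end

valuelist = ["a", "b", "d", "z"]
numberofinstance = [3, 5, 1, 11]
result = yoursolution(valuelist, numberofinstance)
```

The temporary vector created by repeat on each iteration, together with the repeated growth of valuevector, is a plausible reason for the higher allocation count reported above.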

If you want to avoid a package, you could in principle broadcast ^ (the power operator, which repeats a string) like this:

vcat(collect.(.^(valuelist, numberofinstance))...)

but I think this is relatively hard to parse, and it is slower than inverse_rle:

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)

julia> @btime vcat(collect.(.^($valuelist, $numberofinstance))...)
  472.615 ns (9 allocations: 800 bytes)
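
One detail worth spelling out about the ^ approach: ^ on a String repeats the string, and collect then splits the repeated string into characters, so the result is a Vector{Char} and is only meaningful when every value is a single character. The sketch below shows this, along with a Base-only alternative that keeps the elements as strings by broadcasting fill. The fill variant is an illustration, not part of the original answer, and it has not been benchmarked here.

```julia
valuelist = ["a", "b", "d", "z"]
numberofinstance = [3, 5, 1, 11]

# "a"^3 == "aaa"; collect("aaa") == ['a', 'a', 'a'], so this yields characters.
chars = vcat(collect.(valuelist .^ numberofinstance)...)

# Alternative (an assumption, not from the answer): build one small vector
# per value with fill, then concatenate them. Elements keep their type.
strings = reduce(vcat, fill.(valuelist, numberofinstance))
```

Here chars is a Vector{Char} while strings is a Vector{String}; both have 20 elements in the same order.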

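However, since Julia lets you write fast loops, you can easily define your own simple function. The following is much faster than your solution (and as fast as the implementation in StatsBase):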

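function multiply(vs, ns)
   # Preallocate the result: one slot per requested instance.
   r = Vector{String}(undef, sum(ns))
   c = 1  # next position to write in r
   # @inbounds skips bounds checks, so vs and ns must have the same length.
   @inbounds for i in axes(ns, 1)
       for k in 1:ns[i]
           r[c] = vs[i]
           c += 1
       end
   end
   r
end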

Benchmark:

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)

julia> @btime multiply($valuelist, $numberofinstance);
  76.469 ns (1 allocation: 240 bytes)
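
The multiply function above hard-codes Vector{String} and relies on @inbounds. If a more general and more defensive variant is wanted, something along these lines might work. The name expand_counts, the type annotations, and the length check are illustrative additions rather than part of the original answer, and no timing claim is made for this version.

```julia
# Hypothetical generalization of multiply: the name expand_counts and the
# length check are assumptions added for illustration.
function expand_counts(vs::AbstractVector, ns::AbstractVector{<:Integer})
    length(vs) == length(ns) ||
        throw(DimensionMismatch("vs and ns must have equal length"))
    # Element type follows the input instead of being fixed to String.
    r = Vector{eltype(vs)}(undef, sum(ns))
    c = 1  # next position to write in r
    for (v, n) in zip(vs, ns)
        for _ in 1:n
            r[c] = v
            c += 1
        end
    end
    return r
end

letters = expand_counts(["a", "b", "d", "z"], [3, 5, 1, 11])
floats = expand_counts([1.5, 2.5], [2, 0])
```

Dropping @inbounds keeps bounds checking on, so inconsistent inputs (for example negative counts) raise an error instead of writing out of bounds.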