Julia: A fast and elegant way to get a matrix from an array of arrays

Question

I have an array of arrays containing more than 10,000 pairs of Float64 values. Something like this:

v = [[rand(),rand()], ..., [rand(),rand()]]

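I want to get a matrix with two columns. A loop over all the pairs does the job; it looks cumbersome, but it returns the result in a fraction of a second: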

x = Vector{Float64}()
y = Vector{Float64}()
for i = 1:length(v)
    push!(x, v[i][1])
    push!(y, v[i][2])
end
w = hcat(x,y)

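The solution permutedims(reshape(hcat(v...), (length(v[1]), length(v)))), which I found in this task, looks more elegant but hangs Julia completely, so the session has to be restarted. Perhaps it was optimal six years ago, but it no longer works for large arrays. Is there a solution that is both compact and fast?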

Answer 1

I hope this is short and efficient enough for you:

 getindex.(v, [1 2])

If you want something simpler to digest:

[v[i][j] for i in 1:length(v), j in 1:2]

Also, the hcat solution can be written as:

permutedims(reshape(reduce(hcat, v), (length(v[1]), length(v))));

It should not hang your Julia session (please confirm; it works for me).
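As a side note, reduce(hcat, v) already returns a length(v[1]) × length(v) matrix, so the reshape step looks redundant. The following is a minimal sketch on a made-up three-pair vector, assuming every inner vector has the same length; the names cols, w_short and w_long are only illustrative.

```julia
# Made-up sample data: three [x, y] pairs.
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

cols = reduce(hcat, v)        # 2×3 Matrix: each inner vector becomes a column
w_short = permutedims(cols)   # 3×2 Matrix: each inner vector becomes a row

# The longer form from the answer, for comparison.
w_long = permutedims(reshape(reduce(hcat, v), (length(v[1]), length(v))))

println(w_short == w_long)    # true
```

On Julia 1.9 and later, stack(v; dims=1) should give the same matrix, though it is not part of the benchmarks in this thread.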

@Antonello: To understand why getindex.(v, [1 2]) works, consider a simpler example:

julia> string.(["a", "b", "c"], [1 2])
3×2 Matrix{String}:
 "a1"  "a2"
 "b1"  "b2"
 "c1"  "c2"

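I am broadcasting a column Vector ["a", "b", "c"] against a 1-row Matrix [1 2]. The key point is that [1 2] is a Matrix. Broadcasting therefore expands along both rows (forced by the Vector) and columns (forced by the Matrix). For this expansion to happen, it is crucial that the [1 2] matrix has exactly one row. Is it clear now?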
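The same reasoning can be checked directly on getindex. This is a small sketch with made-up data; it also shows what happens when the index argument is a Vector [1, 2] instead of the 1-row Matrix [1 2]. The variable names w and mismatch are only illustrative.

```julia
# Made-up sample data: three [x, y] pairs stored as a Vector of Vectors.
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# [1 2] is a 1×2 Matrix, so broadcasting pairs every element of v (rows)
# with every index in [1 2] (columns), giving a 3×2 result.
w = getindex.(v, [1 2])

# [1, 2] is a 2-element Vector; broadcasting it against a 3-element Vector
# fails because the two lengths cannot be matched.
mismatch = try
    getindex.(v, [1, 2])
    false
catch err
    err isa DimensionMismatch
end

println(size(w))    # (3, 2)
println(w[2, :])    # [3.0, 4.0]
println(mismatch)   # true
```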

Answer 2

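Your own example is quite close to a good solution, but it does some unnecessary work by creating two separate vectors and calling push! repeatedly. This solution is similar but simpler. It is not as terse as @BogumilKaminski's broadcasted getindex, but it is faster: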

function mat(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for i in eachindex(v)
        M[i, 1] = v[i][1]
        M[i, 2] = v[i][2]
    end
    return M
end

You can simplify it further without losing performance, like this:

function mat_simpler(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for (i, x) in pairs(v)
        M[i, 1], M[i, 2] = x
    end
    return M
end
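The preallocation idea above is not tied to pairs. Below is a hedged sketch of a variant for inner vectors of any common length; mat_general is a hypothetical name that does not appear in the original answers, and the sketch assumes v is non-empty and that all inner vectors share one length. It has not been benchmarked here.

```julia
# Hypothetical generalization of mat_simpler (not from the original answers).
# Assumes v is non-empty and all inner vectors have the same length.
function mat_general(v)
    n = length(first(v))                                # number of columns
    M = Matrix{eltype(eltype(v))}(undef, length(v), n)  # preallocate once
    for (i, x) in pairs(v)
        # Guard against ragged input rather than failing with a vaguer error.
        length(x) == n || throw(DimensionMismatch("inner vector $i has length $(length(x)), expected $n"))
        M[i, :] .= x                                    # copy x into row i in place
    end
    return M
end

println(mat_general([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]) == [1.0 2.0 3.0; 4.0 5.0 6.0])  # true
```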

Answer 3

Benchmarks of the various solutions posted so far...

using BenchmarkTools
# Creating the vector
v = [[i, i+0.1] for i in 0.1:0.2:2000]

M1 = @btime vcat([[e[1] e[2]] for e in $v]...)
M2 = @btime getindex.($v, [1 2])
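# Note: v[i][j] below reads the global v (only length($v) is interpolated), which likely inflates the M3 time and allocation count.
M3 = @btime [v[i][j] for i in 1:length($v), j in 1:2]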
M4 = @btime permutedims(reshape(reduce(hcat, $v), (length($v[1]), length($v))))
M5 = @btime permutedims(reshape(hcat($v...), (length($v[1]), length($v))))

function original(v)
    x = Vector{Float64}()
    y = Vector{Float64}()
    for i = 1:length(v)
        push!(x, v[i][1])
        push!(y, v[i][2])
    end
    return hcat(x,y)
end
function mat(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for i in eachindex(v)
        M[i, 1] = v[i][1]
        M[i, 2] = v[i][2]
    end
    return M
end
function mat_simpler(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for (i, x) in pairs(v)
        M[i, 1], M[i, 2] = x
    end
    return M
end

M6 = @btime original($v)
M7 = @btime mat($v) 
M8 = @btime mat_simpler($v)

M1 == M2 == M3 == M4 == M5 == M6 == M7 == M8 # true

Output:

  1.126 ms (10010 allocations: 1.53 MiB)      # M1
  54.161 μs (3 allocations: 156.42 KiB)       # M2
  809.000 μs (38983 allocations: 765.50 KiB)  # M3
  98.935 μs (4 allocations: 312.66 KiB)       # M4
  244.696 μs (10 allocations: 469.23 KiB)     # M5
  219.907 μs (30 allocations: 669.61 KiB)     # M6
  34.311 μs (2 allocations: 156.33 KiB)       # M7
  34.395 μs (2 allocations: 156.33 KiB)       # M8

Note that the dollar signs in the benchmark code are only there to force @btime to treat the vector as a local variable.
